When vfs objects = full_audit configured not per-service but in [global] section, copying files between Windows XP and Samba sporadically stops with error "The specified network name is no longer available". In log: smbd/service.c:close_cnum(1323) 10.6.7.2 (10.6.7.2) closed connection to service IPC$ lib/fault.c:fault_report(40) =============================================================== lib/fault.c:fault_report(41) INTERNAL ERROR: Signal 11 in pid 22842 (3.3.4) Please read the Trouble-Shooting section of the Samba3-HOWTO In gdb: #0 0x0000000801ce8e8a in wait4 () from /lib/libc.so.7 #1 0x0000000801caeffe in system () from /lib/libc.so.7 #2 0x000000000069fe98 in smb_panic (why=Variable "why" is not available.) at lib/util.c:1679 #3 0x000000000068a838 in sig_fault (sig=Variable "sig" is not available.) at lib/fault.c:46 #4 <signal handler called> #5 0x000000080234eb84 in do_log (op=SMB_VFS_OP_DISCONNECT, success=Variable "success" is not available.) at modules/vfs_full_audit.c:712 #6 0x000000080235166b in smb_full_audit_disconnect (handle=0x802029050) at modules/vfs_full_audit.c:915 #7 0x00000000004e6154 in close_cnum (conn=0x80207e050, vuid=101) at smbd/service.c:1326 #8 0x00000000004a46f2 in reply_tdis (req=0x802022130) at smbd/reply.c:4605 #9 0x00000000004e3a30 in switch_message (type=113 'q', req=0x802022130, size=Variable "size" is not available.) at smbd/process.c:1486 #10 0x00000000004e5ea7 in smbd_process () at smbd/process.c:1509 #11 0x00000000008c22f4 in main (argc=-1, argv=0x7fffffffd278) at smbd/server.c:1519 Event processing in vfs_full_audit.c: smb_full_audit_disconnect() => do_log() => audit_prefix() => talloc_sub_advanced(..., conn->server_info->unix_name, ...) BUT: conn->server_info == NULL in case of IPC$ I think it's a real reason of SIGSEGV.
Ok, I think I understand the problem here. Give me a little while to prepare a fix (and a torture test for this). Jeremy.
Created attachment 4099 [details] Patch for 3.3.4. Ok, here is the (untested) code I think will fix this bug. I will write a torture test to confirm my understanding of the problem. Please apply and test and let me know. Thanks, Jeremy.
Created attachment 4100 [details] Better patch for 3.3.4 and above. Ok, here is the patch I'd like to use in released code. The underlying problem is that once SMBulogoff is called, all server_info contexts associated with the vuid should become invalid, even if that's the context being currently used by the connection struct (tid). When the SMBtdis comes in it doesn't need a valid vuid value, but the code called inside vfs_full_audit always assumes that there is one (and hence a valid conn->server_info pointer) available. This is actually a bug inside the vfs_full_audit and other code inside Samba, which should only indirect conn->server_info on calls which require AS_USER to be set in our process table. I could fix all these issues, but there's no guarentee that someone might not add more code that fails this assumption, as it's a hard assumption to break (it's usually true). So what I've done is to ensure that on SMBulogoff the previously used conn->server_info struct is kept around to be used for print debugging purposes (it won't be used to change to an invalid user context, as such calls need AS_USER set). This isn't strictly correct, as there's no association with the (now invalid) context being freed and the call that causes conn->server_info to be indirected, but it's good enough for most cases. The hard part was to ensure that once a valid context is used again (via new sessionsetupX calls, or new calls on a still valid vuid on this tid) that we don't leak memory by simply replacing the stored conn->server_info pointer. We would never actually leak the memory (as all conn->server_info pointers are talloc children of conn), but with the previous patch a malicious client could cause many server_info structs to be talloced by the right combination of SMB calls. This new patch introduces free_conn_server_info_if_unused(), which protects against the above. Please test and report back to me. Thanks, Jeremy.
Thank you, Jeremy. I had patched uid.c and problem disappeared. vfs_full_audit now logs "IPC_|disconnect|ok|IPC$" string instead of crashing smbd.
Ok, just to confirm - you used the attachment titled "Better patch for 3.3.4 and above" - correct ? Thanks very much for testing. Jeremy.
Yes, I used exactly "Better patch for 3.3.4 and above" with the same configuration. No more file transfer breaks and messages like "kernel: pid 22427 (smbd), uid 11290: exited on signal 11" appear.
Please go ahead and put this into 3.3.5. Volker
Patch is upstream. Will be included in 3.3.5. Closing out bug report. Thanks for reporting!