Bug 6315 - smbd crashes doing vfs_full_audit on IPC$ close event.
smbd crashes doing vfs_full_audit on IPC$ close event.
Product: Samba 3.3
Classification: Unclassified
Component: VFS Modules
x64 FreeBSD
: P3 normal
: ---
Assigned To: Jeremy Allison
Samba QA Contact
Depends on:
  Show dependency treegraph
Reported: 2009-05-01 18:13 UTC by Sergey Korsak
Modified: 2009-05-07 01:49 UTC (History)
0 users

See Also:

Patch for 3.3.4. (1.42 KB, patch)
2009-05-01 18:58 UTC, Jeremy Allison
no flags Details
Better patch for 3.3.4 and above. (3.09 KB, patch)
2009-05-01 22:34 UTC, Jeremy Allison
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Sergey Korsak 2009-05-01 18:13:10 UTC
When vfs objects = full_audit configured not per-service but in [global] section,
copying files between Windows XP and Samba sporadically stops with error "The specified network name is no longer available".

In log:
smbd/service.c:close_cnum(1323) ( closed connection to service IPC$
  INTERNAL ERROR: Signal 11 in pid 22842 (3.3.4)
  Please read the Trouble-Shooting section of the Samba3-HOWTO

In gdb:
#0  0x0000000801ce8e8a in wait4 () from /lib/libc.so.7
#1  0x0000000801caeffe in system () from /lib/libc.so.7
#2  0x000000000069fe98 in smb_panic (why=Variable "why" is not available.) at lib/util.c:1679
#3  0x000000000068a838 in sig_fault (sig=Variable "sig" is not available.) at lib/fault.c:46
#4  <signal handler called>
#5  0x000000080234eb84 in do_log (op=SMB_VFS_OP_DISCONNECT, success=Variable "success" is not available.) at modules/vfs_full_audit.c:712
#6  0x000000080235166b in smb_full_audit_disconnect (handle=0x802029050) at modules/vfs_full_audit.c:915
#7  0x00000000004e6154 in close_cnum (conn=0x80207e050, vuid=101) at smbd/service.c:1326
#8  0x00000000004a46f2 in reply_tdis (req=0x802022130) at smbd/reply.c:4605
#9  0x00000000004e3a30 in switch_message (type=113 'q', req=0x802022130, size=Variable "size" is not available.) at smbd/process.c:1486
#10 0x00000000004e5ea7 in smbd_process () at smbd/process.c:1509
#11 0x00000000008c22f4 in main (argc=-1, argv=0x7fffffffd278) at smbd/server.c:1519

Event processing in vfs_full_audit.c:
smb_full_audit_disconnect() => do_log() => audit_prefix() =>
talloc_sub_advanced(..., conn->server_info->unix_name, ...)
BUT: conn->server_info == NULL in case of IPC$
I think it's a real reason of SIGSEGV.
Comment 1 Jeremy Allison 2009-05-01 18:45:26 UTC
Ok, I think I understand the problem here. Give me a little while to prepare a fix (and a torture test for this).

Comment 2 Jeremy Allison 2009-05-01 18:58:31 UTC
Created attachment 4099 [details]
Patch for 3.3.4.

Ok, here is the (untested) code I think will fix this bug. I will write a torture test to confirm my understanding of the problem. Please apply and test and let me know.
Comment 3 Jeremy Allison 2009-05-01 22:34:40 UTC
Created attachment 4100 [details]
Better patch for 3.3.4 and above.

Ok, here is the patch I'd like to use in released code. The underlying problem is that once SMBulogoff is called, all server_info contexts associated with the vuid should become invalid, even if that's the context being currently used by the connection struct (tid). When the SMBtdis comes in it doesn't need a valid vuid value, but the code called inside vfs_full_audit always assumes that there is one (and hence a valid conn->server_info pointer) available.

This is actually a bug inside the vfs_full_audit and other code inside Samba, which should only indirect conn->server_info on calls which require AS_USER to be set in our process table. I could fix all these issues, but there's no guarentee that someone might not add more code that fails this assumption, as it's a hard assumption to break (it's usually true).

So what I've done is to ensure that on SMBulogoff the previously used conn->server_info struct is kept around to be used for print debugging purposes (it won't be used to change to an invalid user context, as such calls need AS_USER set). This isn't strictly correct, as there's no association with the (now invalid) context being freed and the call that causes conn->server_info to be indirected, but it's good enough for most cases.

The hard part was to ensure that once a valid context is used again (via new sessionsetupX calls, or new calls on a still valid vuid on this tid) that we don't leak memory by simply replacing the stored conn->server_info pointer. We would never actually leak the memory (as all conn->server_info pointers are talloc children of conn), but with the previous patch a malicious client could cause many server_info structs to be talloced by the right combination of SMB calls. This new patch introduces free_conn_server_info_if_unused(), which protects against the above.

Please test and report back to me. Thanks,

Comment 4 Sergey Korsak 2009-05-02 17:42:03 UTC
Thank you, Jeremy. I had patched uid.c and problem disappeared.
vfs_full_audit now logs "IPC_|disconnect|ok|IPC$" string instead
of crashing smbd.
Comment 5 Jeremy Allison 2009-05-02 21:14:41 UTC
Ok, just to confirm - you used the attachment titled "Better patch for 3.3.4 and above" - correct ? Thanks very much for testing.
Comment 6 Sergey Korsak 2009-05-03 03:24:32 UTC
Yes, I used exactly "Better patch for 3.3.4 and above" with the same configuration. No more file transfer breaks and messages like "kernel: pid 22427 (smbd), uid 11290: exited on signal 11" appear.
Comment 7 Volker Lendecke 2009-05-05 09:41:19 UTC
Please go ahead and put this into 3.3.5.

Comment 8 Karolin Seeger 2009-05-07 01:49:51 UTC
Patch is upstream. Will be included in 3.3.5.
Closing out bug report.

Thanks for reporting!