I came across this issue in a setup where the communication is happening b/w libsmbclient and AD server. The crash is seen during SMBC_getxattr() call. It has a SMBC_server() call which gives the server (srv). This is found after checking the connection status using 'SMBC_find_server()', if it is taking a cached server connection. Then there is a SMBC_attr_server() to connect to a attribute server. This also does SMBC_server() on the same set of server/share/workgroup etc. Now the problem comes when the second SMBC_server() fails to find the cached connection as active. In that case the cached information is freed during SMBC_find_server() lookup. This causes the first value of 'srv' obtained directly inside SMBC_getxattr_ctx() to go as invalid. This is causing invalid memory read and hence a crash. The valgrind logs snippet for the same : ==28275== Invalid read of size 8 ==28275== at 0x10B81D6E: cacl_get (libsmb_xattr.c:750) ==28275== by 0x10B83EB8: SMBC_getxattr_ctx (libsmb_xattr.c:2175) ==28275== by 0x1095F94C: getattr(int, int&) (in /tmp/mytestprog) ==28275== by 0x64FA64C: clone (in /lib64/libc-2.11.3.so) ==28275== Address 0x82957ac0 is 0 bytes inside a block of size 56 free'd ==28275== at 0x4C29138: free (vg_replace_malloc.c:473) ==28275== by 0x10B7C631: SMBC_remove_unused_server (libsmb_server.c:118) ==28275== by 0x10B7C7CE: SMBC_find_server (libsmb_server.c:227) ==28275== by 0x10B7C91D: SMBC_server_internal (libsmb_server.c:297) ==28275== by 0x10B7D695: SMBC_server (libsmb_server.c:692) ==28275== by 0x10B7D897: SMBC_attr_server (libsmb_server.c:754) ==28275== by 0x10B841C0: SMBC_getxattr_ctx (libsmb_xattr.c:2122) ==28275== by 0x1095F94C: getattr(int, int&) (in /tmp/mytestprog) ==28275== by 0x64FA64C: clone (in /lib64/libc-2.11.3.so) Here is the stack trace : ************************************************************************************************************************************************************************************************** gdb) bt #0 smbXcli_conn_protocol (conn=0x0) at ../libcli/smb/smbXcli_base.c:418 #1 0x00007f4aafb3476e in cli_resolve_path (ctx=0x7f4a2c433950, mountpt=0x7f4ab05d598b "", dfs_auth_info=0x7f4a2c03ceb0, rootcli=0x1143660, path=0x7f4a2c447670 "\\net1\\file1.txt", targetcli=0x7f4aa46af620, pp_targetpath=0x7f4aa46af628) at ../source3/libsmb/clidfs.c:1235 #2 0x00007f4ab05d04f3 in cacl_get (context=0x69a910, ctx=0x7f4a2c433950, srv=0x6a2510, ipc_cli=0x7f4a809f77d0, pol=0x7f4a8094a2f4, filename=0x7f4a2c447670 "\\net1\\file1.txt", attr_name=0x7f4ab07f314d "system.nt_sec_desc.*", buf=0x7f4a2c458100 "", bufsize=466) at ../source3/libsmb/libsmb_xattr.c:894 #3 0x00007f4ab05d2292 in SMBC_getxattr_ctx (context=0x69a910, fname=<optimized out>, name=0x7f4ab07f314d "system.nt_sec_desc.*", value=0x7f4a2c458100, size=466) at ../source3/libsmb/libsmb_xattr.c:2184 #4 0x00007f4ab07f094d in getattr(int, int&) #5 0x00007f4aba08c64d in clone () from /lib64/libc.so.6 #6 0x0000000000000000 in ?? () (gdb) f 2 #2 0x00007f4ab05d04f3 in cacl_get (context=0x69a910, ctx=0x7f4a2c433950, srv=0x6a2510, ipc_cli=0x7f4a809f77d0, pol=0x7f4a8094a2f4, filename=0x7f4a2c447670 "\\net1\\file1.txt", attr_name=0x7f4ab07f314d "system.nt_sec_desc.*", buf=0x7f4a2c458100 "", bufsize=466) at ../source3/libsmb/libsmb_xattr.c:894 894 ../source3/libsmb/libsmb_xattr.c: No such file or directory. (gdb) p *srv $1 = {cli = 0x1143660, dev = 664385006, no_pathinfo = false, no_pathinfo2 = false, no_pathinfo3 = false, no_nt_session = false, pol = {handle_type = 0, uuid = {time_low = 0, time_mid = 0, time_hi_and_version = 0, clock_seq = "\000", node = "\000\000\000\000\000"}}, next = 0x0, prev = 0x0} (gdb) f 1 #1 0x00007f4aafb3476e in cli_resolve_path (ctx=0x7f4a2c433950, mountpt=0x7f4ab05d598b "", dfs_auth_info=0x7f4a2c03ceb0, rootcli=0x1143660, path=0x7f4a2c447670 "\\net1\\file1.txt", targetcli=0x7f4aa46af620, pp_targetpath=0x7f4aa46af628) at ../source3/libsmb/clidfs.c:1235 1235 ../source3/libsmb/clidfs.c: No such file or directory. (gdb) p *rootcli $2 = {prev = 0x7f4ab039c920 <samba_tevent_debug>, next = 0x41, rap_error = 0, raw_status = {v = 0}, map_dos_errors = 120, domain = 0x1143678 "x6\024\001", user_name = 0x1149910 "\330b2\272J\177", password = 0x0, server_type = 0x760030362f8e2924 <Address 0x760030362f8e2924 out of bounds>, server_os = 0x40 <Address 0x40 out of bounds>, server_domain = 0xc1 <Address 0xc1 out of bounds>, share = 0x7f4aba325f88 <main_arena+264> "x_2\272J\177", dev = 0x7f4aba325f88 <main_arena+264> "x_2\272J\177", timeout = 9286944, initialised = 0, win95 = 0, is_guestlogin = false, server_posix_capabilities = 0, requested_posix_capabilities = 0, backup_intent = false, pipe_list = 0x7f4aae4c60f8, use_kerberos = 80, fallback_after_kerberos = false, use_ccache = false, pw_nt_hash = false, got_kerberos_mechanism = false, use_oplocks = false, dfs_mountpoint = 0xe8150c73 <Address 0xe8150c73 out of bounds>, conn = 0x0, remote_realm = 0x0, smb1 = {pid = 0, vc_num = 0, session = 0x7f4aada67d20 <std_event_context_init>, tcon = 0x7f4aada69b40 <epoll_event_add_fd>}, smb2 = {session = 0x7f4aada64790 <tevent_common_fd_set_close_fn>, tcon = 0x7f4aada64770 <tevent_common_fd_get_flags>, open_handles = 0x7f4aada696c0 <epoll_event_set_fd_flags>}} **************************************************************************************************************************************************************************************************
Created attachment 10924 [details] Test patch for master Can you test the attached patch please ? Thanks ! Jeremy.
Thanks Jeremy for the quick patch. I will test this and get back to you on this shortly.
Apologies for the delay in response, as I got held up with some issues. I am currently testing the change, I will get back on that once done.
I ran the fix for almost 20+hr and there is no crash reported. I even ran the fix with valgrind in the same environment where the crash was happening every hour or so. With the fix the invalid read is not reported any more in valgrind reports. Great fix, thanks !! The changes looks good but I was thinking of a scenario which could cause the crash / invlaid read again. In the patch, smbc_getFunctionGetCachedServer() is called from inside the check - "if (! ipc_srv)". Incase the ipc_srv is not allocated inside SMBC_attr_server() but the "srv" pointer was reallocated inside that, the change will still read the freed 'srv' pointer inside cacl_get() which is called later in smbc_getxattr_ctx(). Though this is a corner case, is it okay to assume this and move this code outside the whole if-else block dealing with ipc_srv ?
Created attachment 10932 [details] New test patch for master How about this instead ? Can you test and get back to me please ?
(In reply to Jeremy Allison from comment #5) Thanks for the changes. I tested the changes and those are also working fine since 20+hrs. Thanks again !!
Jeremy, As per the last update from my side , the changes have fixed the issue. Will this be the final change ?
Created attachment 10953 [details] git-am cherry-pick from master for 4.2.next, 4.1.next. Patch that went into master. Applies cleanly to 4.2.next, 4.1.next.
@Karolin, please apply to the maintenance branches.
(In reply to David Disseldorp from comment #9) Pushed to autobuild-v4-[1|2]-test.
(In reply to Karolin Seeger from comment #10) Pushed to v4-2-test. autobuild-v4-1-test failed -> re-trying
(In reply to Karolin Seeger from comment #11) Pushed to v4-1-test. Closing out bug report. Thanks!