Created attachment 7378 [details] Proposed patch to fix the crash seen dcerpc_lookup_sids_noalloc() when there are more than 1000 sid entries The dcerpc_lsa_lookup_sids() is called in the path _wbint_LookupGroupMembers() to get the sid to name conversion. This is done with lsa call. When the members in the group are more than 1000, we get the following stack dump in the log.wb-<DOMAIN> log file : #0 /usr/sbin/winbindd(log_stack_trace+0x1a) [0x7f5a8737534a] #1 /usr/sbin/winbindd(smb_panic+0x2b) [0x7f5a8737541b] #2 /usr/lib64/libtalloc.so.2(talloc_strdup+0x2ba) [0x7f5a84dacdba] #3 /usr/sbin/winbindd(+0x501e01) [0x7f5a876b9e01] #4 /usr/sbin/winbindd(dcerpc_lsa_lookup_sids+0x2e) [0x7f5a876ba42e] #5 /usr/sbin/winbindd(winbindd_lookup_sids+0xf5) [0x7f5a872b4555] #6 /usr/sbin/winbindd(+0xffb62) [0x7f5a872b7b62] #7 /usr/sbin/winbindd(+0xec400) [0x7f5a872a4400] #8 /usr/sbin/winbindd(_wbint_LookupGroupMembers+0x63) [0x7f5a872c1f63] #9 /usr/sbin/winbindd(+0x1137d6) [0x7f5a872cb7d6] #10 /usr/sbin/winbindd(winbindd_dual_ndrcmd+0xe1) [0x7f5a872c0ca1] #11 /usr/sbin/winbindd(+0x107a02) [0x7f5a872bfa02] #12 /usr/sbin/winbindd(+0x108525) [0x7f5a872c0525] #13 /usr/sbin/winbindd(tevent_common_loop_immediate+0xea) [0x7f5a873860fa] #14 /usr/sbin/winbindd(run_events_poll+0x3b) [0x7f5a8738435b] #15 /usr/sbin/winbindd(+0x1cca92) [0x7f5a87384a92] #16 /usr/sbin/winbindd(_tevent_loop_once+0x90) [0x7f5a87384e90] #17 /usr/sbin/winbindd(main+0x7cb) [0x7f5a8729831b] #18 /lib64/libc.so.6(__libc_start_main+0xe6) [0x7f5a8441cc36] #19 /usr/sbin/winbindd(+0xdda89) [0x7f5a87295a89] [2012/03/12 00:05:33.619990, 0] lib/fault.c:372(dump_core) The gdb on the core gives a message that it is because of memory corruption. After analyzing the code I found there is a pointer mishandling in dcerpc_lsa_lookup_sids_noalloc(). The "names" and "domains" char pointer array are allocated in dcerpc_lsa_lookup_sids_generic() and are passed to dcerpc_lsa_lookup_noalloc() with the hunk size of 1000. The base pointer passed to the function is used as context in talloc_strdup(). The base pointer works fine when we come for 0-1000 hunk. But for the next iteration, the "names" passed is hunk_names which is 1000 offset from the base. When this is used to do talloc_strdup(), the talloc rightly aborts the attempt. As this is not the context of the allocated array. I have tried a change by always using the base allocated pointer and it works. Attaching the fix as an attachment. Please review it.
Comment on attachment 7378 [details] Proposed patch to fix the crash seen dcerpc_lookup_sids_noalloc() when there are more than 1000 sid entries marking this as patch so it becomes viewable
Created attachment 7380 [details] Patch for master/3.6.x. How about this version instead ? I really hate passing in an int and then doing pointer arithmetic on it, as it isn't clear what it's being used for initially. This version passes in the base allocations directly, to be used if we have to do a talloc_strdup. Christian, can you let me know what you think ? Jeremy.
Created attachment 7381 [details] Patch for master/3.6.x. Arg. Better version that moves the new context arguments so they can't get mixed up with regular pointer ones. Jeremy.
Comment on attachment 7381 [details] Patch for master/3.6.x. It fixes the problem on my test setup and is much easier to understand than the originally proposed patch that used pointer arithmetics
Looked into the history of the code and maybe we should just revert 30d25210 that made the code use the wrong contexts for talloc_strdup? There is a mem_ctx available that gets passed to dcerpc_lsa_lookup_sids_generic() and then to dcerpc_lsa_lookup_sids_noalloc()
(In reply to comment #5) > Looked into the history of the code and maybe we should just revert 30d25210 > that made the code use the wrong contexts for talloc_strdup? > There is a mem_ctx available that gets passed to > dcerpc_lsa_lookup_sids_generic() and then to dcerpc_lsa_lookup_sids_noalloc() But then the talloc hierachie is not correct, I'd go for Jeremy's patch.
Re-assigning to Karolin for inclusion in 3.6.next. Jeremy.
Karolin, you can cherry-pick 7936fb0
Pushed to v3-6-test. Closing out bug report. Thanks!
*** Bug 8871 has been marked as a duplicate of this bug. ***