Bug 8807 - dcerpc_lsa_lookup_sids_noalloc() crashes when groups has more than 1000 groups
Summary: dcerpc_lsa_lookup_sids_noalloc() crashes when groups has more than 1000 groups
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.6
Classification: Unclassified
Component: Winbind (show other bugs)
Version: 3.6.3
Hardware: All All
: P5 major
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
: 8871 (view as bug list)
Depends on:
Blocks:
 
Reported: 2012-03-12 12:11 UTC by hargagan
Modified: 2012-04-23 20:05 UTC (History)
3 users (show)

See Also:


Attachments
Proposed patch to fix the crash seen dcerpc_lookup_sids_noalloc() when there are more than 1000 sid entries (2.40 KB, patch)
2012-03-12 12:11 UTC, hargagan
no flags Details
Patch for master/3.6.x. (1.91 KB, patch)
2012-03-13 00:01 UTC, Jeremy Allison
no flags Details
Patch for master/3.6.x. (1.87 KB, patch)
2012-03-13 00:04 UTC, Jeremy Allison
ambi: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description hargagan 2012-03-12 12:11:24 UTC
Created attachment 7378 [details]
Proposed patch to fix the crash seen dcerpc_lookup_sids_noalloc() when there are more than 1000 sid entries

The dcerpc_lsa_lookup_sids() is called in the path _wbint_LookupGroupMembers() to get the sid to name conversion. This is done with lsa call.

When the members in the group are more than 1000, we get the following stack dump in the log.wb-<DOMAIN> log file :
 #0 /usr/sbin/winbindd(log_stack_trace+0x1a) [0x7f5a8737534a]
   #1 /usr/sbin/winbindd(smb_panic+0x2b) [0x7f5a8737541b]
   #2 /usr/lib64/libtalloc.so.2(talloc_strdup+0x2ba) [0x7f5a84dacdba]
   #3 /usr/sbin/winbindd(+0x501e01) [0x7f5a876b9e01]
   #4 /usr/sbin/winbindd(dcerpc_lsa_lookup_sids+0x2e) [0x7f5a876ba42e]
   #5 /usr/sbin/winbindd(winbindd_lookup_sids+0xf5) [0x7f5a872b4555]
   #6 /usr/sbin/winbindd(+0xffb62) [0x7f5a872b7b62]
   #7 /usr/sbin/winbindd(+0xec400) [0x7f5a872a4400]
   #8 /usr/sbin/winbindd(_wbint_LookupGroupMembers+0x63) [0x7f5a872c1f63]
   #9 /usr/sbin/winbindd(+0x1137d6) [0x7f5a872cb7d6]
   #10 /usr/sbin/winbindd(winbindd_dual_ndrcmd+0xe1) [0x7f5a872c0ca1]
   #11 /usr/sbin/winbindd(+0x107a02) [0x7f5a872bfa02]
   #12 /usr/sbin/winbindd(+0x108525) [0x7f5a872c0525]
   #13 /usr/sbin/winbindd(tevent_common_loop_immediate+0xea) [0x7f5a873860fa]
   #14 /usr/sbin/winbindd(run_events_poll+0x3b) [0x7f5a8738435b]
   #15 /usr/sbin/winbindd(+0x1cca92) [0x7f5a87384a92]
   #16 /usr/sbin/winbindd(_tevent_loop_once+0x90) [0x7f5a87384e90]
   #17 /usr/sbin/winbindd(main+0x7cb) [0x7f5a8729831b]
   #18 /lib64/libc.so.6(__libc_start_main+0xe6) [0x7f5a8441cc36]
   #19 /usr/sbin/winbindd(+0xdda89) [0x7f5a87295a89]
[2012/03/12 00:05:33.619990,  0] lib/fault.c:372(dump_core)

The gdb on the core gives a message that it is because of memory corruption.

After analyzing the code I found there is a pointer mishandling in dcerpc_lsa_lookup_sids_noalloc(). The "names" and "domains" char pointer array are allocated in dcerpc_lsa_lookup_sids_generic() and are passed to dcerpc_lsa_lookup_noalloc() with the hunk size of 1000. The base pointer passed to the function is used as context in talloc_strdup(). 

The base pointer works fine when we come for 0-1000 hunk. But for the next iteration, the "names" passed is hunk_names which is 1000 offset from the base. When this is used to do talloc_strdup(), the talloc rightly aborts the attempt. As this is not the context of the allocated array.

I have tried a change by always using the base allocated pointer and it works. Attaching the fix as an attachment. Please review it.
Comment 1 Christian Ambach 2012-03-12 12:28:31 UTC
Comment on attachment 7378 [details]
Proposed patch to fix the crash seen dcerpc_lookup_sids_noalloc() when there are more than 1000 sid entries

marking this as patch so it becomes viewable
Comment 2 Jeremy Allison 2012-03-13 00:01:27 UTC
Created attachment 7380 [details]
Patch for master/3.6.x.

How about this version instead ? I really hate passing in an int and then doing pointer arithmetic on it, as it isn't clear what it's being used for initially.

This version passes in the base allocations directly, to be used if we have to do a talloc_strdup.

Christian, can you let me know what you think ?

Jeremy.
Comment 3 Jeremy Allison 2012-03-13 00:04:54 UTC
Created attachment 7381 [details]
Patch for master/3.6.x.

Arg. Better version that moves the new context arguments so they can't get mixed up with regular pointer ones.

Jeremy.
Comment 4 Christian Ambach 2012-03-13 12:36:21 UTC
Comment on attachment 7381 [details]
Patch for master/3.6.x.

It fixes the problem on my test setup and is much easier to understand than the originally proposed patch that used pointer arithmetics
Comment 5 Christian Ambach 2012-03-13 13:35:31 UTC
Looked into the history of the code and maybe we should just revert 30d25210 that made the code use the wrong contexts for talloc_strdup?
There is a mem_ctx available that gets passed to dcerpc_lsa_lookup_sids_generic() and then to dcerpc_lsa_lookup_sids_noalloc()
Comment 6 Stefan Metzmacher 2012-03-13 13:49:44 UTC
(In reply to comment #5)
> Looked into the history of the code and maybe we should just revert 30d25210
> that made the code use the wrong contexts for talloc_strdup?
> There is a mem_ctx available that gets passed to
> dcerpc_lsa_lookup_sids_generic() and then to dcerpc_lsa_lookup_sids_noalloc()

But then the talloc hierachie is not correct, I'd go for Jeremy's patch.
Comment 7 Jeremy Allison 2012-03-13 21:42:47 UTC
Re-assigning to Karolin for inclusion in 3.6.next.
Jeremy.
Comment 8 Christian Ambach 2012-03-14 08:55:26 UTC
Karolin,

you can cherry-pick 7936fb0
Comment 9 Karolin Seeger 2012-03-19 13:35:21 UTC
Pushed to v3-6-test.
Closing out bug report.

Thanks!
Comment 10 Volker Lendecke 2012-04-23 20:05:09 UTC
*** Bug 8871 has been marked as a duplicate of this bug. ***