Symptoms: when trying to ssh in (PAM), winbindd cores. May core at other times, haven't checked. OS: CentOS 5.3 x86_64 Details in attached log file.
Created attachment 4821 [details] Details of compile options, smb.conf, et cetera.
Can you compile with -g and run winbind under valgrind? Thanks, Volker
Volker, I've recompiled the whole 3.4.1 tree with "--enable-developer" tacked onto ./configure's list of options, then fed winbindd to valgrind. Running winbindd under valgrind and SSHing in actually let me in (logged in successfully) and I did not notice winbindd dying. Re-ran winbindd in stand-alone mode, and the crash was reproduced. valgrind's output is attached.
Created attachment 4825 [details] WinbindD under valgrind.
Created attachment 4829 [details] Patch for 3.4 Can you try the attached patch? Thanks, Volker
Volker,

Thanks for the patch. Tried it. WinbindD stays without crashing when I try to SSH in. When I start smbd/nmbd and try to access a share, WinbindD does crash, though:

===============================================================
INTERNAL ERROR: Signal 6 in pid 5731 (3.4.1)
Please read the Trouble-Shooting section of the Samba3-HOWTO

From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
===============================================================
smb_panic: clobber_region() last called from [sid_to_fstring(178)]
PANIC (pid 5731): internal error
BACKTRACE: 35 stack frames:
 #0 winbindd(log_stack_trace+0x1c) [0x7f031e5242de]
 #1 winbindd(smb_panic+0x153) [0x7f031e5240b9]
 #2 winbindd [0x7f031e50d7ac]
 #3 winbindd [0x7f031e50d7bf]
 #4 /lib64/libc.so.6 [0x7f031c0a8280]
 #5 /lib64/libc.so.6(gsignal+0x35) [0x7f031c0a8215]
 #6 /lib64/libc.so.6(abort+0x110) [0x7f031c0a9cc0]
 #7 /usr/lib64/libtalloc.so.1 [0x7f031ca0186e]
 #8 /usr/lib64/libtalloc.so.1 [0x7f031ca0188d]
 #9 /usr/lib64/libtalloc.so.1 [0x7f031ca0198a]
 #10 /usr/lib64/libtalloc.so.1 [0x7f031ca02587]
 #11 /usr/lib64/libtalloc.so.1(talloc_free+0x15) [0x7f031ca031d2]
 #12 /usr/lib64/nss_info/adex.so [0x7f03172396dc]
 #13 /usr/lib64/nss_info/adex.so [0x7f0317239b11]
 #14 /usr/lib64/nss_info/adex.so [0x7f0317239ee5]
 #15 /usr/lib64/nss_info/adex.so [0x7f031723b6c4]
 #16 /usr/lib64/nss_info/adex.so [0x7f0317234f25]
 #17 winbindd(idmap_backends_sid_to_unixid+0x172) [0x7f031e99fb7a]
 #18 winbindd(idmap_sid_to_gid+0x3fd) [0x7f031e9a1509]
 #19 winbindd(winbindd_dual_sid2gid+0x1b5) [0x7f031e4764ae]
 #20 winbindd [0x7f031e468785]
 #21 winbindd [0x7f031e46c34e]
 #22 winbindd [0x7f031e468259]
 #23 winbindd(async_request+0x348) [0x7f031e4679ee]
 #24 winbindd(do_async+0x183) [0x7f031e46c69c]
 #25 winbindd(winbindd_sid2uid_async+0x25e) [0x7f031e475d08]
 #26 winbindd [0x7f031e427a79]
 #27 winbindd [0x7f031e4702f4]
 #28 winbindd [0x7f031e46c517]
 #29 winbindd [0x7f031e4681f5]
 #30 winbindd [0x7f031e424492]
 #31 winbindd [0x7f031e425a09]
 #32 winbindd(main+0xde7) [0x7f031e426851]
 #33 /lib64/libc.so.6(__libc_start_main+0xf4) [0x7f031c095974]
 #34 winbindd [0x7f031e422ba9]
smb_panic(): calling panic action [/bin/sleep 999999999]
[ 5905]: request interface version
[ 5905]: request location of privileged pipe
[ 5905]: getpwnam pmay
[ 5727]: lookupname DELACY\pmay
[ 5727]: lookupsid S-1-5-21-79843086-108998794-1039276024-4393
[ 5908]: request interface version
[ 5908]: request location of privileged pipe
final write to client failed: Broken pipe
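For anyone skimming the backtrace: it can help to collapse the frames into runs per module, which makes it easier to see where the abort() sits relative to libtalloc, adex.so and winbindd itself. Below is a rough Python sketch of that idea. The helper names (parse_frames, module_runs) and the regular expression are my own assumptions about the frame layout seen in this log, not anything shipped with Samba, and the sample only quotes a handful of frames from the trace.

```python
import os
import re
from itertools import groupby

# Assumed frame layout, based only on the log in this report:
#   #<n> <module>[(<symbol>+0x<offset>)] [0x<address>]
FRAME_RE = re.compile(
    r"#(\d+)\s+([^\s(]+)(?:\((\w+)\+0x[0-9a-f]+\))?\s+\[0x[0-9a-f]+\]"
)

# A few frames quoted from the backtrace (not the full trace).
SAMPLE = (
    "#6 /lib64/libc.so.6(abort+0x110) [0x7f031c0a9cc0] "
    "#10 /usr/lib64/libtalloc.so.1 [0x7f031ca02587] "
    "#11 /usr/lib64/libtalloc.so.1(talloc_free+0x15) [0x7f031ca031d2] "
    "#12 /usr/lib64/nss_info/adex.so [0x7f03172396dc] "
    "#17 winbindd(idmap_backends_sid_to_unixid+0x172) [0x7f031e99fb7a]"
)


def parse_frames(text):
    """Return (frame number, module path, symbol or None) for each frame."""
    return [
        (int(num), module, symbol or None)
        for num, module, symbol in FRAME_RE.findall(text)
    ]


def module_runs(frames):
    """Collapse consecutive frames from the same module into one entry."""
    modules = (module for _num, module, _symbol in frames)
    return [os.path.basename(module) for module, _group in groupby(modules)]


if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    for num, module, symbol in frames:
        print(num, os.path.basename(module), symbol or "?")
    print(" <- ".join(module_runs(frames)))
```

Because the pattern only relies on whitespace between fields, it should work on the trace whether it is pasted one frame per line or as a single wrapped line.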
Additionally, SSH/PAM auth fails.
I'm tempted to say that you should contact Likewise Software about that bug, it's their code. But as they have stopped supporting Samba, I guess it is now upon the Samba Team to clean up what's there. Again -- does valgrind show anything significant? Alternatively, can you please send a full debug level 10 log up to that crash? Thanks, Volker
I strongly suspect that Likewise Software would come back with something like: 1) Sure. The solution is "Likewise Open (tm)(r)(c)(q)(z)(v)(f)(p)" -- I'll re-run the new binary under valgrind, and will post debuglevel 10 logs. Previous valgrind attempt suggests this to be a heisenbug, but my valgrind-fu is weak.
Ok. Attached two files, as requested.
Created attachment 4835 [details] valgrind output of winbindd -d 10
Created attachment 4836 [details] winbindd -d 10, under valgrind.
The behavior of winbindd is still:
1) PAM Auth succeeds (smbd/nmbd off)
2) Accessing a share fails with winbindd crashing (smbd/nmbd on)
Created attachment 4837 [details] patch Can you try the attached patch? Thanks, Volker
+1 - this is obviously correct. The entry_dn variable is being freed twice, once when frame is deleted, and then again. The "talloc_tos()" reference at line 371 should be renamed to "frame" as well to make this clearer (IMHO). I think we need this for 3.4.3. Jeremy.
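To illustrate the double free described above for readers who don't live in talloc: memory allocated on a parent context is released when the parent is freed, so freeing the child explicitly afterwards is a second free. The following is a toy Python model of that ownership rule only. The Chunk class is a hypothetical stand-in and not talloc's API, the variable names just mirror the ones mentioned in the review, and fixed() shows one possible way to avoid the problem, not necessarily what the attached patch does.

```python
class Chunk:
    """Toy stand-in for a hierarchical allocation (NOT the real talloc API)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.freed = False
        if parent is not None:
            parent.children.append(self)

    def free(self):
        if self.freed:
            # The toy model raises here; real talloc typically aborts.
            raise RuntimeError(f"double free of {self.name}")
        # Freeing a parent also frees everything allocated on it.
        for child in list(self.children):
            child.free()
        # Detach from the parent so the parent won't free us again later.
        if self.parent is not None:
            self.parent.children.remove(self)
        self.freed = True


def buggy():
    frame = Chunk("frame")
    entry_dn = Chunk("entry_dn", parent=frame)
    frame.free()     # releases entry_dn as a child of frame
    entry_dn.free()  # second release of the same chunk: raises


def fixed():
    frame = Chunk("frame")
    entry_dn = Chunk("entry_dn", parent=frame)
    frame.free()     # the only release; entry_dn goes away with frame
    return entry_dn.freed


if __name__ == "__main__":
    try:
        buggy()
    except RuntimeError as err:
        print("buggy():", err)
    print("fixed(): entry_dn freed exactly once ->", fixed())
```

Note that freeing the child first and the parent afterwards is fine in this model, because the child detaches itself from the parent. Only the order reported in the review (parent first, then the child again) trips the check.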
Karo, I think both patches are required for 3.4.3. The second one has formally been reviewed, the first one not yet, but the reporter got much further with that patch applied. Volker
Pushed both patches to v3-4-test. Closing out bug report. Please re-open if it's still an issue. Thanks!
Shouldn't that go into 3.3 as well?
I think there is a bug in attachment #5 [details] https://bugzilla.samba.org/attachment.cgi?id=4829&action=view The new code doesn't initialize the fstring mapped_user from the state->request->data.auth.user value, in fact it doesn't initialize it at all. Further patch for 3.4.3 to follow. Jeremy.
Created attachment 4844 [details] git-am format patch for 3.4.3 to fix problem with attachment #5 [details]
Created attachment 4845 [details] git-am patch for 3.3.9 Here is the same patch for the 3.3.9 codebase as attachment 4829 [details] and attachment 4844 [details] combined. It should fix the issue for 3.3.9. Note that the patch in attachment 4837 [details] is not needed as the code in 3.3.x doesn't use talloc here. I think this needs to go into 3.3.9 as it's a nasty interface misuse that can easily lead to winbindd crashes. Jeremy.
Sorry, it's too late to include it in 3.3.9. Can be shipped with 3.3.10 once review has been granted.
If my summary is objectionable, please feel free to fix it (or suggest a better one). Re: Volker's patch to "idmap_adex/provider_unified.c": New behavior: WinbindD no longer knows who any AD-based user/group is. Will try to leave AD/delete host account/re-join, but it is a datapoint. When running winbindd -d 10 under valgrind, are the default CLI switches to valgrind sufficient or are there others which will make the output more useful?
Created attachment 4854 [details] winbindd + both patches, -d 10, fails to look up AD users.
"net ads testjoin -P" claims that the "Join is OK". "id" or "id pmay" shows a lack of knowledge about AD-based users/groups. "getent passwd" pauses, there is a lot of activity in winbindd's log (running -F -S -i -d 3), and then nothing from AD is listed.

$ find /usr/lib64 -name adex* -exec ls -alF {} \;
-rwxr-xr-x 1 root root 159413 Oct 13 16:02 /usr/lib64/idmap/adex.so*
lrwxrwxrwx 1 root root 16 Sep 24 09:29 /usr/lib64/nss_info/adex.so -> ../idmap/adex.so

The above output *seems* sane. Looking through the output, saw this:
------------------------------------------------------------
winbindd/idmap_adex/likewise_cell.c:382(cell_do_search)
  cell_do_search: Base = , Filter = (|(&(uid=pmay)(objectclass=User))(&(displayName=pmay)(objectclass=Group))), Scope = 2, GC = yes
winbindd/idmap_adex/likewise_cell.c:397(cell_do_search)
  cell_do_search: Located 0 entries
winbindd/idmap_adex/provider_unified.c:398(check_result_unique_scoped)
  Failed! (NT_STATUS_OBJECT_NAME_NOT_FOUND)
------------------------------------------------------------
"ldapsearch" below, however, returns my user record:
------------------------------------------------------------
ldapsearch -h nyc-wdc-000 -x -LLL -b "CN=Users,DC=Delacy,DC=COM" -v -D "cn=a_pmay,ou=Service Accounts,DC=Delacy,DC=COM" -W "(|(&(uid=pmay)(objectclass=User))(&(displayName=pmay)(objectclass=Group)))" dn
Enter LDAP Password:
filter: (|(&(uid=pmay)(objectclass=User))(&(displayName=pmay)(objectclass=Group)))
requesting: dn
dn: CN=Pavel May,CN=Users,DC=DELACY,DC=com
------------------------------------------------------------
Colour me confused. (A lovely shade of mauve taupe, in case you're wondering).
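In case it helps with replaying the lookup by hand: the filter logged by cell_do_search can be rebuilt for any name with a few lines of Python. This is only a sketch. The helper names (escape_filter_value, name_lookup_filter) are made up for illustration, the filter shape is copied from the log line above rather than from the adex source, and the escaping follows RFC 4515. Keep in mind that the two searches in this comment are not identical: the winbindd log shows an empty Base with "GC = yes", whereas the ldapsearch run used -b "CN=Users,DC=Delacy,DC=COM".

```python
# Characters that RFC 4515 requires to be escaped inside a filter value.
_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\0": r"\00",
}


def escape_filter_value(value):
    """Escape a string for use as an assertion value in an LDAP filter."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def name_lookup_filter(name):
    """Rebuild the filter shape seen in the cell_do_search log line.

    Hypothetical helper: the shape is taken from the log, not from the code.
    """
    value = escape_filter_value(name)
    return (
        f"(|(&(uid={value})(objectclass=User))"
        f"(&(displayName={value})(objectclass=Group)))"
    )


if __name__ == "__main__":
    print(name_lookup_filter("pmay"))
```

The resulting string can be passed to ldapsearch as the filter argument, as in the command quoted above.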
Comment on attachment 4844 [details] git-am format patch for 3.4.3 to fix problem with attachment #5 [details] Yes, absolutely correct. Otherwise, any PAM_AUTH request to a trusted domain will end up being sent to the local SAM child.
Comment on attachment 4845 [details] git-am patch for 3.3.9 same here.
Karolin, please pull the additional patch from Jeremy for 3.4.3 (and for 3.3.9).
Switching to the "ad" backend from "adex", with the patch 4829, fixes the crashes under the previous "crud, it crashed" conditions.
Pavel, I am reopening so that Karolin can pick the remaining patch which we need in any case.
Guenther, Far be it from me to object. Thanks for all the help.
(In reply to comment #28)
> Karolin, please pull the additional patch from Jeremy for 3.4.3 (and for
> 3.3.9).
>

Done. Closing out bug report. Thanks!