Created attachment 6551 [details] patch If we try to lookup a unix user which is in our domain or in builtin domain winbind creates a new user instead of finding the unix user. I've found this bug with running spoolss as a seperate daemon.
Ok Andreas, can you explain in a lot more detail what this fixes, and exactly what cases you're using to test this ? Thanks, Jeremy.
8d0e2415309ee15fbf509d44caf3e1df5b900620 removed these checks You can reproduce it in master with: make test TESTS=spoolss INCLUDE_CUSTOM_CONF="rpc_server:spoolss = daemon"
I try to write down a howto for a PDC to reproduce it if 'make test' in this case isn't enough later this day.
From the IRC log showing my conversations with Andreas on this bug: ------------------------------------------------------------- wanon 16:13 The patch you want to revert is vl's change. He needs to review. Ok, add info to the bug report showing how to report with wbinfo. 16:13 gladiac 16:13 it worked before and it doesn't work anymore after this patch wanon 16:13 A good example of "it" is needed gladiac 16:14 isn't make test enough to reproduce it? wanon 16:14 Enough to reproduce, but not enough to tell me why what it is doing is wrong. gladiac 16:14 the user is connecting as SAMBA-TEST\user where SAMBA-TEST is our own domain 16:15 wanon 16:15 finally, some fucking examples. Thanks ! gladiac 16:15 user is a unix account it tries to lookup SAMBA-TEST\user but doesn't find user 16:15 so it allocates an id for user 16:15 instead of checkning the unix account 16:15 wanon 16:15 Are we a DC ? gladiac 16:16 yes wanon 16:16 Is SAMBA-TEST the machine name or the DOMAIN name. gladiac 16:16 in make test it is a s3dc SAMBA-TEST is the domain name 16:16 ok, I will try to write down an example 16:16 how to test it with s3 as a pdc 16:17 wanon 16:17 What about the case where we are a member server, and SAMBA-TEST is the domain name. We need to allocate then. What about the case where we are a member or standalone server, and SAMBA-TEST is the machine name. We should *not* allocate then. 16:18 We need to understand the effects of the change on all these cases. 16:18 That's why I want examples. 16:18 gladiac 16:18 ok I think we didn't understand it in the first case 16:18 where it was removed 16:18 wanon 16:19 Possibly - but that was Volker's change, so you'll need to check with him. -------------------------------------------------------------
Comment on attachment 6551 [details] patch Adding Volker as the change is one he originally made. Jeremy.
Are there any nss caches in 'make test' we don't have in the real world? I can only reproduce it in 'make test' at the moment...
Re-assigning to Karolin for inclusion in 3.6.0final as Volker has given positive review. Jeremy.
Pushed to v3-6-test. Closing out bug report. Thanks!
This patch is wrong and needs to be reverted, because: - It's just luck that it fixes the problem with make test TESTS="spoolss" INCLUDE_CUSTOM_CONF="rpc_server:spoolss = daemon" - sam_rids_to_names() gets a domain sid as input so it'll never be a child sid of our domain or the builtin domain, which means it'll never resolve rids in our domain or the buildtin domain, but instead always print this: sam_rids_to_names: possible deadlock - trying to lookup SID S-1-5-21-1715343321-2184876875-2541585818 and returns always NT_STATUS_NONE_MAPPED - It triggers another bug which makes the samba3.posix_s3.winbind.wbclient .wbcListUsers flakey as the code used some uninitialized memory resulting in: ==9072== Conditional jump or move depends on uninitialised value(s) ==9072== at 0x48CCA7: winbindd_sids_to_xids_lookupsids_done (winbindd_sids_to_xids.c:189) ==9072== by 0x68814FC: _tevent_req_notify_callback (tevent_req.c:95) ==9072== by 0x688152E: tevent_req_finish (tevent_req.c:104) ==9072== by 0x6881555: _tevent_req_done (tevent_req.c:110) ==9072== by 0x47EE8B: wb_lookupsids_next (wb_lookupsids.c:237) ==9072== by 0x47FA1A: wb_lookupsids_single_done (wb_lookupsids.c:534) ==9072== by 0x68814FC: _tevent_req_notify_callback (tevent_req.c:95) ==9072== by 0x688152E: tevent_req_finish (tevent_req.c:104) ==9072== by 0x6881627: tevent_req_trigger (tevent_req.c:155) ==9072== by 0x6880BE3: tevent_common_loop_immediate (tevent_immediate.c:135) ==9072== by 0x75BEECA: run_events_poll (events.c:197) ==9072== by 0x75BF736: s3_event_loop_once (events.c:327) ==9072== by 0x687FD76: _tevent_loop_once (tevent.c:494) ==9072== by 0x43F4D4: main (winbindd.c:1431) Try TDB_NO_FSYNC=1 WINBINDD_VALGRIND="valgrind --log-file=metze.vg-wb --num-callers=30 --track-origins=yes" make test TESTS="samba3.posix_s3.winbind.wbclient"
Created attachment 6592 [details] Revert patch and valgrind fixes for v3-6-test
(In reply to comment #10) > Created attachment 6592 [details] > Revert patch and valgrind fixes for v3-6-test Pushed to v3-6-test. Metze, the original issue still exists, right?
Yes, but it's fixed in master by commit 8d72e612ac2845cd873c4fd614456fe8749db130 s3-rpc_server read and write the unix_token and unix_info across named_pipe_auth (and other related commits in the push of d8cce7d466b1fb122136a464e978f71483ab0e09).
I don't think we'll fix this for 3.6.x