On my PDC, wbinfo -g fails, even after having applied the patches http://git.samba.org/?p=samba.git;a=commitdiff;h=0e407e77ad5af4bb78b1e6da247398f2677618ee and http://git.samba.org/?p=samba.git;a=commitdiff;h=92d067497c60c5eab43bae11e6c1236e6cf760bb wbinfo -u, wbinfo -t and wbinfo -D succeeds, but wbinfo -g still fails with # wbinfo -g Error looking up domain groups However, wbinfo -g on a domain member works without problems. On startup with debuglevel 1, winbind logs the following errors: ==> /var/log/samba/log.winbindd <== [2008/06/20 10:26:11, 1] nsswitch/winbindd.c:main(1001) winbindd version 3.0.30-2leo.fc5 started. Copyright Andrew Tridgell and the Samba Team 1992-2008 [2008/06/20 10:26:11, 0] nsswitch/winbindd_cache.c:initialize_winbindd_cache(2229) initialize_winbindd_cache: clearing cache and re-creating with version number 1 [2008/06/20 10:26:24, 0] libsmb/clientgen.c:cli_receive_smb(111) Receiving SMB: Server stopped responding [2008/06/20 10:26:37, 0] libsmb/clientgen.c:cli_receive_smb(111) Receiving SMB: Server stopped responding [2008/06/20 10:26:49, 0] libsmb/clientgen.c:cli_receive_smb(111) Receiving SMB: Server stopped responding [2008/06/20 10:31:49, 1] rpc_client/cli_pipe.c:cli_pipe_validate_current_pdu(625) cli_pipe_validate_current_pdu: RPC fault code DCERPC_FAULT_OP_RNG_ERROR received from remote machine SAMBA pipe \lsarpc fnum 0x76a1! ==> /var/log/samba/log.winbindd-idmap <== [2008/06/20 00:31:48, 0] nsswitch/winbindd_dual.c:async_request_timeout_handler(181) async_request_timeout_handler: child pid 2319 is not responding. Closing connection to it. [2008/06/20 00:31:48, 1] nsswitch/winbindd_util.c:trustdom_recv(236) Could not receive trustdoms [2008/06/20 00:32:24, 1] nsswitch/idmap.c:idmap_init(377) Initializing idmap domains [2008/06/20 00:37:24, 0] nsswitch/winbindd_dual.c:async_request_timeout_handler(181) async_request_timeout_handler: child pid 2392 is not responding. Closing connection to it. [2008/06/20 00:37:24, 1] nsswitch/winbindd_util.c:trustdom_recv(236) Could not receive trustdoms [2008/06/20 10:26:14, 1] nsswitch/idmap.c:idmap_init(377) Initializing idmap domains [2008/06/20 10:31:12, 0] nsswitch/winbindd_dual.c:async_request_timeout_handler(181) async_request_timeout_handler: child pid 7188 is not responding. Closing connection to it. [2008/06/20 10:31:12, 1] nsswitch/winbindd_util.c:trustdom_recv(236) Could not receive trustdoms The output of winbind startup and wbinfo -g at loglevel 10 and smb.conf are available at http://leo.kloburg.at/tmp/winbind/ Additional info: # wbinfo --own-domain RK_KLBG # wbinfo -t checking the trust secret via RPC calls succeeded # wbinfo -D RK_KLBG Name : RK_KLBG Alt_Name : SID : S-1-5-21-181998944-1107627502-2274996074 Active Directory : No Native : No Primary : Yes Sequence : -1 Any hints? --leo
Ok, you're using an idmap backend with LDAP. I also need to see the debug level 10 logs from smbd at the same time as the wbinfo call hangs. I'm trying to see if there's a recursion issue with smbd calling back into winbindd. It doesn't look like it, as the winbindd -> smbd call from the parent winbindd is hanging on authentication, and the call from the RK_KLBG winbindd child to another smbd works perfecty. I need to see the smbd logs split out per connecting process please. Jeremy.
Thanks for your quick reply! Do you need the smbd logd for winbindd startup or when doing a "wbinfo -g"? (These authentication error messages appear on startup, before the "wbinfo -g" call is done.) I've added a log of the smbd processes during a start of winbind ("winbindd -d10 -n -F -Y") with "wbinfo -g" called after the startup-messages had settled down: http://leo.kloburg.at/tmp/winbind/2/ If that is not informative enough, please give me a hint on how I can split the smbd-logs "per connecting process". I've only splitted smbd output per process-id now. Sorry about my ignorance but I don't know the smbd internals very well... P.S.: It is the first time I am trying to use winbindd on a PDC. (It has never worked before on the PDC.) Thanks, --leo
Created attachment 3355 [details] Patch Can you try this patch please ? Jerry and I think this should fix the problem. If you can confirm asap we'll get this into 3.0.31. Thanks ! Jeremy.
(In reply to comment #3) > Can you try this patch please ? Jerry and I think this should fix the problem. Thanks! Unfortunately it didn't fix the problem. :( As before, wbinfo -u works but -g doesn't: Error looking up domain groups I've added another set of log files at http://leo.kloburg.at/tmp/winbind/3/ ... and included a tarball (logs.tar.gz) for easier downloading and grepping. Thanks, --leo
Some additional info, maybe it helps: After winbindd startup, wbinfo commands like "-u" or "-t" block until about 35 seconds after winbindd's startup. After that, wbinfo -u and -t work (-g still doesn't) but I noticed that after winbindd ran for some time (some hours) wbinfo -t stopped working: # wbinfo -t checking the trust secret via RPC calls failed error code was (0x0) Could not check secret Cheers, --leo
There was a problem with the patch. It wasn't correctly allowing the machine account to log in. So winbindd is falling back to attempting to log in with the smbadmin user. The smbadmin user is in several supplementary groups and so tries to look up the gid 500 (one of the groups that smbadmin is in). See here in the logs : 2008/06/21 10:43:19, 3] smbd/sec_ctx.c:pop_sec_ctx(356) pop_sec_ctx (0, 0) - sec_ctx_stack_ndx = 1 [2008/06/21 10:43:54, 5] passdb/lookup_sid.c:gid_to_sid(1406) gid_to_sid: winbind failed to find a sid for gid 500 Note the 35 second delay. That's where it's trying to call back into winbindd. I'm going to post an updated version of the patch. Can you try replace the other patch with this one and let me know ? Jeremy.
Created attachment 3363 [details] Replacement patch
I have a further idea on how to address this generically. I need to check with Volker and Jerry and get back to you. Hopefully tomorrow. Jeremy.
(In reply to comment #6) > Note the 35 second delay. That's where it's trying to call back into winbindd. > I'm going to post an updated version of the patch. Can you try replace the > other patch with this one and let me know ? Sorry, it still doesn't work. The delay is still there and wbinfo -g still fails. I'll try to produce a new set of log-files in the evening. Cheers, --leo
Created attachment 3367 [details] Patch - different approach to solving the problem. Ok, this patch should prevent smbd from contacting winbindd *at all* if it's a successful machine account login from our local machine account. Please test this one and give me logs back if it fails. I don't have the LDAP set up you have so I'm depending on you to give this a good workout, thanks ! Jeremy.
(In reply to comment #10) > Created an attachment (id=3367) [edit] > Patch - different approach to solving the problem. > > Ok, this patch should prevent smbd from contacting winbindd *at all* if it's a > successful machine account login from our local machine account. Thanks! Looks better now - the delay is gone. wbinfo -u or -t work immediately after winbindd startup now. However, wbinfo -g still produces Error looking up domain groups ... but maybe that's a different problem now? The new logs are at http://leo.kloburg.at/tmp/winbind/4/ (winbindd startup was at 23:16:19, wbinfo -g was executed at 23:16:42.) Thanks for your efforts! --leo
This is definately a different problem from winbindd hanging contacting smbd now. In your setup winbindd is trying to contact the LDAP server to enumerate "local groups" for the BUILTIN domain. The BUILTIN domain has a sid-prefix of S-1-5-32. In the log file log.winbindd-22708 I can see this by the following entry: smbldap_search_ext: base => [ou=Groups,dc=rk-klbg,dc=at], filter => [(&(objectclass=sambaGroupMapping)(sambaGroupType=4)(sambaSID=S-1-5-32*))], scope => [2] My guess is your LDAP server is not returning any BUILTIN groups here. There should be at least two (there are in the tdb sam case). These are : BUILTIN\administrators -> S-1-5-32-544 BUILTIN\users -> S-1-5-32-545 as you're using an LDAP store you'll have to create these by hand I think. winbindd is failing the call as it finds zero results from the above query. If you add these the call should complete. I think this patch is now good, and will commit to the 3.0.x and 3.2 trees to fix this bug, but I'll keep the bug open to ensure it works once you've created these entries. Jeremy.
(In reply to comment #12) > smbldap_search_ext: base => [ou=Groups,dc=rk-klbg,dc=at], filter => > [(&(objectclass=sambaGroupMapping)(sambaGroupType=4)(sambaSID=S-1-5-32*))], > > My guess is your LDAP server is not returning any BUILTIN groups here. There > should be at least two (there are in the tdb sam case). These are : > > BUILTIN\administrators -> S-1-5-32-544 > BUILTIN\users -> S-1-5-32-545 You are right, the ldap search above doesn't return any results. That's because the two builtin groups do exist in LDAP but they have a different sambaGroupType of 5. (SID_NAME_WKN_GRP = 5, /* well-known group */) smbldap-populate (part of smbldap-tools) creates them that way. Do you have any pointers on how to correctly create those builtin groups? Should they all have a sambaGroupType of 4 instead of 5? Should they have a mapping to a local unix group? Are there any templates for creating those groups? Sorry for bothering you with that stuff but the only advice on creating an initial LDAP setup that I found in the Samba-Guide was to use smbldap-populate... And why does wbinfo -g work on a domain-member? (Most of the time, see below..) Another interesting problem: I've just tried "wbinfo -g" on a Domain Member (running 3.0.30): # wbinfo -g BUILTIN\administrators BUILTIN\users [ called several times, returns only the two builtin groups] # wbinfo -u [returns all domain users] # wbinfo -g [now returns all domain groups too] Maybe this is a reconnect-issue that is only handled properly when doing user enumeration? Unfortunately I only have logs of debug-level 1: ------------------------------------------------ [2008/06/25 08:57:29, 0] libsmb/clientgen.c:cli_receive_smb(111) Receiving SMB: Server stopped responding [2008/06/25 08:57:29, 0] rpc_client/cli_pipe.c:rpc_api_pipe(790) rpc_api_pipe: Remote machine SAMBA pipe \samr fnum 0x741areturned critical error. Error was Call returned zero bytes (EOF) [2008/06/25 08:58:28, 1] libsmb/clientgen.c:cli_rpc_pipe_close(386) cli_rpc_pipe_close: cli_close failed on pipe \samr, fnum 0x741a to machine SAMBA. Error was Call returned zero bytes (EOF) [2008/06/25 08:58:28, 1] libsmb/clientgen.c:cli_rpc_pipe_close(386) cli_rpc_pipe_close: cli_close failed on pipe \NETLOGON, fnum 0x7419 to machine SAMBA. Error was Call returned zero bytes (EOF) ------------------------------------------------ Many thanks for your help! Cheers, --leo
Some additional notes: (In reply to comment #13) > You are right, the ldap search above doesn't return any results. > That's because the two builtin groups do exist in LDAP but they have a > different > sambaGroupType of 5. (SID_NAME_WKN_GRP = 5, /* well-known group */) > smbldap-populate (part of smbldap-tools) creates them that way. After changing the sambaGroupType of all S-1-5-32-* groups to 4, wbinfo -g returns only the BUILTIN\* groups. :( On a domain member, I can see the domain groups as well. (They all have sambaGroupType: 2.) The logs are at http://leo.kloburg.at/tmp/winbind/5/ Thanks, --leo
The mis-numbered group type is a bug in smbldap-tools, they should be created as local groups (alias) - type 5, not type 5. I'm still looking into the other problem. Jeremy.
Arg - typo - meant "as type 4, not type 5". Jeremy.
Created attachment 3369 [details] Patch for smbldap-tools-0.9.5
Created attachment 3370 [details] Final ? patch This patch should fix the problem. winbindd wasn't searching it's own domain group map, I've added that now. Please test asap as I'd like to get this into Samba 3.2.0 due out July 1st. Jeremy.
Looks much better now. But still not perfect... ;-) The domain groups are shown now but unfortunately each domain group is additionally also shown as BUILTIN now: # wbinfo -g BUILTIN\domain admins BUILTIN\domain users BUILTIN\domain computers BUILTIN\rk BUILTIN\zivi [...] BUILTIN\administrators BUILTIN\users [...] domain admins domain users domain computers rk zivi [...] Cheers, --leo
Yes, I figured that might be the case on my drive home. I have an idea on a quick fix... Jeremy.
Created attachment 3371 [details] Additional patch This should do it. Apply this on top of the "final ? patch" and it should be good to go. I forgot to split out the functions for enumerating groups for BUILTIN and DOMAIN into separate functions so the damn thing was being called twice :-). Thanks for your help. Jeremy.
ACK! Now everything looks fine! Thank you very much! Do you have any hint what might have caused that strange behavior on my domaim member, described in comment #13 after "Another interesting problem": https://bugzilla.samba.org/show_bug.cgi?id=5551#c13 (I couldn't reproduce this behavior since then.) --leo
Looks like it could have been a transient failure in talking to the LDAP server, but without better logs it's impossible to be sure, sorry. Thanks for so rapidly testing what ended up being a long series of patches, sorry for the trouble. These fixes are now in git and will be in 3.0.31 and also 3.2.0. Cheers, Jeremy.
Fixed with latest patch. Jeremy.