I'm running a small Samba PDC (tdbsam backend, nothing fancy). To enable NTLM auth in Squid (which runs on the same box), I had to run winbind. Since then I've been getting lots of "Receiving SMB: Server stopped responding" messages in the logs, a few every few minutes. The real problem is that smbd children sometimes hang for a few seconds, stalling Windows clients.
Hi, same problem here. I needed to enable NTLM auth for a freeradius daemon. Since then, the same problem occurs: lots of "Receiving SMB: Server stopped responding" in the winbindd log, and sometimes smbd hangs for a few seconds (stalling Windows clients). Samba versions: 3.0.25 through 3.0.27a.
Do you have a reproducible test case? Some action or command that can be used to track down the smbd stalls? Can you attach a level 10 debug log (from all winbindd processes and a hung smbd) with timestamps illustrating the problem?
Created attachment 3012: /var/log/messages with "SMB stopped responding"
Created attachment 3013: smbd log with debug level 10
Created attachment 3014: winbindd log with debug level 10
(In reply to comment #2)
> Do you have a reproducible test case? Some action or command that
> can be used to track down the smbd stalls? Can you attach a level 10
> debug log (from all winbindd processes and a hung smbd) with
> timestamps illustrating the problem?

I've been trying to find the cause for a few days, trying various debug levels, running interactively in the foreground, etc., but was unable to find a reliable way to get smbd to stall. The "Server stopped responding" messages, on the other hand, appear nearly all the time.
[2007/11/30 09:00:42.406797, 10, pid=5657] lib/system_smbd.c:sys_getgrouplist(125)
  sys_getgrouplist: user [godai$]
[2007/11/30 09:00:52.405096, 5, pid=5660] lib/util_sock.c:print_socket_options(206)
  socket option SO_KEEPALIVE = 1

That sys_getgrouplist takes too long on smbd. Can you find out what "id godai$" does? strace that, for example? What does your nsswitch.conf look like?

Volker
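For anyone who wants to scan a whole level 10 log for stalls like the one above, here is a minimal sketch in Python. It is not part of Samba; the function names and the 5-second `threshold` default are made up for illustration, and it assumes every entry starts with a header in the format shown in this report (with or without microseconds). Consecutive headers may come from different processes, so a gap is only a hint about where to look.

```python
import re
from datetime import datetime

# Matches Samba log headers such as
#   [2007/11/30 09:00:42.406797, 10, pid=5657] ...
#   [2008/01/16 15:59:20, 0] ...
HEADER = re.compile(r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})(?:\.(\d{1,6}))?,")


def parse_timestamps(lines):
    """Yield (datetime, line) for every line that begins with a log header."""
    for line in lines:
        m = HEADER.match(line.lstrip())
        if not m:
            continue  # continuation line (the message body), no timestamp
        ts = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        if m.group(2):
            # Pad the fractional part to microseconds before applying it.
            ts = ts.replace(microsecond=int(m.group(2).ljust(6, "0")))
        yield ts, line.rstrip("\n")


def find_gaps(lines, threshold=5.0):
    """Return (seconds, line_before, line_after) for each pair of
    consecutive headers that are more than `threshold` seconds apart."""
    gaps = []
    prev = None
    for ts, line in parse_timestamps(lines):
        if prev is not None:
            delta = (ts - prev[0]).total_seconds()
            if delta > threshold:
                gaps.append((delta, prev[1], line))
        prev = (ts, line)
    return gaps


# The two headers quoted in this comment, used as sample input.
SAMPLE = [
    "[2007/11/30 09:00:42.406797, 10, pid=5657] lib/system_smbd.c:sys_getgrouplist(125)",
    "  sys_getgrouplist: user [godai$]",
    "[2007/11/30 09:00:52.405096, 5, pid=5660] lib/util_sock.c:print_socket_options(206)",
    "  socket option SO_KEEPALIVE = 1",
]

if __name__ == "__main__":
    for seconds, before, after in find_gaps(SAMPLE):
        print(f"{seconds:.3f}s gap after: {before}")
```

To run it against a real log, pass an open file object to `find_gaps` instead of `SAMPLE`.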
(In reply to comment #7)
> That sys_getgrouplist takes too long on smbd. Can you find out what "id godai$"
> does? strace that for example? What does your nsswitch.conf look like?

# LANG=C id godai$
uid=520(godai$) gid=1515(computers) groups=1515(computers)

Relevant entries in /etc/nsswitch.conf:

passwd: files
shadow: files
group: files

Just plain shadow files with MD5 hashes, no LDAP/SQL/NIS. The system in question is Fedora 7 with SELinux disabled.
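To put a number on how long the group lookup takes outside of smbd, a rough sketch like the following could be used. It relies on Python's `os.getgrouplist`, which wraps the same libc `getgrouplist(3)` call and so goes through the NSS sources listed in nsswitch.conf. The helper name `time_group_lookup` is hypothetical, and the measured time also includes the `getpwnam` lookup needed to find the primary GID.

```python
import os
import pwd
import time


def time_group_lookup(user):
    """Return (elapsed_seconds, gids) for resolving `user`'s group list.

    The timing covers both getpwnam() and getgrouplist(), since the
    primary GID has to be looked up first.
    """
    start = time.monotonic()
    entry = pwd.getpwnam(user)  # raises KeyError if NSS cannot resolve the user
    gids = os.getgrouplist(user, entry.pw_gid)
    elapsed = time.monotonic() - start
    return elapsed, gids


if __name__ == "__main__":
    # "root" is only a placeholder; use the account that appears in the
    # slow log entry, e.g. the machine account "godai$" from this report.
    elapsed, gids = time_group_lookup("root")
    print(f"lookup took {elapsed:.6f}s, gids={gids}")
```

If the lookup is fast here but slow inside smbd, that points away from the NSS files backend and toward something specific to the smbd/winbindd interaction.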
Well, I'm stuck at this point. Sorry. Volker
I have the same problem, and I figured out that there is a connection to the 'winbind cache time' parameter in smb.conf. If I set this to 60 seconds, the 'Server stopped responding' message comes up every 60 seconds, three times in a row, about 10 seconds apart. I'm using 3.0.28 in a PDC configuration with an LDAP backend on Debian etch.
I too am having a problem, with 3.0.25b as packaged with CentOS 5.1 on i386. I receive the same messages at the same interval. I do not have any clients connected to it at this time, and this is a clean installation. I can reproduce it easily enough:

1. Installed Samba 3.0.25b (as part of my CentOS 5.1 install)
2. Disabled SELinux
3. Updated smb.conf. Relevant portions are below; let me know if I should attach the whole file.
4. Configured nsswitch.conf:
     passwd: files winbind
     shadow: files
     group: files winbind
5. Started Samba
6. Joined the domain: net rpc join -S <hostname> -U root
7. Started winbind
8. Tailed winbindd.log and noted the errors:

winbindd version 3.0.25b-1.el5_1.4 started.
Copyright Andrew Tridgell and the Samba Team 1992-2007
[2008/01/16 15:59:20, 0] nsswitch/winbindd_cache.c:initialize_winbindd_cache(2221)
  initialize_winbindd_cache: clearing cache and re-creating with version number 1
[2008/01/16 15:59:33, 0] libsmb/clientgen.c:cli_receive_smb(112)
  Receiving SMB: Server stopped responding
[2008/01/16 15:59:46, 0] libsmb/clientgen.c:cli_receive_smb(112)
  Receiving SMB: Server stopped responding
[2008/01/16 15:59:58, 0] libsmb/clientgen.c:cli_receive_smb(112)
  Receiving SMB: Server stopped responding

Key smb.conf entries:

; PDC
workgroup = LCG
server string = LCG File Server
local master = yes
domain master = yes
preferred master = yes
domain logons = yes
wins support = yes

; winbind
# separate domain and username with '\', like DOMAIN\username
winbind separator = \\
# use uids from 10000 to 20000 for domain users
idmap uid = 10000-30000
# use gids from 10000 to 20000 for domain groups
idmap gid = 10000-30000
# allow enumeration of winbind users and groups
winbind enum users = yes
winbind enum groups = yes
template shell = /bin/bash

; security
security = user
encrypt passwords = yes
pam password change = yes
passdb backend = tdbsam
unix password sync = yes
guest account = guest

Please let me know if I can provide any other info.
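To compare the spacing of these messages across systems (for example against the 'winbind cache time' setting), a small sketch like this can extract the intervals from winbindd.log. The function names are hypothetical, and it assumes the two-line entry layout shown in the log excerpt above, where the timestamp header comes first and the message follows on the next line.

```python
import re
from datetime import datetime

HEADER = re.compile(r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})")
MESSAGE = "Receiving SMB: Server stopped responding"


def stall_times(lines):
    """Return the timestamp of every 'Server stopped responding' entry.

    The timestamp is taken from the most recent header line, so this works
    whether the message is on the header line or on the line after it.
    """
    times = []
    current = None
    for line in lines:
        m = HEADER.match(line.lstrip())
        if m:
            current = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        if MESSAGE in line and current is not None:
            times.append(current)
    return times


def intervals(times):
    """Seconds between each pair of consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# The winbindd.log excerpt from this comment, used as sample input.
SAMPLE = """\
[2008/01/16 15:59:20, 0] nsswitch/winbindd_cache.c:initialize_winbindd_cache(2221)
  initialize_winbindd_cache: clearing cache and re-creating with version number 1
[2008/01/16 15:59:33, 0] libsmb/clientgen.c:cli_receive_smb(112)
  Receiving SMB: Server stopped responding
[2008/01/16 15:59:46, 0] libsmb/clientgen.c:cli_receive_smb(112)
  Receiving SMB: Server stopped responding
[2008/01/16 15:59:58, 0] libsmb/clientgen.c:cli_receive_smb(112)
  Receiving SMB: Server stopped responding
""".splitlines()

if __name__ == "__main__":
    print(intervals(stall_times(SAMPLE)))
```

On the excerpt above, the three messages are 13 and 12 seconds apart.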
Hi there,

I've spent some time tracking down the problem. Summary:

1.) You have to run Samba in PDC mode and additionally run the winbindd process.
2.) If you set "disable netbios = yes" in smb.conf, the mentioned error is "solved". But in some cases you need "disable netbios = no", so I think it is just a workaround.

Is there a bug in the combination of Samba as a PDC, a running winbindd, and the parameter "disable netbios = no"?

Greets, Michael
I'm starting to see this too. I need to start running winbindd because I need to trust another domain. When I first started running winbindd, things seemed mostly okay. I didn't see these timeouts. But I couldn't see any of the local (CO-RA) domain users with "wbinfo -u", just those of the trusted domain. From reading online I gathered that this might be because the local Samba server had never been added to the local domain. So I created a computer account for it and had it join the domain as the PDC. After that, I started seeing these hangs in winbindd. Note that the server of the trusted domain has not yet been added to that domain as a PDC.
Hi there, the mentioned problem seems to be solved by using the latest release (3.0.28a). Greets, Michael
Can anyone else please confirm this? Was it just a straight upgrade, or was it a fresh install? Cheers
Hi Andrew,

I tried both: a fresh install and an upgrade on a Samba PDC in a test environment. No problems anymore. Perhaps it has something to do with the winbindd fix mentioned in the release notes of 3.0.28a?

--- snip ---
Simo Sorce
* Don't assume NULL termination when copying the principal name in kerberos_get_default_realm_from_ccache().
* Fix winbindd running on a Samba DC (again).
--- snap ---

Have a nice day.
Michael
Confirmed, 3.0.28a seems to fix this bug. I think the fix was in patch http://git.samba.org/?p=samba.git;a=commit;h=9347d34b502bef70cdae8f3e8acd9796dba49581
Thanks for the feedback.