Bug 3832 - Samba 3.0.22 + Windows XP clients + tons of smbd processes.
Status: RESOLVED WORKSFORME
Product: Samba 3.0
Classification: Unclassified
Component: File Services
Version: 3.0.22
Hardware: x86 Windows XP
Importance: P3 critical
Target Milestone: none
Assigned To: Samba Bugzilla Account
QA Contact: Samba QA Contact
Depends on:
Blocks:
Reported: 2006-06-13 10:15 UTC by Ryan
Modified: 2006-06-22 15:15 UTC (History)
CC List: 0 users

See Also:


Attachments
An output from "tail -f log.localhost" on the Samba server (19.85 KB, text/plain)
2006-06-13 13:11 UTC, Ryan

Description Ryan 2006-06-13 10:15:41 UTC
Hello all, I believe this to be a bug but I'm not quite sure.  I figure it's up to the experts to decide.

I've noticed that Windows XP clients that connect to my Samba server generate *tons* of smbd processes, with one or more of a few consequences: users lose network connectivity, Samba requests time out, and the Samba server seems to grind to a near halt.  These smbd processes don't go away even if Samba is stopped with /etc/init.d/samba stop... I have to kill -9 them.  They all end up with root as the owner.  This only happens from XP machines.  The logs don't seem to yield anything useful.  It may be interesting to note that smbstatus sometimes shows users with multiple IPC$'s.

I duplicated this in a test environment on a completely separate network.  The Samba server is a Debian Sarge box.  If you need any more information, I will do my best to provide it.  I've not heard back from Jeremy Allison regarding this issue, and figuring it might be a bug, thought this to be the next best step.  Please advise if further information is needed.  Thanks!
Comment 1 Ryan 2006-06-13 10:16:44 UTC
This is for a production environment, and the customer is getting very impatient.  I really like Samba, so I want to stick with it and avoid going the AD route :(
Comment 2 Ryan 2006-06-13 11:20:41 UTC
It should be noted that this differs from bug 3636 because Samba doesn't actually crash...it just gets overwhelmed.  Also, this has been tested on two different kernels...2.6.5 and 2.6.8.
Comment 3 Ryan 2006-06-13 11:22:35 UTC
(In reply to comment #1)
> This is for a production environment, and the customer is getting very
> impatient.  I really like Samba, so I want to stick with it and avoid going the
> AD route :(
> 

Oops...didn't mean to say AD...meant M$.  Silly acronyms :)
Comment 4 Ryan 2006-06-13 11:29:01 UTC
Here's some output that might help... I thought it interesting that the virtual size in KB is roughly the same for all of the stale processes that build up.  This is just a small snippet:

root      6110  0.0  0.7 10608 3768 ?        S    10:47   0:00 /usr/sbin/smbd -D
root      6143  0.0  0.7 10608 3764 ?        S    10:48   0:00 /usr/sbin/smbd -D
root      6145  0.0  0.7 10608 3932 ?        S    10:48   0:00 /usr/sbin/smbd -D
root      6163  0.0  0.7 10608 3764 ?        S    10:50   0:00 /usr/sbin/smbd -D
root      6180  0.0  0.7 10608 3764 ?        S    10:51   0:00 /usr/sbin/smbd -D
root      6243  0.0  0.7 10608 3748 ?        S    10:54   0:00 /usr/sbin/smbd -D
root      6269  0.0  0.7 10608 3748 ?        S    10:55   0:00 /usr/sbin/smbd -D
root      6289  0.0  0.7 10608 3748 ?        S    10:56   0:00 /usr/sbin/smbd -D
root      6304  0.0  0.7 10608 3752 ?        S    10:57   0:00 /usr/sbin/smbd -D
root      6391  0.0  0.7 10608 3748 ?        S    11:05   0:00 /usr/sbin/smbd -D
root      7094  0.0  0.7 10608 3764 ?        S    11:06   0:00 /usr/sbin/smbd -D
root      7291  0.0  0.7 10608 3812 ?        S    11:23   0:00 /usr/sbin/smbd -D
root      7297  0.0  0.7 10608 3756 ?        S    11:24   0:00 /usr/sbin/smbd -D
root      7298  0.0  0.7 10616 3820 ?        S    11:24   0:00 /usr/sbin/smbd -D
root      7320  0.0  0.7 10616 3756 ?        S    11:26   0:00 /usr/sbin/smbd -D
root      7334  0.0  0.7 10616 3824 ?        S    11:27   0:00 /usr/sbin/smbd -D
root      7340  0.0  0.7 10616 3816 ?        S    11:28   0:00 /usr/sbin/smbd -D
Comment 5 Ryan 2006-06-13 11:47:50 UTC
One more note (sorry for the multiple posts...things just keep occurring to me):  It looks like the CPU time on all of those is 0:00...does this indicate that the connections were never really established...as if they tried to connect and failed but the process still stayed around?  Ugh, getting really confused here...thanks again for taking a look at this...
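
The two observations above (near-identical virtual size and zero CPU time on every stale process) can be checked across the full process list rather than by eye. The following is only a rough sketch, assuming the listing is in the usual `ps aux` column order (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND); the function name `summarize_smbd` is made up for illustration and is not part of Samba.

```python
from collections import Counter


def summarize_smbd(ps_lines):
    """Count smbd rows in `ps aux`-style output.

    Returns (total smbd rows, rows whose TIME column is 0:00,
    a dict mapping VSZ in KB to the number of rows with that size).
    Assumes the standard 11-column `ps aux` layout.
    """
    total = 0
    zero_cpu = 0
    vsz_counts = Counter()
    for line in ps_lines:
        # Split into at most 11 fields so the command keeps its arguments.
        fields = line.split(None, 10)
        if len(fields) < 11 or "smbd" not in fields[10]:
            continue  # header line or a non-smbd process
        total += 1
        vsz_counts[int(fields[4])] += 1
        if fields[9] == "0:00":
            zero_cpu += 1
    return total, zero_cpu, dict(vsz_counts)


# Two rows copied from the listing in comment 4.
sample = [
    "root      6110  0.0  0.7 10608 3768 ?        S    10:47   0:00 /usr/sbin/smbd -D",
    "root      7298  0.0  0.7 10616 3820 ?        S    11:24   0:00 /usr/sbin/smbd -D",
]
total, zero_cpu, vsz_counts = summarize_smbd(sample)
print(total, zero_cpu, vsz_counts)
```

A zero TIME value only says the process has used almost no CPU; by itself it does not prove the connection was never established.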
Comment 6 Ryan 2006-06-13 12:24:34 UTC
FYI, just tested this with an Apple Powerbook...the smbd daemon is killed properly.  So, it appears that my hunch was correct, this really is an interaction with Windows XP where these daemons aren't properly killed by the Samba server.
Comment 7 Ryan 2006-06-13 13:11:36 UTC
Created attachment 1958
An output from "tail -f log.localhost" on the Samba server

This is what is generated in the logs when I attempt to execute a "smbclient -L localhost -U user" from the server itself on the commandline after the Samba server has gotten into a state where there are a ton of built-up processes.  Here is the output to stdout:

abbott:/var/log/samba# smbclient -L localhost -U steele
Password:
Domain=[ASPA] OS=[Unix] Server=[Samba 3.0.22-Debian]

        Sharename       Type      Comment
        ---------       ----      -------
        netlogon        Disk
        print$          Disk
        public          Disk      Public Repository
        downloads       Disk      Helpful Downloads
        IPC$            IPC       IPC Service (Samba Server 3.0.22-Debian)
        ADMIN$          IPC       IPC Service (Samba Server 3.0.22-Debian)
        hplaserjet      Printer   hp8150dn
        steele          Disk      Home of steele, steele
session setup failed: Call timed out: server did not respond after 20000 milliseconds
NetBIOS over TCP disabled -- no workgroup available
Comment 8 Gerald (Jerry) Carter 2006-06-14 06:53:29 UTC
Do you have a deadtime value set in smb.conf?  If not, try that.
Also look at the output from netstat and see if the connection
is still ESTABLISHED.  On Linux, 'netstat -pant' will help to
match the socket connection with a process id.
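
The matching step suggested above can be sketched as a small script. This is a hedged illustration only: it assumes the Linux net-tools `netstat -pant` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name), the helper names are hypothetical, and the sample addresses are invented rather than taken from the reporter's server.

```python
def smbd_connections(netstat_lines):
    """Map each smbd PID to a list of (foreign address, state) pairs.

    Assumes `netstat -pant` output; non-TCP lines, headers, and
    sockets owned by other programs are skipped.
    """
    conns = {}
    for line in netstat_lines:
        fields = line.split()
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        pid, _, program = fields[6].partition("/")
        if program != "smbd" or not pid.isdigit():
            continue
        conns.setdefault(int(pid), []).append((fields[4], fields[5]))
    return conns


def pids_without_established(smbd_pids, conns):
    """Return the smbd PIDs that own no socket in the ESTABLISHED state."""
    return [
        pid
        for pid in smbd_pids
        if not any(state == "ESTABLISHED" for _, state in conns.get(pid, []))
    ]


# Invented sample output; only the PIDs are borrowed from comment 4.
sample_netstat = [
    "Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name",
    "tcp        0      0 0.0.0.0:445             0.0.0.0:*               LISTEN      5000/smbd",
    "tcp        0      0 192.168.0.1:445         192.168.0.20:1041       ESTABLISHED 6110/smbd",
    "tcp        0      0 192.168.0.1:139         192.168.0.21:1055       CLOSE_WAIT  6143/smbd",
]
conns = smbd_connections(sample_netstat)
stale = pids_without_established([6110, 6143, 6145], conns)
print(stale)
```

The PID list would come from `ps`; any smbd process reported here has no live client connection and is a candidate for the stale processes described in this report.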
Comment 9 Gerald (Jerry) Carter 2006-06-22 15:15:41 UTC
User cannot reproduce.  Closing.