The Samba-Bugzilla – Bug 4282
memory leak and fileserving fails within a single smbd process
Last modified: 2009-01-03 12:00:03 UTC
I'm now running 3.0.23d as a PDC, with the ldapsam backend, OpenLDAP 2.2.23, on Linux kernel 2.4.27-3-686-smp.
I've got an smbd process for one particular Windows host that steadily climbs from 12MB on startup up to 260MB within a week, and then fails.
I've taken several straces once it's failed.
The process looks like this once it's frozen:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
IUSR_JU 8083 8.9 12.4 264368 258412 ? S Nov27 522:20 /usr/sbin/smbd -D
Some background: the Windows client is a win2k IIS5 server, with well over 60 sites that are hosted off a Samba share on this Linux server. smbstatus tells me it's got 60 to 65 connections open for that client (IUSR_JUNIPER).
There's 2GB of RAM in this server, so I don't think it ran out of memory; the system reports about 1GB free (-/+ buffers/cache).
The only way to recover is to send a SIGKILL to the process and restart Samba. The process doesn't respond to SIGTERM.
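To put a number on the growth described above, the RSS column of the ps line can be compared against the startup size. This is only a rough sketch: the helper names `rss_from_ps_line` and `growth_kb_per_hour` are made up for illustration, the 12MB startup size is taken as 12288 kB, and "within a week" is assumed to mean 168 hours.

```shell
# Hypothetical helpers (names are illustrative, not part of Samba).

# Print the RSS column (6th field, in kB) of a `ps aux`-style line read from stdin.
rss_from_ps_line() {
    awk '{ print $6 }'
}

# Integer growth rate in kB per hour between two RSS readings.
# Usage: growth_kb_per_hour START_KB END_KB HOURS
growth_kb_per_hour() {
    echo $(( ($2 - $1) / $3 ))
}

# Example with the figures from this report. Assumptions: 12MB = 12288 kB
# at startup, and one week = 168 hours between the two readings.
line='IUSR_JU 8083 8.9 12.4 264368 258412 ? S Nov27 522:20 /usr/sbin/smbd -D'
end_kb=$(printf '%s\n' "$line" | rss_from_ps_line)
growth_kb_per_hour 12288 "$end_kb" 168
```

Under those assumptions this prints 1465, i.e. roughly 1.4MB of growth per hour averaged over the week.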
[global]
workgroup = FOREST
server string = %h domain server
interfaces = <snip>
bind interfaces only = Yes
obey pam restrictions = Yes
passdb backend = ldapsam:ldaps://kingwood.ourdomain.com/
pam password change = Yes
passwd program = /usr/bin/passwd %u
passwd chat = *Enter\snew\sUNIX\spassword:* %n\n *Retype\snew\sUNIX\spassword:* %n\n *password\supdated\ssuccessfully* .
unix password sync = Yes
log level = 1
syslog = 0
log file = /var/log/samba/log.%m
max log size = 1000
max mux = 2048
time server = Yes
socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192
add user script = /usr/sbin/smbldap-useradd -m '%u'
delete user script = /usr/sbin/smbldap-userdel '%u'
add group script = /usr/sbin/smbldap-groupadd -p '%g'
delete group script = /usr/sbin/smbldap-groupdel '%g'
add user to group script = /usr/sbin/smbldap-groupmod -m '%u' '%g'
delete user from group script = /usr/sbin/smbldap-groupmod -x '%u' '%g'
set primary group script = /usr/sbin/smbldap-usermod -g '%g' '%u'
add machine script = /usr/sbin/smbldap-useradd -w '%u'
logon script = %U.cmd
logon path =
logon drive = H:
logon home =
domain logons = Yes
preferred master = Yes
domain master = Yes
dns proxy = No
wins support = Yes
ldap admin dn = cn=samba,ou=DSA,dc=ourdomain,dc=com
ldap group suffix = ou=Group
ldap idmap suffix = ou=Idmap
ldap machine suffix = ou=Computers
ldap passwd sync = Yes
ldap suffix = dc=ourdomain,dc=com
ldap user suffix = ou=People
message command = /bin/sh -c '/usr/bin/linpopup "%f" "%m" %s; rm %s' &
panic action = /usr/share/samba/panic-action %d
idmap backend = ldap:ldaps://kingwood.ourdomain.com/
idmap uid = 10000-20000
idmap gid = 10000-20000
template shell = /bin/bash
force unknown acl user = Yes
use sendfile = Yes
dos filemode = Yes
[homes]
comment = Home Directories
read only = No
create mask = 0700
directory mask = 0700
browseable = No
[netlogon]
comment = Network Logon Service
path = /var/lib/samba/netlogon/scripts
guest ok = Yes
share modes = No
root preexec = /var/lib/samba/netlogon/scripts/logon.pl %U %I
[printers]
comment = All Printers
path = /tmp
create mask = 0700
printable = Yes
browseable = No
[print$]
comment = Printer Drivers
path = /var/lib/samba/printers
write list = root, @ntadmin
[web]
comment = Shared Web Hosting
path = /home/web
read only = No
create mask = 0770
directory mask = 0770
Here are some straces (too large to attach):
I'm afraid the straces are useless to diagnose the problem. What you can do is increase the debug level of that one growing process with
smbcontrol <pid> debug 10
and send us the resulting log file. (btw, 80kb is not really large!). Please set 'max log size = 0', don't be afraid to send in many megabytes of log files.
If you're done with logging you can decrease the debug level with
smbcontrol <pid> debug 0
BTW, can you try to set 'max stat cache size = 1000' and see if it helps?
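The two smbcontrol steps above could be wrapped in a small helper so the level-10 window stays short and the level is always lowered again. This is a sketch only: `debug_window` and the `SMBCONTROL` override variable are hypothetical names; the only Samba commands used are the two `smbcontrol <pid> debug N` invocations quoted above.

```shell
# Hypothetical wrapper (not part of Samba). SMBCONTROL can be overridden,
# e.g. with `echo`, to do a dry run without touching a real smbd.
SMBCONTROL="${SMBCONTROL:-smbcontrol}"

# Raise the debug level of one smbd process to 10 for SECONDS seconds,
# then drop it back to 0.
# Usage: debug_window PID SECONDS
debug_window() {
    pid="$1"
    seconds="$2"
    "$SMBCONTROL" "$pid" debug 10 || return 1
    sleep "$seconds"
    "$SMBCONTROL" "$pid" debug 0
}
```

For example, `debug_window 8083 30` would log at level 10 for 30 seconds for the process shown earlier. The log still lands in the file given by 'log file', so 'max log size = 0' should be set beforehand as described above.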
Thanks. I've added the two settings; unfortunately, the smbd process is still growing gradually.
I took two different log snapshots at level 10. These are approx. 100MB logs (uncompressed), during which the smbd process grew about 200-500k in memory. I'm watching the process grow by 3MB now.
I'll do this again once it reaches the threshold where fileserving fails.
It's grown now to 58MB, so this slow growth in memory still continues.
If you'd like more logs, let me know.
Just to give you feedback: The logs are more helpful, thanks. I tried to reproduce a memleak for some of the more unusual calls that your client makes. In particular, it queries the file's security descriptor and it tries to connect to the [web] share as guest, which fails. Neither showed an obvious memleak here.
I don't have the time right now to check all the calls, so it might take a bit.
Alright, I understand it's a lot to debug with the 100MB debug files. What can we do to narrow down the problem so it's easier to locate the call? I need to bring some more stability to this server soon.
I've tried watching the process in top, and ran a debug 10 for just a few seconds while the process grew a little. That log is now at,
I'm also not sure why the DFS calls appear in the log, so I disabled DFS on the Samba server with 'host msdfs = no' (and there are no DFS roots defined). That had a bad effect: the client could no longer connect to the fileshares via a UNC path, which is very odd since it's not a DFS tree.
Since that's something unexpected, can you check the DFS calls in the logs for memleaks?
There has to be something different about this setup than most, because Samba's known for its stability. My guess is that it's either LDAP related, or down to the behaviour of the client, which is win2k/IIS5 and for which the limit on MaxMpxCt is lifted to 2048 (instead of the default of 50) via the setting 'max mux = 2048'.
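Watching the process in top by hand, as described above, could probably be replaced by a small poll loop that reports each time the resident size jumps, so a short debug 10 window can be started right at that moment. This is a rough sketch assuming a Linux /proc with a VmRSS line in /proc/<pid>/status; all function names and the threshold are hypothetical.

```shell
# Hypothetical helpers; names and thresholds are illustrative only.

# Print VmRSS (in kB) from a /proc/<pid>/status-style file.
rss_kb_from_status() {
    awk '/^VmRSS:/ { print $2 }' "$1"
}

# Succeed (exit 0) when CUR_KB exceeds PREV_KB by at least THRESHOLD_KB.
# Usage: grew_by PREV_KB CUR_KB THRESHOLD_KB
grew_by() {
    [ $(( $2 - $1 )) -ge "$3" ]
}

# Poll a process and print a timestamped line each time it has grown by
# THRESHOLD_KB since the last report. Stops when the process goes away.
# Usage: watch_growth PID THRESHOLD_KB INTERVAL_SECONDS
watch_growth() {
    pid="$1"; threshold_kb="$2"; interval="$3"
    prev=$(rss_kb_from_status "/proc/$pid/status") || return 1
    [ -n "$prev" ] || return 1
    while [ -r "/proc/$pid/status" ]; do
        sleep "$interval"
        cur=$(rss_kb_from_status "/proc/$pid/status") || break
        [ -n "$cur" ] || break
        if grew_by "$prev" "$cur" "$threshold_kb"; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') pid $pid rss ${prev}kB -> ${cur}kB"
            prev="$cur"
        fi
    done
}
```

For example, `watch_growth 8083 256 5` would print a line whenever the process gains another 256 kB, checked every 5 seconds. Each printed line marks a point where `smbcontrol <pid> debug 10` could be enabled briefly to catch the calls being made while the process grows.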
I suspect this bug is fixed in 3.2, as the affected parts (enhanced by the patch proposed by Andrew Bartlett) have changed a lot since then.