Since upgrading to Debian lenny (and hence samba 3.0->3.2), smbd has occasionally been filling /var/log/samba/$host.log with messages like:

[2009/06/17 16:12:03, 0] smbd/notify_inotify.c:inotify_handler(239)
  No data on inotify fd?!
[2009/06/17 16:12:03, 0] smbd/notify_inotify.c:inotify_handler(239)
  No data on inotify fd?!

This results in 100% CPU usage by the process and (eventually) /var/log filling up, despite the "max log size = 1000" setting in smb.conf.

As best I can tell, it seems to be due to smbd running out of fds, with something like 9000 entries in /proc/$PID/fd/. (I've been more interested in killing the processes and getting the disk space back than diagnosing.)

That in turn seems to be due to smbd having thousands of fds open for a single Excel spreadsheet, e.g. from a currently running process:

$ sudo ls -l /proc/22824/fd |cut -d\> -f2|sort|uniq -c|sort -n|tail -n2
      2 /var/log/samba/log.win
   1921 /srv/SHARE/PATH/FILE.Xls

It's possibly happening with .Doc files too, but to a much lesser extent (from the last crash earlier today, one of the smbd instances had 200 fds pointing at a .Doc file, along with 2345 fds pointing to an .Xls).

(Since writing the above paragraph, pid 22824 has hit 2157 fds for that .Xls. It's now up to 2201 fds; timing it indicates it's gaining 6 in 10s.)

Even better, when the spreadsheet gets saved in Excel, the fds all get closed, though they immediately start getting reopened at a rate of 2 or 3 every five seconds.

smbstatus only reports the .Xls opened once.

(This is Excel 2003 running under Windows XP, using a spreadsheet on a mapped drive, F:\PATH\FILE. The filename has alphanumerics, spaces, round brackets and dots in it, if that matters.)
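For anyone who wants to watch the leak without parsing `ls -l` output, the per-file fd count above can be sketched in Python roughly as follows. This is only an illustration, not anything from Samba: the function name `count_fd_targets` is made up here, and it assumes a Linux /proc layout where each entry under /proc/$PID/fd is a symlink to the open file.

```python
import os
from collections import Counter


def count_fd_targets(fd_dir):
    """Count how many descriptors under fd_dir point at each target.

    fd_dir is expected to look like /proc/<pid>/fd, i.e. a directory
    whose entries are symlinks to the files the process has open.
    (Hypothetical helper for illustration, not part of Samba.)
    """
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd was closed between listdir() and readlink(),
            # or the entry is not a symlink; skip it.
            continue
        counts[target] += 1
    return counts
```

Calling `count_fd_targets("/proc/22824/fd").most_common(2)` as root should give roughly the same two lines as the shell pipeline. Running it twice a known number of seconds apart and comparing the counts for the spreadsheet would give the growth rate.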
Hmm, I turned oplocks off completely shortly after filing the bug, and it seems to have stopped misbehaving. (level2 oplocks had been turned off some time ago)
I'd like to see a network trace of the misbehaviour. If you're brave, a comparative trace with the same file on Windows would be outright brilliant :-)

Volker
We've just started hitting this (RHEL5.1, kernel 2.6.18-53.el5PAE, samba 3.0.25b-0.el5.4). Last time, it happened in the early evening and we didn't see it until the morning; by that time, /var/log/messages was 16.5g! We've got oplocks turned off, and it doesn't seem to be helping. Is it possible that this is another aspect of #6693, and if so, might the fix for #6693 also fix this?
Our company's appliance has experienced this problem as well on two occasions. Our setup is a SLES 10 based kernel (2.6.16.21) running a slightly modified version of 3.0.34. If anyone has any additional information on how to reproduce this reliably, or some debug logs or network captures, we may be able to lend a hand.
Possibly related (1000+ fds and full logs, but different error output): https://bugzilla.samba.org/show_bug.cgi?id=7624 My bug started around the same time as this one was reported.