One of our SMB mounts sometimes shows only a small number of files. For example, running "ls -1 | wc -l" sometimes reports only 148 files although the directory contains 2074 files. Just a few seconds later, "ls -1 | wc -l" will again show a different number of files:

vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
748
vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
1348
vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
148
vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
298
vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
298
vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
148

The problem emerges on these kernel versions:
* kernel 2.6.27-gentoo-r10 and cifs.ko 1.54
* kernel 2.6.29-gentoo-r1 and cifs.ko 1.57

It is always followed by something like this in /var/log/messages:

Apr 21 14:30:33 vam-uti1 CIFS VFS: No response for cmd 50 mid 36461
Apr 21 14:32:25 vam-uti1 CIFS VFS: No response for cmd 50 mid 20610
Apr 21 14:32:46 vam-uti1 CIFS VFS: No response for cmd 114 mid 20611
Apr 21 14:33:07 vam-uti1 CIFS VFS: No response for cmd 114 mid 20613
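The fluctuating counts above can be reproduced with a small sampling loop. This is a hypothetical helper (not part of the original report); the path in the usage comment is the mount point from this report, and a stable filesystem should print the same count on every sample:

```shell
# sample_counts DIR N: print N successive "ls -1 | wc -l" counts for DIR.
# On the affected CIFS mount the numbers vary between samples.
sample_counts() {
    dir="$1"
    n="${2:-6}"
    i=1
    while [ "$i" -le "$n" ]; do
        # Each sample is an independent readdir of the whole directory.
        printf 'sample %d: %s entries\n' "$i" "$(ls -1 "$dir" | wc -l)"
        i=$((i + 1))
    done
}

# Example usage (the directory from this report):
# sample_counts /data/dpa2msu/red-fs2/bild/agentur/DPA 6
```

Inserting a `sleep` between samples makes the drift over time easier to see, matching the "a few seconds later" behaviour described above.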
Created attachment 4077 [details]
tcpdump -i bond0 -p -s 1500 -C1 -w prod-fs.cap port 445 or port 139 and host 10.2.1.155

tcpdump capture attached.
Created attachment 4078 [details]
output of "strace -fo ls.log ls -1 | wc -l"

Output of "strace -fo ls.log ls -1 | wc -l" attached.
Is this still reproducible?
Looks like the server is frequently resetting the connection. Do we have a mechanism to restart the search when a reconnection event occurs? If not, that's what may be needed here. Of course, if the server is dropping connections as much as this capture shows, that may not help (and performance will suck anyway).
Yes, I am still able to reproduce the problem on this machine:

Linux vam-uti1 2.6.25-gentoo-r9 #1 SMP Mon Nov 10 10:48:10 CET 2008 i686 Intel(R) Xeon(TM) CPU 2.80GHz GenuineIntel GNU/Linux

filename:       /lib/modules/2.6.25-gentoo-r9/kernel/fs/cifs/cifs.ko
version:        1.52
description:    VFS to access servers complying with the SNIA CIFS Specification e.g. Samba and Windows
license:        GPL
author:         Steve French <sfrench@us.ibm.com>
srcversion:     1B0D147F2ACB24FC739C683
depends:
vermagic:       2.6.25-gentoo-r9 SMP mod_unload PENTIUM4
parm:           CIFSMaxBufSize:Network buffer size (not including header). Default: 16384 Range: 8192 to 130048 (int)
parm:           cifs_min_rcv:Network buffers in pool. Default: 4 Range: 1 to 64 (int)
parm:           cifs_min_small:Small network buffers in pool. Default: 30 Range: 2 to 256 (int)
parm:           cifs_max_pending:Simultaneous requests to server. Default: 50 Range: 2 to 256 (int)

Please let me know if you need further information (tcpdump, logs, whatever).

TIA
Jodok Ole Muellers
How about with a newer kernel? Say something ~2.6.30-ish?
Hello,

Unfortunately we can't switch to a newer kernel, since this machine is our production machine and we do not want to touch it. We further believe that this problem is caused by our Windows server "frequently resetting the connection". The only problem I see on the cifs side is that cifs.ko should report an I/O error if the connection is interrupted, instead of just aborting and returning a truncated listing.

Thanks
Jodok
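The complaint above is that an interrupted readdir should surface as an error to userspace rather than a silently short listing. As a quick check of whether the error is being propagated, one can test the exit status of `ls` itself. This is a hypothetical helper sketch (the function name and structure are not from this report):

```shell
# count_or_fail DIR: print the entry count of DIR, or report failure and
# return non-zero if ls itself failed. If cifs.ko propagated an I/O error
# on an interrupted listing, the failure branch would be taken instead of
# printing a truncated count.
count_or_fail() {
    if entries=$(ls -1 "$1" 2>/dev/null); then
        if [ -n "$entries" ]; then
            printf '%s\n' "$entries" | wc -l
        else
            echo 0
        fi
    else
        echo "ls failed for $1" >&2
        return 1
    fi
}
```

In the behaviour described in this report, `ls` exits successfully even when the listing is incomplete, so a wrapper like this cannot distinguish a truncated result from a correct one; that is exactly the reporter's point.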
Ok, I'm not sure we can do much here if you're not able to move to a newer kernel. If you can reproduce this on something more recent, then I'd be more interested. We've fixed quite a few bugs in the readdir code over the last several years, so it's possible this is already fixed, but without some more investigation it's difficult to know for sure. Please reopen if you have a way to experiment with more recent kernels.