Bug 6284 - files disappearing randomly - kernel 2.6.27 cifs.ko 1.54 - VFS: No response for cmd 50
Status: RESOLVED WONTFIX
Product: CifsVFS
Classification: Unclassified
Component: kernel fs
Version: 2.6
Hardware: x86 Linux
Importance: P3 normal
Target Milestone: ---
Assigned To: Jeff Layton
Depends on:
Blocks:
Reported: 2009-04-22 03:02 UTC by Jodok Ole Muellers
Modified: 2012-04-28 00:49 UTC (History)

Attachments
tcpdump -i bond0 -p -s 1500 -C1 -w prod-fs.cap port 445 or port 139 and host 10.2.1.155 (319.07 KB, application/octet-stream)
2009-04-22 03:05 UTC, Jodok Ole Muellers
output of "strace -fo ls.log ls -1 | wc -l" attached (5.29 KB, text/plain)
2009-04-22 03:06 UTC, Jodok Ole Muellers

Description Jodok Ole Muellers 2009-04-22 03:02:56 UTC
One of our SMB mounts sometimes shows only a fraction of the files in a directory.
For example, we run "ls -1 | wc -l" and sometimes see only 148 files, although the directory contains 2074 files.
Just a few seconds later, "ls -1 | wc -l" shows a different number of files again.

vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
748

vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
1348

vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
148

vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
298

vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
298

vam-uti3:/data/dpa2msu/red-fs2/bild/agentur/DPA # ls -1 | wc -l
148
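
The fluctuation shown above can also be sampled with a short script instead of rerunning the command by hand. This is a minimal sketch, assuming Python is available on the client; `sample_counts` and its parameters are illustrative names and not part of the original report.

```python
import os
import time


def sample_counts(path, samples=6, delay=2.0):
    """Return the number of directory entries seen on each of `samples` listings.

    Note: os.listdir() also counts dotfiles, which "ls -1" hides, so the
    absolute numbers may differ slightly from the shell output.
    """
    counts = []
    for i in range(samples):
        counts.append(len(os.listdir(path)))
        if i < samples - 1:
            time.sleep(delay)  # give the mount time to change between samples
    return counts
```

On a healthy mount with no concurrent writers, every element of the returned list should be the same; a list with differing values would match the behaviour described here.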

The problem occurs on these kernel versions:
* kernel 2.6.27-gentoo-r10  and cifs.ko 1.54
* kernel 2.6.29-gentoo-r1   and cifs.ko 1.57

It is always followed by something like this in /var/log/messages:
Apr 21 14:30:33 vam-uti1 CIFS VFS: No response for cmd 50 mid 36461
Apr 21 14:32:25 vam-uti1 CIFS VFS: No response for cmd 50 mid 20610
Apr 21 14:32:46 vam-uti1 CIFS VFS: No response for cmd 114 mid 20611
Apr 21 14:33:07 vam-uti1 CIFS VFS: No response for cmd 114 mid 20613
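
For reference, the "cmd" values in these messages are SMB1 command codes in decimal: 50 is 0x32 (SMB_COM_TRANSACTION2, which carries the directory search requests among other things) and 114 is 0x72 (SMB_COM_NEGOTIATE, the first request sent on a new connection). A small sketch for decoding such lines follows; the regular expression is inferred from the four lines above, and `decode` is an illustrative helper name.

```python
import re

# SMB1 command codes; only the two that appear in this report are listed.
SMB_COMMANDS = {
    0x32: "SMB_COM_TRANSACTION2",  # 50, used for FIND_FIRST2/FIND_NEXT2 and others
    0x72: "SMB_COM_NEGOTIATE",     # 114, sent when a connection is (re)established
}

# Pattern assumed from the log lines quoted in this report.
LINE_RE = re.compile(r"CIFS VFS: No response for cmd (\d+) mid (\d+)")


def decode(line):
    """Return (cmd, command_name, mid) for a matching log line, else None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    cmd, mid = int(match.group(1)), int(match.group(2))
    return cmd, SMB_COMMANDS.get(cmd, "unknown"), mid
```

A timed-out TRANSACTION2 followed by timed-out NEGOTIATE requests would be consistent with a directory listing being interrupted and the client then trying to reconnect, although the log alone does not prove that.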
Comment 1 Jodok Ole Muellers 2009-04-22 03:05:16 UTC
Created attachment 4077 [details]
tcpdump -i bond0 -p -s 1500 -C1 -w prod-fs.cap port 445 or port 139 and host 10.2.1.155

tcpdump capture attached.
Comment 2 Jodok Ole Muellers 2009-04-22 03:06:12 UTC
Created attachment 4078 [details]
output of "strace -fo ls.log ls -1  | wc -l" attached

output of "strace -fo ls.log ls -1  | wc -l" attached
Comment 3 Steve French 2009-05-19 22:17:48 UTC
Is this still reproducible?
Comment 4 Jeff Layton 2009-05-20 05:12:30 UTC
Looks like the server is frequently resetting the connection. Do we have a mechanism to restart the search when a reconnection event occurs? If not, that's what may be needed here.

Of course, if the server is dropping connections as much as this capture shows, that may not help (and performance will suck anyway).
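
Until something like that exists in the kernel, a caller could approximate "restart the search" in userspace by repeating the listing until two consecutive attempts agree. This is only a sketch of that workaround, not the kernel-side mechanism discussed above; `list_until_stable` and its parameters are hypothetical names.

```python
import os
import time


def list_until_stable(path, max_attempts=5, delay=1.0, lister=os.listdir):
    """Repeat a directory listing until two consecutive results are identical.

    `lister` is injectable so the function can be exercised without a mount.
    Raises OSError if no two consecutive listings agree within max_attempts.
    """
    previous = None
    for _ in range(max_attempts):
        try:
            current = sorted(lister(path))
        except OSError:
            current = None  # treat a failed listing as "no usable result"
        if current is not None and current == previous:
            return current
        previous = current
        time.sleep(delay)
    raise OSError(f"listing of {path!r} did not stabilise after {max_attempts} attempts")
```

This heuristic can be fooled: the transcript in the description shows the same truncated count (298) twice in a row, so two matching listings do not guarantee a complete one.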
Comment 5 Jodok Ole Muellers 2009-05-20 08:33:30 UTC
Yes, I am still able to reproduce the problem on this machine

Linux vam-uti1 2.6.25-gentoo-r9 #1 SMP Mon Nov 10 10:48:10 CET 2008 i686 Intel(R) Xeon(TM) CPU 2.80GHz GenuineIntel GNU/Linux

filename:       /lib/modules/2.6.25-gentoo-r9/kernel/fs/cifs/cifs.ko
version:        1.52
description:    VFS to access servers complying with the SNIA CIFS Specification e.g. Samba and Windows
license:        GPL
author:         Steve French <sfrench@us.ibm.com>
srcversion:     1B0D147F2ACB24FC739C683
depends:        
vermagic:       2.6.25-gentoo-r9 SMP mod_unload PENTIUM4 
parm:           CIFSMaxBufSize:Network buffer size (not including header). Default: 16384 Range: 8192 to 130048 (int)
parm:           cifs_min_rcv:Network buffers in pool. Default: 4 Range: 1 to 64 (int)
parm:           cifs_min_small:Small network buffers in pool. Default: 30 Range: 2 to 256 (int)
parm:           cifs_max_pending:Simultaneous requests to server. Default: 50 Range: 2 to 256 (int)


Please let me know if you need further information
(tcpdump, logs, whatever).

TIA Jodok Ole Muellers
Comment 6 Jeff Layton 2009-07-26 08:42:48 UTC
How about with a newer kernel? Say something ~2.6.30-ish?

Comment 7 Jodok Ole Muellers 2009-07-27 03:49:42 UTC
Hello,

Unfortunately, we can't switch to a newer kernel, since this machine
is our production machine and we do not want to touch it.

We also believe that this problem is caused by our Windows Server
"frequently resetting the connection".

The only problem I see on the Samba side is that cifs.ko
should report an I/O error if there is an interruption instead
of just aborting.

Thanks
Jodok
Comment 8 Jeff Layton 2012-04-28 00:49:43 UTC
Ok, I'm not sure we can do much here if you're not able to move to a newer
kernel. If you can reproduce this on something more recent, then I'd be
more interested. We've fixed quite a few bugs in the readdir code over the
last several years, so it's possible this is already fixed, but without
some more investigation it's difficult to know for sure.

Please reopen if you have a way to experiment with more recent kernels.