Created attachment 9640 [details] Wireshark trace between client and server Hi, I'm getting a lot of CIFS errors in my system log: Feb 4 23:13:09 RockyUbu kernel: [ 6827.119992] CIFS VFS: Send error in Flush = -9 Feb 4 23:13:09 RockyUbu kernel: [ 6827.709338] CIFS VFS: Send error in Flush = -9 Feb 4 23:13:10 RockyUbu kernel: [ 6828.764272] CIFS VFS: Send error in Flush = -9 Usually 10-20 messages will appear at once like this. This had happened occasionally when I was on Ubuntu 12.04 LTS, but since upgrading to Ubuntu 13.10 they are appearing daily. Modinfo cifs shows cifs.ko at version 2.01, and cifs-utils is at 6.0. The share is mounted via fstab: //rocky/share /media/tm cifs credentials=/home/jeff/.smbcredentials,_netdev,uid=jeff,gid=jeff,file_mode=0777,dir_mode=0777,iocharset=utf8,sec+ntlm 0 0 I was asked to create a wireshark trace -- it's my first time so if it was done incorrectly then please let me know and I'll redo it. The trace is attached, and the times in which the "Flush = -9" error occurs are as follows: Feb 4 23:13:09 RockyUbu kernel: [ 6827.119992] CIFS VFS: Send error in Flush = -9 Feb 4 23:13:09 RockyUbu kernel: [ 6827.709338] CIFS VFS: Send error in Flush = -9 Feb 4 23:13:10 RockyUbu kernel: [ 6828.764272] CIFS VFS: Send error in Flush = -9 Feb 4 23:13:11 RockyUbu kernel: [ 6829.319361] CIFS VFS: Send error in Flush = -9 Feb 4 23:13:11 RockyUbu kernel: [ 6829.321144] CIFS VFS: Send error in Flush = -9 Feb 4 23:13:11 RockyUbu kernel: [ 6829.323277] CIFS VFS: Send error in Flush = -9 Feb 4 23:13:11 RockyUbu kernel: [ 6829.819286] CIFS VFS: Send error in Flush = -9 In case it's relevant, both the Linux and Windows installations are virtual machines on the same Hyper-V server host. Thanks!
Ok, so I think the problem here is related to the fact that older kernels used virtual circuits incorrectly: --------------------------------------8<------------------------------------ 4159 19.305071000 192.168.1.106 -> 192.168.1.105 SMB Write AndX Request, FID: 0x8010, 65536 bytes at offset 1466368 4166 19.305292000 192.168.1.105 -> 192.168.1.106 SMB Write AndX Response, FID: 0x8010, 65536 bytes 4205 19.308942000 192.168.1.106 -> 192.168.1.105 SMB Write AndX Request, FID: 0x8010, 65536 bytes at offset 1531904 4212 19.309332000 192.168.1.105 -> 192.168.1.106 SMB Write AndX Response, FID: 0x8010, 65536 bytes 4252 19.311887000 192.168.1.106 -> 192.168.1.105 SMB Write AndX Request, FID: 0x8010, 65536 bytes at offset 1597440 4253 19.313820000 192.168.1.105 -> 192.168.1.106 SMB [TCP ACKed unseen segment] Write AndX Response, FID: 0x8010, 65536 bytes ...above here, everything is happily chugging along... 4254 19.315560000 192.168.1.106 -> 192.168.1.105 SMB Negotiate Protocol Request 4255 19.318554000 192.168.1.105 -> 192.168.1.106 SMB Negotiate Protocol Response 4256 19.318676000 192.168.1.106 -> 192.168.1.105 SMB Session Setup AndX Request, User: \timemachine 4257 19.320162000 192.168.1.105 -> 192.168.1.106 SMB Session Setup AndX Response 4258 19.320223000 192.168.1.106 -> 192.168.1.105 SMB Tree Connect AndX Request, Path: \\rocky\timemachine 4259 19.322031000 192.168.1.105 -> 192.168.1.106 SMB Tree Connect AndX Response ...at this point, a new session gets set up on a different socket. I don't see the actual TCP handshake in the capture, so I'm unclear on when that socket was established and how. In any case, the session setup request has VCnum == 0. 4305 19.322999000 192.168.1.106 -> 192.168.1.105 SMB Write AndX Request, FID: 0x8010, 65536 bytes at offset 1728512 4318 19.324054000 192.168.1.105 -> 192.168.1.106 SMB Write AndX Response, Error: STATUS_INVALID_HANDLE ...and after that, the server starts (correctly) returning STATUS_INVALID_HANDLE to requests involving that FID. What's not clear to me is why the client decided to use VCnum == 0 on this session setup, but the point is rather moot since current mainline kernels no longer do that. I think the solution is to ensure that all of your kernels that are using cifs mounts have this patch: commit 9ae6cf606a33b0a762798df0fb742848bcc685b5 Author: Jeff Layton <jlayton@redhat.com> Date: Mon Sep 16 11:23:45 2013 -0400 cifs: stop trying to use virtual circuits I'm going to go ahead and close this as FIXED on the assumption that it will be after you apply that patch. Please do reopen the bug if that doesn't fix it and I'll take a closer look.
...and I should note that it may not be sufficient to only patch the client that's seeing these errors. If you have several clients connecting to the server through NAT, then you'll need to patch all of them. See the patch description for details as to why that is.