Hello,

I'm running Debian Wheezy (x64) on a couple of servers and Windows 7 (x64) on workstations. One server (server A) is a NAS running Samba 4.1.17 (from Debian Backports) and NFS 1.2.6 (from Debian). It shares the "home" directories to other servers through NFS and to workstations through Samba.

Samba share configuration is:

    [homes]
    comment = Home Directories
    valid users = %S
    # Read/Write/Delete
    browseable = no
    read only = no
    writable = yes
    create mask = 0664
    directory mask = 0775
    delete readonly = yes
    # Locking
    locking = no
    oplocks = yes
    # Attributes/ACL
    inherit permissions = yes
    map acl inherit = yes
    nt acl support = yes
    store dos attributes = yes
    inherit owner = yes
    map hidden = no
    map system = no
    map archive = no

NFS share configuration is:

    /volumes *(rw,async,fsid=0,no_subtree_check,no_root_squash,crossmnt)

All user files are in /volumes/users. Home directories are /volumes/users/<user>.

On a workstation running Windows 7, I mount the user "home" directory. I open a text editor to create/edit a text file.

On server B, I mount the NFS share with options: "_netdev,rw,noatime,nodiratime,vers=4,hard,intr,timeo=5,actimeo=5,retrans=2,bg,acl". So I use version 4 of NFS. On server B, I run "cat" in a loop on the text file edited on the workstation, so I'm "always" reading the file through NFS.

On the workstation, I save the file, so Samba issues a write to the file. The text editor freezes, the smbd process associated with the workstation exits 1 or 2 minutes later, another smbd process starts shortly after, and the file is actually saved 4 or 5 minutes after the save command was issued. The editor comes back to life once the save is done.

No relevant info in the log file using log level 2.

Editing the file using vi on server A: no problem. Switching to NFS 3: no problem.

We recently upgraded from Samba 3 to Samba 4 and from NFS 3 to NFS 4. I spent 2 months tracking this issue. It's now easy to reproduce but really hard to find what's going wrong...
Is this really a crash, as the subject line of the bug clearly states?
I have no clear way to answer, but the smbd processes are not the same before and after the problem. Here are some log lines:

    [2015/07/07 15:20:04.157575, 2, pid=23609, effective(21XXX, 21XXX), real(21XXX, 0)] olivier opened file xxxxxx read=Yes write=No (numopen=1)
    [2015/07/07 15:20:14.258504, 2, pid=23609, effective(21XXX, 21XXX) real(21XXX, 0)] unix_mode(xxxxxx) inheriting from yyyyyy
    [2015/07/07 15:20:14.258564, 2, pid=23609, effective(21XXX, 21XXX), real(21XXX, 0)] unix_mode(xxxxxx) inherit mode 40775
    [2015/07/07 15:20:15.934795, 2, pid=23609, effective(21XXX, 21XXX), real(21XXX, 0)] olivier closed file xxxxxx (numopen=0) NT_STATUS_OK
    [2015/07/07 15:24:21.939797, 1, pid=23609, effective(0, 0), real(0, 0)] bs-p007 (ipv4:10.48.X.Y:52753) closed connection to service olivier
    [2015/07/07 15:24:22.058982, 2, pid=24667, effective(0, 0), real(0, 0)] bs-p007 (ipv4:10.48.X.Y:53366) connect to service olivier initially as user olivier (uid=21XXX, gid=21XXX) (pid 24667)
    [2015/07/07 15:24:22.060956, 2, pid=24667, effective(21XXX, 21XXX), real(21XXX, 0)] unix_mode(xxxxxx) inheriting from yyyyyy
    [2015/07/07 15:24:22.060994, 2, pid=24667, effective(21XXX, 21XXX), real(21XXX, 0)] unix_mode(xxxxxx) inherit mode 40775
    [2015/07/07 15:24:22.061314, 2, pid=24667, effective(21XXX, 21XXX), real(21XXX, 0)] olivier opened file xxxxxx read=No write=Yes (numopen=1)
    [2015/07/07 15:24:22.080432, 2, pid=24667, effective(21XXX, 21XXX), real(21XXX, 0)] olivier closed file xxxxxx (numopen=0) NT_STATUS_OK

At 15:20:15.934795 the editor freezes. At 15:24:21.939797 the editor is back. The write is done by the second process.
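To put a number on the freeze described above, the gap between consecutive log entries can be computed from the timestamps. The following is only a minimal Python sketch: the function names are made up for illustration, and it assumes every entry starts with a header of the form "[YYYY/MM/DD HH:MM:SS.ffffff, ...". The sample text is copied from the excerpt in this comment.

```python
import re
from datetime import datetime

# Timestamp at the start of a log header such as
# "[2015/07/07 15:20:15.934795, 2, pid=23609, ...]".
STAMP = re.compile(r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{6}),")


def parse_stamps(log_text):
    """Return every header timestamp found in log_text, in order."""
    return [datetime.strptime(s, "%Y/%m/%d %H:%M:%S.%f")
            for s in STAMP.findall(log_text)]


def largest_gap(log_text):
    """Return (seconds, start, end) for the longest silence between entries."""
    stamps = parse_stamps(log_text)
    start, end = max(zip(stamps, stamps[1:]), key=lambda p: p[1] - p[0])
    return (end - start).total_seconds(), start, end


# Three entries taken from the log excerpt in this comment.
SAMPLE_LOG = """\
[2015/07/07 15:20:14.258564, 2, pid=23609, effective(21XXX, 21XXX), real(21XXX, 0)] unix_mode(xxxxxx) inherit mode 40775
[2015/07/07 15:20:15.934795, 2, pid=23609, effective(21XXX, 21XXX), real(21XXX, 0)] olivier closed file xxxxxx (numopen=0) NT_STATUS_OK
[2015/07/07 15:24:21.939797, 1, pid=23609, effective(0, 0), real(0, 0)] bs-p007 (ipv4:10.48.X.Y:52753) closed connection to service olivier
"""

if __name__ == "__main__":
    seconds, start, end = largest_gap(SAMPLE_LOG)
    print(f"longest gap: {seconds:.3f} s, from {start.time()} to {end.time()}")
```

On the sample, the longest gap is the one between 15:20:15.934795 and 15:24:21.939797, a little over four minutes, which matches the "4 or 5 minutes" observed on the workstation.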
I tried on another server and could not reproduce the problem there. I need to compare the configurations.
I now have a minimal test case: two new virtual machines installed with Debian Wheezy show the same problem, and I have a clean way to reproduce it.
(In reply to Olivier Monaco from comment #4) Can you attach to the smbd that stalls with strace -ttT?
Created attachment 11266
Strace output
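When reading a trace taken with strace -ttT, the -T option appends the time spent in each system call as "<seconds>" at the end of the line, so stalled calls can be filtered out mechanically. The following is a minimal Python sketch of such a filter. The function name and the threshold are assumptions, and the sample lines are invented for illustration only: they are NOT taken from the attachment and say nothing about the actual cause.

```python
import re

# strace -T appends the time spent in the syscall as "<seconds>" at line end.
DURATION = re.compile(r"<(\d+\.\d+)>\s*$")


def slow_calls(strace_text, threshold=1.0):
    """Return (seconds, line) pairs for calls at or above threshold, slowest first."""
    hits = []
    for line in strace_text.splitlines():
        match = DURATION.search(line)
        if match and float(match.group(1)) >= threshold:
            hits.append((float(match.group(1)), line.strip()))
    return sorted(hits, reverse=True)


# Invented sample in "strace -ttT" layout (hypothetical, not from attachment 11266).
SAMPLE_TRACE = """\
08:24:40.236100 stat("a.txt", {st_mode=S_IFREG|0664, st_size=12, ...}) = 0 <0.000021>
08:24:40.236300 select(40, [9 37], NULL, NULL, {35, 0}) = 0 (Timeout) <35.030112>
08:25:16.284500 stat("a.txt", {st_mode=S_IFREG|0664, st_size=12, ...}) = 0 <0.000019>
"""

if __name__ == "__main__":
    for seconds, line in slow_calls(SAMPLE_TRACE):
        print(f"{seconds:10.6f}  {line}")
```

Lines without a trailing duration (for example "<unfinished ...>" entries) are simply skipped, so a call that never returns will not show up in this list and has to be looked for separately.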
I replaced the Windows workstation with smbclient. So now:
- server A has the Samba and NFS shares and uses smbclient to access the Samba share.
- server B has the NFS mount.

My use case:
1) Read a file in a loop on server B (while true; do cat a.txt; done)
2) Run smbclient from server A to open the share of server A
3) Download the a.txt file
4) Upload the a.txt file

Smbclient takes around 30 seconds and ends with "NT_STATUS_IO_TIMEOUT opening remote file \a.txt"

Log of Samba:

    ...
    [2015/07/17 08:24:40.235204, 2, pid=11322, effective(21136, 21176), real(21136, 0)] unix_mode(a.txt) inheriting from .
    [2015/07/17 08:24:40.236024, 2, pid=11322, effective(21136, 21176), real(21136, 0)] unix_mode(a.txt) inherit mode 40775
    [2015/07/17 08:25:16.284652, 2, pid=11322, effective(21136, 21176), real(21136, 0)] unix_mode(a.txt) inheriting from .
    [2015/07/17 08:25:16.285541, 2, pid=11322, effective(21136, 21176), real(21136, 0)] unix_mode(a.txt) inherit mode 40775
    [2015/07/17 08:25:16.290464, 2, pid=11322, effective(21136, 21176), real(21136, 0)] olivier opened file a.txt read=Yes write=No (numopen=1)
    [2015/07/17 08:25:16.293384, 2, pid=11322, effective(21136, 21176), real(21136, 0)] olivier closed file a.txt (numopen=0) NT_STATUS_OK
    ...

Strace output attached.

So smbd does not crash. It's a timeout. When using a Windows box, there may be some "connection reset" from Windows that stops the smbd process and starts a new one.
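Steps 2 to 4 above can be scripted so that the duration of the download/upload round trip is measured the same way on every run. The following is a minimal Python sketch, not the exact commands used in this report: the function names are made up, and the service name and user in the usage note are placeholders. It relies only on smbclient's -U (user) and -c (semicolon-separated command string) options.

```python
import subprocess
import time


def build_smbclient_cmd(service, user, filename):
    """Build an smbclient call that downloads, then re-uploads, one file."""
    return ["smbclient", service, "-U", user,
            "-c", f"get {filename}; put {filename}"]


def timed_round_trip(service, user, filename):
    """Run the get/put round trip; return (elapsed seconds, exit code, output).

    smbclient asks for the password itself unless credentials are supplied
    some other way; that part is deliberately left out of this sketch.
    """
    started = time.monotonic()
    result = subprocess.run(build_smbclient_cmd(service, user, filename),
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            text=True)
    return time.monotonic() - started, result.returncode, result.stdout
```

For example, timed_round_trip("//serverA/olivier", "olivier", "a.txt") (hypothetical host and share names) would be run on server A while the "cat" loop from step 1 is running on server B. An elapsed time of roughly 30 seconds together with NT_STATUS_IO_TIMEOUT in the output would correspond to the failure described here.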
The problem seems to be linked to the use of "mount -o bind" and not to NFS.

On server A, I have a folder named /volumes/data/test with files to share. I use "mount -o bind" to mount /volumes/data/test on /shares/test and then share this folder.

When I remove the "mount -o bind", no problem. If I run the "while..." loop on server A, no problem. If I write to the same file on server A (without Samba), no problem.

In Samba, sharing /volumes/data/test (the original folder) or /shares/test (the "bind") gives the same timeout.
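When comparing a server that shows the problem with one that does not, it can help to list which mount points are bind mounts of a subdirectory. In /proc/self/mountinfo the fourth field is the root of the mount within its filesystem, and it differs from "/" when a subdirectory such as /volumes/data/test is bind-mounted elsewhere. The following is a minimal Python sketch under that assumption. The function name is made up, the sample lines are hypothetical (device names and mount IDs are invented), and other setups such as btrfs subvolumes can also show a root other than "/".

```python
def bind_like_mounts(mountinfo_text):
    """Return (mount_point, root, source) for entries whose root is not '/'."""
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # A "-" separates the optional fields from fs type, source, options.
        if "-" not in fields:
            continue
        sep = fields.index("-")
        if sep < 6 or len(fields) < sep + 3:
            continue  # malformed line, skip it
        root, mount_point = fields[3], fields[4]
        source = fields[sep + 2]
        if root != "/":
            found.append((mount_point, root, source))
    return found


# Hypothetical mountinfo content for a layout like the one described above.
SAMPLE_MOUNTINFO = """\
20 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw
35 20 8:17 / /volumes rw,relatime - ext4 /dev/sdb1 rw
36 20 8:17 /data/test /shares/test rw,relatime - ext4 /dev/sdb1 rw
"""

if __name__ == "__main__":
    for mount_point, root, source in bind_like_mounts(SAMPLE_MOUNTINFO):
        print(f"{mount_point} is {root} on {source}")
```

On a real system the input would be the content of /proc/self/mountinfo, for example read with open("/proc/self/mountinfo").read().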