OS: 2.6.18-238.el5 #1 SMP Tue Jan 4 15:41:11 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
Smb: 3.6.1 (the problem was introduced somewhere near 3.4)

Most open files, as determined by lsof, are held by one process ID:

root      9194     1  0 Jan24 ?  00:00:00 smbd
root      9218  9194  0 Jan24 ?  00:00:00 smbd
root     20938  9194  0 18:14 ?  00:00:00 smbd
root     28223  9194  0 12:16 ?  00:00:04 smbd
jdz      31347  9194  0 13:08 ?  00:03:19 smbd
gle3     31995  9194  0 20:50 ?  00:00:02 smbd

# lsof -p 31347
smbd 31347 jdz 40r DIR 253,0 4096 220856336 /home/jdz/ws/p104/frontend/runtime
smbd 31347 jdz 41r DIR 253,0 4096 201654565 /home/gle3/ws/p104
smbd 31347 jdz 42r DIR 253,0 4096 203849771 /home/jja/Links
smbd 31347 jdz 43r DIR 253,0 4096 200769865 /home/jbi/ws/p104/ddl
...

The process is currently running as user 'jdz', but handles file requests for users such as 'jdz', 'gle3', 'jja' and 'jbi'. It seems to work ok.

However, all users have their home directory mapped to the same share, \\192.168.172.26\homes, with an identical structure. For each user the share points to a different folder, /home/USERCODE/. All users work with svn and check out the same project in their home directory, located at /home/USERCODE/ws/p104.
When user 1 opens test.xlsb on /home/USERCODE/ws/p104, smbstatus shows a lock:

Samba version 3.6.1
PID     Username       Group           Machine
-------------------------------------------------------------------
31347   gle3           Domain Users    ws35 (192.168.172.16)
31347   ale            Domain Users    ws35 (192.168.172.16)
31347   psc            Domain Users    ws35 (192.168.172.16)
31347   tmi            Domain Users    ws35 (192.168.172.16)
20938   Administrator  Administrators  ws54 (192.168.172.17)
31347   oracle         oinstall        ws35 (192.168.172.16)
31347   pho            Domain Users    ws35 (192.168.172.16)
31347   psc            Domain Users    ws35 (192.168.172.16)
31347   Administrator  Administrators  ws35 (192.168.172.16)
31347   hbu            Domain Users    ws35 (192.168.172.16)
31347   jja            Domain Users    ws35 (192.168.172.16)
31347   jdz            Domain Users    ws35 (192.168.172.16)
31347   jle            Domain Users    ws35 (192.168.172.16)
28223   jja            Domain Users    ws75 (192.168.175.53)
31995   gle3           Domain Users    ws48 (192.168.172.118)
31347   bhu            Domain Users    ws35 (192.168.172.16)

Service        pid     machine  Connected at
-------------------------------------------------------
gle3           31347   ws35     Wed Jan 25 20:51:49 2012
bhu            31347   ws35     Wed Jan 25 13:14:52 2012
psc            31347   ws35     Wed Jan 25 14:25:37 2012
Administrator  20938   ws54     Wed Jan 25 18:14:29 2012
oracle         31347   ws35     Wed Jan 25 13:15:14 2012
gle3           31995   ws48     Wed Jan 25 20:50:37 2012
hbu            31347   ws35     Wed Jan 25 13:09:10 2012
Administrator  31347   ws35     Wed Jan 25 14:40:25 2012
jle            31347   ws35     Wed Jan 25 13:44:25 2012
ale            31347   ws35     Wed Jan 25 14:49:35 2012
pho            31347   ws35     Wed Jan 25 13:12:52 2012
jdz            31347   ws35     Wed Jan 25 13:40:45 2012
qbubs-img      31347   ws35     Wed Jan 25 13:41:59 2012
jja            31347   ws35     Wed Jan 25 13:08:54 2012
home_gle       31347   ws35     Wed Jan 25 18:19:25 2012
psc            31347   ws35     Wed Jan 25 13:49:54 2012
jdz            31347   ws35     Wed Jan 25 13:08:46 2012
jja            28223   ws75     Wed Jan 25 12:16:27 2012
tmi            31347   ws35     Wed Jan 25 13:24:38 2012

Locked files:
Pid    Uid    DenyMode    Access   R/W   Oplock           SharePath   Name                  Time
--------------------------------------------------------------------------------------------------
31347  10001  DENY_WRITE  0x2019f  RDWR  NONE             /home/gle3  ws/p104/ddl/x.xlsb    Wed Jan 25 20:52:30 2012
31347  10001  DENY_WRITE  0x3019f  RDWR  EXCLUSIVE+BATCH  /home/gle3  ws/p104/ddl/~$x.xlsb  Wed Jan 25 20:52:30 2012

The uid maps to user gle3 in Active Directory with POSIX extensions.

When user 'jbi' tries to open the same file in his own directory, he gets an error: x.xlsb is locked for editing by 'another user'. Obviously smb thinks that this file is already locked by user 1, but it is a physically different file. The expectation is that both files can be locked independently. The problem can be reproduced at will.

[global]
workgroup = DOMAIN
realm = DOMAIN.LOCAL
security = ads
kerberos method=secrets and keytab
template shell = /bin/ksh
winbind use default domain = true
winbind offline logon = false
debuglevel=1
password server = ws54
winbind enum groups = yes
winbind enum users = yes
winbind nested groups = yes
winbind separator = +
server string = Samba %v
interfaces = lo eth0 192.168.172.26/24
passdb backend = tdbsam
dns proxy = yes
cups options = raw
username map = /etc/samba/smbusers

[homes]
comment = Home Directories
browseable = no
writable = yes
inherit acls = yes
delete readonly = yes
create mask = 0600
directory mask = 0700
oplocks = yes
force create mode = 0600
force directory mode = 0700
valid users = %S,DOMAIN\Administrator,root,DOMAIN\!gle3
force user = %S
hide files = /desktop.ini/$RECYCLE.BIN/

The log.smbd at log level 7 gives no hints whatsoever.
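For anyone who wants to check mechanically which server-side file a lock entry refers to, the "Locked files" rows quoted above can be split into fields. The following is only a rough sketch, not part of Samba: the names LockedFile, parse_locked_row and server_path are made up for illustration, and it assumes the column order shown in the smbstatus output above, a share path without spaces, and a five-token timestamp at the end of each row.

```python
from typing import NamedTuple


class LockedFile(NamedTuple):
    """One row of the 'Locked files' section (hypothetical helper type)."""
    pid: int
    uid: int
    deny_mode: str
    access: str
    rw: str
    oplock: str
    share_path: str
    name: str
    time: str


def parse_locked_row(row: str) -> LockedFile:
    """Split one whitespace-separated 'Locked files' row into fields.

    Assumptions (not guaranteed by smbstatus): the share path contains no
    spaces, and the row ends with a timestamp of exactly five tokens,
    e.g. 'Wed Jan 25 20:52:30 2012'.
    """
    fields = row.split()
    if len(fields) < 13:
        raise ValueError("row does not look like a Locked files entry")
    pid, uid, deny_mode, access, rw, oplock, share_path = fields[:7]
    # Everything between the share path and the timestamp is the file name,
    # so names that contain spaces are kept together.
    name = " ".join(fields[7:-5])
    time = " ".join(fields[-5:])
    return LockedFile(int(pid), int(uid), deny_mode, access, rw, oplock,
                      share_path, name, time)


def server_path(entry: LockedFile) -> str:
    """Join share path and name into the path on the Samba server."""
    return entry.share_path.rstrip("/") + "/" + entry.name


if __name__ == "__main__":
    row = ("31347 10001 DENY_WRITE 0x2019f RDWR NONE /home/gle3 "
           "ws/p104/ddl/x.xlsb Wed Jan 25 20:52:30 2012")
    print(server_path(parse_locked_row(row)))
```

Applied to the two rows above, both entries resolve to files under /home/gle3, which matches the observation that the server only holds a lock in the home directory of gle3 and none under /home/jbi.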
Please upload a debug level 10 log of this problem together with a network trace. Please see http://wiki.samba.org/index.php/Capture_Packets and http://wiki.samba.org/index.php/Client_specific_Log.
I have made two log sets.

In case 1, the log files were split as follows: log.IP-process ID-username
In case 2, the log files were not split and just made as asked: log.IP

Case 1 was added since the test was done on a terminal server and the Samba Linux server had one process ID for ALL users.

In both case 1 and case 2, the following steps were executed:
* Stop Samba.
* Remove all log files.
* Start Samba.
* User 1 (gle3) navigates to h: (\\192.168.172.26\homes).
* Then folder ws.
* Then folder p104.
* Then folder ddl.
* Then opens x.xlsb.
* User 1 rests.
* User 2 (jbi) navigates to h: (\\192.168.172.26\homes).
* Then folder ws.
* Then folder p104.
* Then folder ddl.
* Then opens x.xlsb.
* User 2 gets an error 'locked by another user'.
* Stop Samba.
* Copy log files to separate folder.
Created attachment 7270: Case 1 - logging split per user in gzipped tar
Created attachment 7271: Case 2 - logging not split per user in gzipped tar
Oh, this is from a terminal server? I am 99% sure that you're sitting on a terminal server problem. To the client, \\ip\homes is just the same thing for every user. I would not be surprised if you got a problem with ts crossing user boundaries for this.

I could not nail it from the logs yet. Can I ask for one more round? Do the same thing and do a network trace. Please see http://wiki.samba.org/index.php/Capture_Packets for info on how to create useful traces.

Thanks,
Volker
(In reply to comment #5)

Hi Volker,

I assume you only need the pcap when user 2 tries to open the file? Otherwise it will probably become very big.

I have another Windows server here. Is there a way to reconstruct \\ip\homes on that platform with the share not actually pointing to the same physical location? Maybe I can help you there with establishing whether it is a Windows feature or Samba issue.

Rgds
Guido
(In reply to comment #6)
> (In reply to comment #5)
> Hi Volker, I assume you only need the pcap when user 2 tries to open the file?
> Otherwise it will probably become very big.

No problem with that. If bugzilla chokes, send it via some other way. I can live with large files.

> I have another Windows server here. Is there a way to reconstruct \\ip\homes on
> that platform with the share not actually pointing to the same physical
> location? Maybe I can help you there with establishing whether it is a Windows
> feature or Samba issue.

No, you can't do that with Windows. That's why they never test their clients in this setup. That's why I would not be surprised at all if they had bugs there.

Volker
Hmm, makes sense. I have tested two cases:

Case 1:
* user 1 has x.xlsb open through \\ip\homes\ws\p104\ddl\x.xlsb.
* user 2 tries to open \\ip\homes\ws\p104\ddl\x.xlsb from his account.
* user 2 gets the 'locked by another user' error.
* pcap.pcap uploaded shows the behaviour.

Case 2:
* user 1 has x.xlsb open through \\ip\homes\ws\p104\ddl\x.xlsb.
* user 2 tries to open \\ip\jbi\ws\p104\ddl\x.xlsb where 'jbi' is his account.
* x.xlsb opens just fine.

The workaround is not to map our h: disk to \\192.168.172.26\homes, but instead to \\192.168.172.26\ACCOUNTCODE.

It does not seem to be a Samba bug. Maybe the documentation can be improved by stating somewhere: if on a terminal server with multiple accounts and a similar file structure, do not use \\xx\homes shares but only the account code, to prevent Windows from thinking there is a locking issue? It had me puzzled for months.

As far as I can trace back, around the time we started experiencing it we migrated several servers: both a Linux upgrade and Windows 2008 R1 -> R2 for the terminal server. So if it is not a Samba bug, it might be different behaviour between Windows 2008 R1 and 2008 R2.
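To make the difference between the two cases concrete, here is a small sketch of how the client-visible UNC path and the server-side path relate. It is only an illustration under stated assumptions: the function names are hypothetical, [homes] is assumed to resolve to /home/ACCOUNT as described in this report, and the idea that the terminal server treats identical UNC paths from different users as the same file is a suspicion raised in this thread, not something confirmed by the logs.

```python
def unc_path(server: str, share: str, relative: str) -> str:
    """Build the path as the Windows client sees it (hypothetical helper)."""
    return "\\\\" + server + "\\" + share + "\\" + relative.replace("/", "\\")


def server_path(share: str, account: str, relative: str) -> str:
    """Resolve a share to the directory on the Samba server.

    Assumption taken from the report: both \\\\ip\\homes and \\\\ip\\ACCOUNT
    end up in /home/ACCOUNT for the connecting account.
    """
    if share not in ("homes", account):
        raise ValueError("share is neither 'homes' nor the account's own share")
    return "/home/" + account + "/" + relative


if __name__ == "__main__":
    server = "192.168.172.26"
    relative = "ws/p104/ddl/x.xlsb"
    for account in ("gle3", "jbi"):
        # Case 1: everyone maps h: to the generic homes share.
        print(account, unc_path(server, "homes", relative),
              server_path("homes", account, relative))
        # Case 2 (workaround): h: is mapped to the share named after the account.
        print(account, unc_path(server, account, relative),
              server_path(account, account, relative))
```

With the homes share, gle3 and jbi present the same UNC path to the client although the server-side files differ. With the per-account share the UNC paths differ as well, which is consistent with case 2 opening without an error.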
Created attachment 7272: tethereal port 139 or port 445 of issue.