After upgrading from 3.0.20b to 3.0.21, clients (win98/win2k) hang when they simultaneously access the same share and files. There was no change in the configure options between building 20b and 21, and there was no change in smb.conf between the updates. The share has a very simple config:

    [reu]
        comment = XXXXXXXX
        path = /u/samba/pc/reu
        public = no
        writeable = yes
        printable = no
        create mode = 660
        directory mode = 770
        valid users = @reu
        force group = reu

The first client (win98/win2k) connecting can work as expected. Every additional client hangs when the application tries to open files in the same share. Accessing the directory listing with Windows Explorer is still possible, so general access to the share seems OK, but accessing files with locks may be the problem. As this is a production server, I'm unfortunately unable to do any deeper debugging. I downgraded to 20b and all is OK now.

My configure options for building the samba executables:

    ./configure --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc/samba3 --with-privatedir=/etc/samba3 \
        --libdir=/etc/samba3 --with-libdir=/etc/samba3 \
        --with-logfilebase=/var/log/samba3 --with-lockdir=/var/lock \
        --with-piddir=/var/run --with-automount --with-msdfs --with-vfs --enable-cups \
        --with-acl-support --with-quotas --with-libsmbclient

Running on SuSE Linux 9.0, kernel 2.4.21-303-smp, gcc 3.3.1.
That's a bit too little information. To fix this, we do need logfiles. Is it possible to set up a separate machine with 3.0.21 and provide debug level 10 logfiles from the failing case? Thanks, Volker
I'm sorry, but I can't do that. The applications that cause the problem (and are the only applications we are still running) are too complex to install (and cannot be licensed for other client machines). The amount of data needed is too big to transfer, and we no longer have a second machine to test with. Maybe I can reactivate 3.0.21 for about an hour, test with two clients and get logfiles. But beware: from previous debug sessions with samba I know that the domain logon by itself generates log entries in the high MB range. As our mail system limits file sizes to 3MB, we may need to find another way to submit the logs.
If you put an 'include = /etc/samba/smb.conf.%I' into your main smb.conf, you can increase the debug level for individual clients. If you create /etc/samba/smb.conf.192.168.1.1, for example, with the content

    debug level = 10
    log file = /var/log/samba/log.%m
    debug hires timestamp = yes

you get individual logfiles per client and don't overload the var directory with everything else at debug level 10.
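A minimal sketch of how such a per-client file could be generated, assuming the settings quoted above. The helper name `write_client_debug_conf` and its arguments are hypothetical and not part of Samba; in practice the target directory would be the one named in the 'include' line.

```python
from pathlib import Path

# Settings quoted in the comment above. In smb.conf, %m expands to the
# client's NetBIOS name and %I (used in the include line) to its IP address.
DEBUG_SETTINGS = (
    "debug level = 10\n"
    "log file = /var/log/samba/log.%m\n"
    "debug hires timestamp = yes\n"
)


def write_client_debug_conf(conf_dir, client_ip):
    """Write smb.conf.<client_ip> into conf_dir and return its path.

    Hypothetical helper for illustration only.
    """
    path = Path(conf_dir) / f"smb.conf.{client_ip}"
    path.write_text(DEBUG_SETTINGS)
    return path
```

Calling `write_client_debug_conf("/etc/samba", "192.168.1.1")` would produce the file named in the example above. Removing the file again should return that client to the global log level.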
Every client currently has its own logfile. The problem is that with debuglevel=5 the domain logon by itself produces a logfile > 6MB, and nothing but the system startup/logon is debugged at that point. Running the app concerned, which seems to open/read more than 200 files at startup, produces a logfile of ~15 to ~20 MB BEFORE the error occurs. If I'm not able to reduce the files read at startup, you will get a logfile of >40 MB per client.
Yes, and where's the problem? :-) I've had to walk through logfiles approaching the Gigabyte limit. The only problem there is that searching for strings needs a powerful CPU..... :-) It can't be disk space, and bandwidth is cheap these days as well. BTW, bzip2 can *really* shrink samba log files. Volker
Just wanted to say I had the same problem with previous 3.0.21 releases (winXP SP2 clients with all updates and domain member of the samba server). Switching back to 3.0.20b solved the problem. I assume it has something to do with the new oplock system in 3.0.21.
Marc, did you report this? This is the first I'm hearing of it. We fixed one such issue prior to 3.0.21. Can anyone get a backtrace or an strace?
Got some logfiles with debuglevel 10.

Scenario:
2 x Windows98SE
samba 3.0.21, from source tarball, on SuSE 9.0, kernel 2.4.21-303-smp
An application named "RA-Micro", mostly written in Visual Basic, accessing files placed on a shared samba network drive:

    [ra_micro]
        comment = RA-Micro Verzeichnis
        path = /u/samba/pc/ra_micro
        public = no
        writeable = yes
        printable = no
        create mode = 660
        directory mode = 770
        valid users = @inkasso
        force group = inkasso

Remarks: This is not the only app that causes the problem, but it is the only one I can handle (user, password, etc.). Other apps sharing files read/write on a network drive are also affected.

Test run: First I started the PC named PC055. The system logged on to the samba domain controller and ran all desired apps, including RA-Micro. Then, after successful startup of PC055, I started PC058. PC058 logged in to the samba domain controller like PC055 before. Some apps started up, until RA-Micro had to be launched. Then nothing more happened. You can launch Windows Explorer and work (slowly) with the open apps, but RA-Micro doesn't start up. In the task manager you can see that a process is running, but nothing appears on the screen. Then I closed RA-Micro on PC055 (the first one that started up), and a few seconds later RA-Micro on PC058 started up. This is what you can see in the logfiles.

For security reasons: the passwords were changed before the debugging and reset to the originals after finishing :-)

I agree, it might be a bug related to the (op)locks. It doesn't matter whether the OS is Win98 or Win2k. Win2k isn't able to run RA-Micro properly on samba: every file access produces the complaint "File not found. Abort - Retry". With "Retry" you can access the file(s). I didn't trace this down, as it would require too much user interaction on production data.
Hmm, I tried to submit the logs as an attachment, but they are too big. To whom should I send them?
Easiest would be if you put them on your webserver and sent a URL. If that does not work, send them to vl at sernet dot de, I'll forward them appropriately. But please first bzip2 -9 them. Thanks, Volker
I bzipped them, but each log is a few bytes over 1MB, so I sent them to you directly. I preferred this to be sure not to make some internal data too public.
Created attachment 1631: loglevel 5 log w2k3

Logfile containing the log of opening a word document, test.doc, which was already opened on another computer (its_lt_01). This log also contains the opening of test6.txt, which could be saved by both computers while it was open on both computers! (with notepad)
Created attachment 1632: loglevel 5 log its_lt_01

Logfile containing the log of an opened word document, test.doc, which is opened at a later stage by another computer (its-2k3). This log also contains the opening of test6.txt, which could be saved by both computers while it was open on both computers! (with notepad)
Okay, I tested the released 3.0.21 version again and created the attached log files. The difference from my previous problem is that with notepad I can open .txt files on more than 1 computer at the same time, with no timeout as happens with word documents. However, I can save the same .txt file on both computers. That means that somehow there is no write lock.
This looks wrong:

    [2005/12/15 19:25:09, 0] smbd/oplock.c:request_oplock_break(1052)
      request_oplock_break: failed when sending a oplock break message to pid 15390 on port 0 for dev = 1605, inode = 26657445, file_id = 1
      Error was Invalid argument

This is not 3.0.21 code. smbd/oplock.c in 3.0.21 has only 725 lines; the line 1052 you refer to is valid for 3.0.20 code. Can you reproduce this with a running 3.0.21 smbd? Also, debug level 5 is not enough, please provide debug level 10 logs. Thanks, Volker
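The check above (a log header citing a line number beyond the end of the source file) could be sketched roughly as follows. The header pattern is inferred only from the excerpt quoted in this comment, and the function names and the `source_lengths` mapping are hypothetical; the 725-line figure is the one stated above.

```python
import re

# Header layout as seen in the quoted excerpt:
# [date time, level] source_file:function(line)
HEADER_RE = re.compile(
    r"\[(?P<stamp>[^,\]]+),\s*(?P<level>\d+)\]\s+"
    r"(?P<file>\S+?):(?P<func>\w+)\((?P<line>\d+)\)"
)


def parse_header(text):
    """Return (source_file, function, line_number), or None if no header matches."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.group("file"), match.group("func"), int(match.group("line"))


def line_beyond_source(text, source_lengths):
    """True if the header cites a line past the known length of its source file.

    source_lengths maps a source file name to its line count in the release
    being checked, e.g. {"smbd/oplock.c": 725} for 3.0.21 as stated above.
    """
    parsed = parse_header(text)
    if parsed is None:
        return False
    source_file, _func, line = parsed
    length = source_lengths.get(source_file)
    return length is not None and line > length
```

A header flagged this way cannot have been written by the release whose line counts were supplied, which is how a log from a different smbd version can be spotted.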
Volker, the beginning of log its_lt_01 is from 3.0.20b. The 3.0.21 part of the log probably starts around 2005/12/23 13:05. Tonight I'll try to make a loglevel 10 log.
If you do, please stop smbd, make sure you have 3.0.21 installed, delete all logfiles, start smbd. Then do (and please describe) the steps you do to reproduce the problem. BTW, please also set 'max log size = 0' during your tests, your logfiles seem truncated. And bzip2 -9 them :-) Thanks, Volker
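For the compression step, a small sketch using Python's standard `bz2` module at the same level as `bzip2 -9`. The helper name `compress_log` is hypothetical. Unlike the `bzip2` command, which replaces the input file by default, this sketch leaves the original log in place.

```python
import bz2
import shutil
from pathlib import Path


def compress_log(log_path):
    """Write <log_path>.bz2 next to the log at compression level 9.

    Hypothetical helper; returns the path of the compressed file and
    keeps the uncompressed original.
    """
    src = Path(log_path)
    dst = src.with_name(src.name + ".bz2")
    # Stream the data so that very large logfiles are not loaded into memory.
    with open(src, "rb") as fin, bz2.open(dst, "wb", compresslevel=9) as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Samba logs are highly repetitive text, so the compressed file is usually a small fraction of the original size.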
Created attachment 1633: Loglevel 10

Okay, here are the loglevel 10 log files. What I did is the following:

* removed all old logfiles
* started samba 3.0.21
* synchronised the its_lt_01 laptop in order to access the samba share (/data/temp) where the test files are located
* Opened test.doc on its_lt_01 (XP SP2, domain member of ITS domain on samba 3.0.21)
* Opened test.doc on its-2k3 (win2k3 server which is a domain member of the ITS domain on samba 3.0.21)
* Opened test6.txt on its_lt_01
* Opened test6.txt on its-2k3
* Changed test6.txt on its-2k3 and saved (shouldn't be possible)
* Closed test6.txt and reopened it on its_lt_01 and checked the change made on its-2k3 (it actually changed, which is weird because its_lt_01 opened it first)
* Waited till word timed out on its-2k3
* shut down samba

Both its_lt_01 and its-2k3 were logged in to the ITS domain while samba 3.0.20b was still running. I didn't let them log in again cleanly on 3.0.21 because profiles and synchronisation generate a lot of noise in the logfiles. They simply reconnected when samba 3.0.21 was started.
Created attachment 1634: become_root pair

Could you try the attached patch? Thanks, Volker
Created attachment 1635: More become_root/unbecome_root pairs necessary

There are more places where this kind of patch is necessary. Please try this new one. Thanks, Volker
Volker, can you please apply these fixes to the HEAD and 3.0 SVN trees. I'd like to see everything in place. Thanks, Jeremy.
Volker, on 27 December I will try the patch and report back whether it solves the issue.
We have many "Trying to delay for oplocks twice" in the logs and concurrent accesses on files fail for all but the first one with "network error" on the client. Maybe related to this one, I'll try the patch.
Volker, the patch fixed the issue I had with word documents. The .txt documents can still be opened on multiple computers without being locked. I don't know what the samba behaviour was before 3.0.21, and it's not really a problem for me. Thanks for fixing the issue.
Please add the patch to http://usX.samba.org/samba/patches/ der tom
Patch applied to all branches now. Will be in 3.0.21a.