The Samba-Bugzilla – Bug 6202
explorer hangs with unprivileged windows users
Last modified: 2009-03-26 18:32:47 UTC
For a long time I have been experiencing strange Samba problems with this setup:
* an Ubuntu 8.04 server with Samba 3.0.28a and kernel 2.6.24-23-server, acting purely as an SMB server (i386)
* Windows XP SP3 clients with the built-in SMB client, unprivileged users only.
The problem I experience is:
When browsing the network shares (holding e.g. hundreds of photos) in thumbnail preview mode, a lot of data is transferred. When browsing as an unprivileged Windows user, every once in a while Windows Explorer hangs: the sort of hang that only the reset button can resolve. In Task Manager, network transfer drops to zero. Other running applications (Firefox) continue to work fine, and even an ssh session into the server works well, but everything associated with Windows Explorer (logging out, browsing folders, ending explorer via Task Manager...) is impossible. The task is unresponsive and cannot be killed with Task Manager; only a hard reset helps. When browsing the same shares as a Windows user with administrative rights I do not observe those problems (at least so far). It does not matter whether I start Explorer windows as a separate process or not.
On Ubuntu, the smb logfile in /var/log/samba just shows that the client disconnected at the same instant that the Explorer hang occurs (not e.g. 10 seconds later). No error message whatsoever.
Other users seem to experience the same problems (cf. my report on the Ubuntu forums: http://ubuntuforums.org/showthread.php?p=6910358#post6910358)
Does anybody have any idea of what is going on here? I am considering removing Samba and replacing it with NFS, because for me this is really a major bug.
My /etc/samba/smb.conf is (I excluded all lines that are commented out!)
workgroup = HOME_NETWORK
server string = %h server (Samba, Ubuntu)
dns proxy = no
log file = /var/log/samba/log.%m
syslog = 0
panic action = /usr/share/samba/panic-action %d
security = user
encrypt passwords = true
passdb backend = tdbsam
obey pam restrictions = yes
invalid users = root
passwd program = /usr/bin/passwd %u
passwd chat = *Enter\snew\sUNIX\spassword:* %n\n *Retype\snew\sUNIX\spassword:* %n\n *password\supdated\ssuccessfully* .
create mask = 0664
directory mask = 0775
comment = Servers photos NAS
browseable = yes
writable = yes
locking = no
path = /NAS/share-photos
public = no
valid users = @share
force group = share
Please upload a network trace of the hanging process, along with your smb.conf and a debug level 10 log of smbd. For information on how to create network sniffs see http://wiki.samba.org/index.php/Capture_Packets
Created attachment 4001
the bug reporter's smb.conf file requested in comment #1
Thanks for your reply.
* I attached my smb.conf
* Getting a network trace should be no problem.
* How do I obtain a level 10 log of smbd? I do not see an option in smb.conf to set the log level. I only have the option "syslog = 0", which is not what you want, I guess. Should I set it to 10?
"debug level = 10"
Beware that the logs get very large, so you might temporarily want to set "max log size = 0".
Thanks, a little bit of googling also pointed me to "debug level = 10".
However, I'm running into problems with this.
At the default level my Samba server delivers files at 25-36 Mbit/s (i.e. 25-36% of my 100 Mbit Ethernet connection). If I set the log level to 10, that drops to 7%, i.e. 7 Mbit/s.
At this speed I could not reproduce the error over the last hour, unfortunately!
Capturing the tcpdump did not have such an impact on the performance.
I set max log size to 1 GB, which was reached after 10 minutes, so the logfiles grow at the astonishing rate of 100 MB/min...
tcpdump of port 445 also produces some 100 MB/min (2 million packets in ~30 min)
tcpdump of port 139 produces hardly anything at all (6 packets in ~30 min)
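As a rough cross-check of the figures quoted above, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, decimal units (1 GB = 1000 MB) are assumed, and the inputs are the approximate numbers from this comment, not measured values.

```python
# Back-of-the-envelope check of the rates reported in this comment.
# Assumptions: decimal units (1 GB = 1000 MB, 1 MB = 1,000,000 bytes) and the
# approximate figures quoted in the thread. Function names are illustrative.

def mb_per_minute(total_mb: float, minutes: float) -> float:
    """Average growth rate in MB per minute."""
    return total_mb / minutes


def avg_packet_bytes(rate_mb_per_min: float, minutes: float, packets: int) -> float:
    """Average bytes per captured packet for a capture growing at a given rate."""
    return rate_mb_per_min * minutes * 1_000_000 / packets


# 1 GB log size limit reached after 10 minutes.
log_rate = mb_per_minute(1000, 10)

# Capture growing at ~100 MB/min, ~2 million packets in ~30 minutes.
packet_size = avg_packet_bytes(100, 30, 2_000_000)

if __name__ == "__main__":
    print(f"log growth: {log_rate:.0f} MB/min")
    print(f"average captured packet: {packet_size:.0f} bytes")
```

With these inputs the log rate comes out at 100 MB/min and the average captured packet at about 1500 bytes, which is in the range of full-size Ethernet frames, so the reported numbers are at least consistent with each other.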
Strange, I just noticed that the ports are now swapped, so port 139 produced 100 MB/min and port 445 only 6 packets. Are they used randomly?
I'll do my best to provide you with a logfile of such a hang at the highest log level at which I can reproduce the problem, OK? Is there a way to speed up logging at log level 10?
Created attachment 4002
log file of hang at default log-level
I have a log of such a hang at the default log level. It was obtained while copying photos off the Samba share at a speed of 50-65 Mbit/s.
I know it doesn't say much; I am currently trying at a log level of 5.
That 20-liner is the log file at debug level 5? This does not contain enough information, sorry. As you mention in your bug report, you seem to be very unhappy with Samba overall, as you want to replace it with NFS. Maybe in your situation this is the better thing.
Closing this bug report as invalid, I don't see a Samba problem here. Please re-open if you have more information.
Created attachment 4003
log file of hang at log-level-5, excerpt
I obtained a 192 MB log file of the hang at log level 5.
I shortened it to 400 kB, keeping the beginning and the end and cutting out lines in the middle that cover successful transfers.
The logfile covers the span
* from a /etc/init.d/samba start
* through a client hang
* to a /etc/init.d/samba stop
I hope this is conclusive.
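For anyone who needs to shrink a similarly large smbd log before attaching it, a minimal Python sketch of the "keep the beginning and the end" idea follows. It is blunter than the manual trimming described above: it keeps a fixed number of lines from each end and does not look at which transfers succeeded. The function names, default line counts, and any file paths passed in are assumptions for illustration.

```python
from collections import deque
from typing import Iterable, Iterator


def head_and_tail(lines: Iterable[str], head: int, tail: int) -> Iterator[str]:
    """Yield the first `head` and last `tail` lines, with a marker in between.

    The input is streamed, so only `tail` lines are held in memory at once.
    """
    kept_tail = deque(maxlen=tail)  # old lines fall off the left automatically
    seen_after_head = 0
    for index, line in enumerate(lines):
        if index < head:
            yield line
        else:
            seen_after_head += 1
            kept_tail.append(line)
    omitted = seen_after_head - len(kept_tail)
    if omitted > 0:
        yield f"[... {omitted} lines omitted ...]\n"
    yield from kept_tail


def trim_log(src_path: str, dst_path: str, head: int = 2000, tail: int = 2000) -> None:
    """Write a trimmed copy of the log at src_path to dst_path (paths are hypothetical)."""
    with open(src_path, "r", errors="replace") as src, open(dst_path, "w") as dst:
        dst.writelines(head_and_tail(src, head, tail))
```

Since the interesting part of a hang is usually the last entries before the client disconnects, a larger `tail` than `head` may be the more useful choice.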
I, the reporter, was able to obtain a level 5 log, so I opened the bug again as instructed by Volker Lendecke.
When *exactly* did the process start to hang? I can't really see anything wrong in the logs; Samba seems to run properly.
Are you running any kind of virus checker on your client? Did you try to disable it temporarily?
The process hung during a file transfer, so the hang must have occurred after the last successful transfer or during the last started file transfer. Next time I'll take note of the filename Windows Explorer reports during the hang (the copy window just remains open during the hang).
I do run a virus checker, eTrust Antivirus from Computer Associates (on the Windows client ;-) ). I'll try it without the virus checker afterwards!
Indeed, Samba itself does not hang. When I reset and restart my client, Samba on the server operates as well as before, without needing a restart on the server.
Something must cause the abort of the connection. Is there anything I could try to configure differently in the Windows TCP/IP or other networking settings? Is there something that I should avoid or set for Samba, as a rule of thumb?
I am currently trying a level-8 log. So far everything works smoothly, but I'm pretty sure that the next hang is just around the corner...
PS: it also happened with a brand new Windows install and also with my old client with a different motherboard (manufacturer, generation and Ethernet chip), so I think it is unlikely that it is hardware-related.
Created attachment 4004
log file of hang at log-level-8, excerpt
I could reproduce the error at level 8. Let's hope this log level is more conclusive.
PS: I'll give it a shot at level 10 next.
Assuming the client process hangs right before
[2009/03/22 19:41:10, 6] smbd/process.c:process_smb(1068)
then there is nothing unusual. I am 99% sure this is a client problem. The next step would be a sniff of the problem. For information on how to create proper network traces, please look at
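To locate entries such as the one quoted above in a large excerpt, the header lines can be parsed mechanically. The following Python sketch assumes that every entry header looks like the single line quoted in this comment (timestamp, debug level, source file, function, line number); other Samba versions or timestamp settings may produce a different layout. The names `LogHeader`, `parse_header`, and `headers_in_window` are made up for this example.

```python
import re
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple, Optional

# Pattern inferred from the one header line quoted in this thread, e.g.
# "[2009/03/22 19:41:10, 6] smbd/process.c:process_smb(1068)".
HEADER = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}),\s*(\d+)\] (\S+):(\w+)\((\d+)\)"
)


class LogHeader(NamedTuple):
    timestamp: datetime
    level: int
    source_file: str
    function: str
    line: int


def parse_header(text: str) -> Optional[LogHeader]:
    """Return the parsed header, or None if the line is not an entry header."""
    match = HEADER.match(text)
    if match is None:
        return None
    stamp, level, source_file, function, line = match.groups()
    return LogHeader(
        datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S"),
        int(level),
        source_file,
        function,
        int(line),
    )


def headers_in_window(
    lines: Iterable[str], start: datetime, end: datetime
) -> Iterator[LogHeader]:
    """Yield entry headers whose timestamp lies within [start, end]."""
    for text in lines:
        header = parse_header(text)
        if header is not None and start <= header.timestamp <= end:
            yield header
```

Feeding the lines of a log file to `headers_in_window` with a window of a few seconds around the noted time of the hang would list what smbd was doing right before the client went silent.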
Created attachment 4005
log file of hang at log-level-10, excerpt
Finally, a log of the error at log level 10. The hang occurred when Windows Explorer reported it was copying the file EPSN0070.jpg.
I'll try to make a tcpdump on the server next.
Thanks for looking into this, really.
With the help of Volker (many thanks indeed!) I was able to analyse Wireshark sniffs of the network traffic at the server. Volker suspected a network problem, as there were "lost segments".
As it was a client hang rather than a server hang, I replaced the NForce3 on-board Ethernet adapter of the client with a Netgear 311A PCI Ethernet card, and for some time now the issues have not appeared. I was able to transfer files reliably like never before.
So, unless further hangs occur, I think it was a network hardware bug, not a Samba bug.
Thanks for help,