9030 – smbd hangs when DOS client writing large file

Bug 9030 - smbd hangs when DOS client writing large file

Summary: smbd hangs when DOS client writing large file

Status:	RESOLVED DUPLICATE of bug 5138

Alias:	None

Product:	Samba 3.5
Classification:	Unclassified
Component:	File services (show other bugs)
Version:	3.5.6
Hardware:	x86 Linux

Importance:	P5 normal
Target Milestone:	---
Assignee:	Björn Jacke
QA Contact:	Samba QA Contact

URL:
Keywords:

Depends on:
Blocks:

Reported:	2012-07-06 12:13 UTC by Dmitry Irtegov
Modified:	2015-05-18 12:15 UTC (History)
CC List:	1 user (show)

See Also:

Attachments
Last 1000 lines of tcpdump before hang taken on server (92.42 KB, application/octet-stream) 2012-07-11 04:11 UTC, Dmitry Irtegov	no flags	Details
Another tcpdump (this time line buffered) (92.46 KB, application/octet-stream) 2012-07-11 05:05 UTC, Dmitry Irtegov	no flags	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Dmitry Irtegov 2012-07-06 12:13:21 UTC

I have Debian Sarge running samba 3.5.6 server.
When I try to save disk image using Ghost 2003 client, there are random hangs, after writing 200-500Mb of image file (actual volume of data varies from hang to hang).
I never noticed similar problems with Windows and Linux clients, so maybe this is related to DOS client.  I use Microsoft client for MS-DOS.  Also, I never had similar problems with this client with earlier releases of Samba.
The problem is definitely server-related, because smbd process serving that client hangs and can be killed by kill -9 or kill -SIGSEGV from all methods I tried.  At least, /etc/init.d/samba stop and kill %pid do not work.  

Last lines of smbd log file with loglevel 3 look like this:
----- begin cut ----
[2012/07/06 18:29:28.527551,  3] smbd/process.c:1294(switch_message)
  switch message SMBwritebraw (pid 21750) conn 0xb89f7fa8
[2012/07/06 18:29:28.527649,  3] smbd/reply.c:3822(reply_writebraw)
  reply_writebraw: initial write fnum=11926 start=102924288 num=16584 wrote=16584 sync=0
[2012/07/06 18:29:28.529307,  3] smbd/reply.c:3924(reply_writebraw)
  reply_writebraw: secondart write fnum=11926 start=102924288 num=16184 wrote=32768
[2012/07/06 18:29:28.531419,  3] smbd/process.c:1485(process_smb)
  Transaction 78456 of length 16648 (0 toread)
[2012/07/06 18:29:28.531488,  3] smbd/process.c:1294(switch_message)
  switch message SMBwritebraw (pid 21750) conn 0xb89f7fa8
[2012/07/06 18:29:28.531583,  3] smbd/reply.c:3822(reply_writebraw)
  reply_writebraw: initial write fnum=11926 start=102957056 num=16584 wrote=16584 sync=0
[2012/07/06 18:29:28.533259,  3] smbd/reply.c:3924(reply_writebraw)
  reply_writebraw: secondart write fnum=11926 start=102957056 num=16184 wrote=32768
---- end cut -----

when I try to strace hung process, strace shows that it is blocked in read syscall, but cannot parse read parameters, which makes me think there is some kind of memory corruption.  It looks like this:
--- begin cut ----
root@ws8:/var/log/samba# strace -dip 21750
Process 21750 attached - interrupt to quit
 [wait(0x137f) = 21750]
pid 21750 stopped, [SIGSTOP]
 [wait(0x57f) = 21750]
pid 21750 stopped, [SIGTRAP]
[b74021be] read(11, 
---- end cut ----
according to /proc, file handle 11 of process 21750 is a socket.

I tried to create core by killing the process with SIGSEGV, but couldn't find core file (I searched in process CWD according to /proc and in /var/log/samba/cores).

Comment 1 Björn Jacke 2012-07-06 12:30:08 UTC

just out of curiosity: can you send the output of lspci, especially the output of your network card?

Comment 2 Dmitry Irtegov 2012-07-06 13:01:31 UTC

(In reply to comment #1)
> just out of curiosity: can you send the output of lspci, especially the output
> of your network card?

Client or server?
Server:
07:00.0 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper) (rev 01)
	Subsystem: Fujitsu Technology Solutions Device 10a8
	Flags: bus master, fast devsel, latency 0, IRQ 82
	Memory at fdfe0000 (32-bit, non-prefetchable) [size=128K]
	I/O ports at ef80 [size=32]
	Capabilities: [c8] Power Management version 2
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [e0] Express Endpoint, MSI 00
	Kernel driver in use: e1000e
	Kernel modules: e1000e

Comment 3 Dmitry Irtegov 2012-07-06 13:03:27 UTC

(In reply to comment #1)
> just out of curiosity: can you send the output of lspci, especially the output
> of your network card?

Client - it is powered down and I'm not in office now.  It is realtek RTL8112L PCIe Gigabit LAN, but it is connected to 100mbit port switch.

Comment 4 Björn Jacke 2012-07-06 13:08:08 UTC

I mean network card on server side

Comment 5 Björn Jacke 2012-07-06 13:11:40 UTC

ah, I didn't see the previous comment. You might try disabling the tcp checksum offloading of your (server) card. Can you please try and see if you still the the hangs?

Comment 6 Björn Jacke 2012-07-06 13:12:53 UTC

see https://bugzilla.kernel.org/show_bug.cgi?id=15384 for background information.

Comment 7 Dmitry Irtegov 2012-07-06 13:26:58 UTC

(In reply to comment #5)
> ah, I didn't see the previous comment. You might try disabling the tcp checksum
> offloading of your (server) card. Can you please try and see if you still the
> the hangs?
------- begin cut ----------------
[root@vzhn ~]# ethtool -k eth0
Offload parameters for eth0:
Cannot get device tx csum settings: Operation not supported
Cannot get device scatter-gather settings: Operation not supported
Cannot get device tcp segmentation offload settings: Operation not supported
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
generic-receive-offload: off
------------- end cut ----------
Hmm, spurious packet drops can explain the behavior.  
Unifortunately, I do not plan to be in office till Monday, so I could not try your solution before July 9.  Thanks anyway.

Comment 8 Dmitry Irtegov 2012-07-09 03:50:04 UTC

Disabling rx offloading on server ethernet didn't help.
smbd process still hangs with same symptomatics.

Comment 9 Björn Jacke 2012-07-09 12:05:42 UTC

I would be surprised if that is not your issue. tx checksumming on the server side is the issue btw but you already wrote in comment #7 that tx checksum offloading is (reported) to be off. Please make a tcpdump and send attach a pcap of the last 1000 packets before the connection hangs. If possible make the sniff on a 3rd machine using trunking port or a HUB so that we see the real network traffic. If this is too much work, plug in a non-Intel card in the server and see if the problem persists :-)

Comment 10 Björn Jacke 2012-07-09 12:08:18 UTC

"Cannot get device tx csum settings: Operation not supported" might be the reason why it is supported to be "off". It is "on", definetely :-)

Comment 11 Dmitry Irtegov 2012-07-11 04:11:26 UTC

Created attachment 7693 [details]
Last 1000 lines of tcpdump before hang taken on server

10.4.16.10 is server, 10.4.16.252 is client.  You can see that server is not sending ACK for last data segment.

Comment 12 Dmitry Irtegov 2012-07-11 04:28:49 UTC

(In reply to comment #11)
> Created attachment 7693 [details]
> Last 1000 lines of tcpdump before hang taken on server
> 
> 10.4.16.10 is server, 10.4.16.252 is client.  You can see that server is not
> sending ACK for last data segment.

(In reply to comment #9)
> I would be surprised if that is not your issue. tx checksumming on the server
> side is the issue btw but you already wrote in comment #7 that tx checksum
> offloading is (reported) to be off. Please make a tcpdump and send attach a
> pcap of the last 1000 packets before the connection hangs. If possible make the
> sniff on a 3rd machine using trunking port or a HUB so that we see the real
> network traffic. If this is too much work, plug in a non-Intel card in the
> server and see if the problem persists :-)

Ugh, sorry, I was changing settings on wrong NIC :(
Right NIC is eth1 and I tried eth0, and yes, eth1 has both tx and rx offloading turned on.

Also, I created tcpdump (see attachment).  It was created from the server before I turned offloading off, not from promiscuous port, but it seems not to support your theory: server does not even try to send ACK to last data packet.  If it was a checksum issue, we probably should see the attempt to send ACK, right?

Now I will try with tx and rx offloading turned off.

Comment 13 Dmitry Irtegov 2012-07-11 05:05:20 UTC

Created attachment 7695 [details]
Another tcpdump (this time line buffered)

TCPDUMP of network exchange between client and server, last 1000 lines before hang.   This time output is line buffered, so last packets aren't lost when I kill tcpdump.
This dump was taken with all offloading on the NIC turned off:
--------- begin cut -----------
[root@vzhn images]# ethtool -k eth1
Offload parameters for eth1:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
generic-receive-offload: off
--------- end cut -------------

Comment 14 Dmitry Irtegov 2012-07-11 05:16:36 UTC

(In reply to comment #12)
> (In reply to comment #11)
> > Created attachment 7693 [details] [details]
> > Last 1000 lines of tcpdump before hang taken on server
> > 
> > 10.4.16.10 is server, 10.4.16.252 is client.  You can see that server is not
> > sending ACK for last data segment.
> 
> (In reply to comment #9)
> > I would be surprised if that is not your issue. tx checksumming on the server
> > side is the issue btw but you already wrote in comment #7 that tx checksum
> > offloading is (reported) to be off. Please make a tcpdump and send attach a
> > pcap of the last 1000 packets before the connection hangs. If possible make the
> > sniff on a 3rd machine using trunking port or a HUB so that we see the real
> > network traffic. If this is too much work, plug in a non-Intel card in the
> > server and see if the problem persists :-)

Sorry again, first tcpdump was broken, I forgot to force line buffering of the output and last packets were lost.  Second tcpdump is done with line buffering, and now I see I was wrong about server not sending ACK,

I turned off all offloading on the Ethernet card and it does not seem to help.  It hangs again with the exactly same symptoms. 

I think how to arrange testing with another NIC, it is not easy, server and me are busy with other tasks.  Doing TCPDUMP from promiscuous port also is not easy for same reasons.  

Also I've created disk image using Clonezilla (Linux client), and it worked perfectly.  It seems to confirm your theory about wrong TCP checksum because only DOS TCP is affected.  But I've turned off all kinds of offloading and it does not help.

Comment 15 Volker Lendecke 2012-07-11 11:55:58 UTC

Can you try

read raw = no
write raw = no

in your server side smb.conf?

Comment 16 Dmitry Irtegov 2012-07-12 02:59:43 UTC

(In reply to comment #15)
> Can you try
> 
> read raw = no
> write raw = no
> 
> in your server side smb.conf?

I tried.  Nothing changed.  It hangs again, strace shows it is blocked in read from socket.  Do I need to produce verbose samba log and/or tcpdump?

Comment 17 Dmitry Irtegov 2012-07-12 03:06:49 UTC

> I tried.  Nothing changed.  It hangs again, strace shows it is blocked in read
> from socket.  Do I need to produce verbose samba log and/or tcpdump?

What is really weird, DOS client unhangs in minute or two; GHOST displays error message and I can quit to DOS, but network share is not accessible (I see drive letter but cannot DIR it or open files on it).  But smbd process does not unhang, which makes me think there is some kind of problem in how smbd handles timeouts.  
I.e even if there is some problem with my hardware or TCP stack, there is some problem in smbd anyway.

Comment 18 Dmitry Irtegov 2012-07-12 03:27:28 UTC

This is end of strace output with read/write raw=no:

------------------ begin cut -----------------
10:20:12.227445 [b7418348] select(12, [7 9 11], [], NULL, {5, 43683}) = 1 (in [11], left {5, 43000})
10:20:12.229162 [b73db691] gettimeofday({1342063212, 229185}, NULL) = 0
10:20:12.229216 [b74101be] read(11, "\0\0@0", 4) = 4
10:20:12.229272 [b74101be] read(11, "\377SMB\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0D:d\0\301:"..., 16432) = 2896
10:20:12.229346 [b74101be] read(11, "\237`q\344s\371'Wf/0\367\205b%v\311\303\305\361\301\341\226\315M\222\344U,yY2"..., 13536) = 1450
10:20:12.229415 [b74101be] read(11, "j\326e\262Qa\202\251J\0\325/\326/!\362\25\376\6\272\330\365QN\214\10\327\247\213b\351 "..., 12086) = 5800
10:20:12.229881 [b74101be] read(11, "\20?\320$\353R\202\247n\274\2728\35V\245\271\270\36{-\37\3535)\274\313n[, \310\231"..., 6286) = 2900
10:20:12.230199 [b74101be] read(11, "\334}\314 \252\346y\325=2\360\210\260\324\334E\251\351>\217\214\3371\343O\r\34\377\3237\312\370"..., 3386) = 1450
10:20:12.230276 [b74101be] read(11, "\312j\206\243]\245k\320\337\1\304\6^E\353\253\342]\272\375\t|\255iV\367>\200^!\325\304"..., 1936) = 1936
10:20:12.230592 [b73f7171] pwrite64(40, "p\237qG\7\357\320\310\300i\334Q\272\311\270\343\217\224\254*2*\304\215\350*\354\5\326\246\265v"..., 16384, 439238656) = 16384
10:20:12.230751 [b74174c6] writev(11, [{"\0\0\0%\377SMB\v\0\0\0\0\200\3\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0D:"..., 41}], 1) = 41
10:20:12.230851 [b73db691] gettimeofday({1342063212, 230874}, NULL) = 0
10:20:12.230904 [b73db691] gettimeofday({1342063212, 230923}, NULL) = 0
10:20:12.230957 [b7418348] select(12, [7 9 11], [], NULL, {5, 40171}) = 1 (in [11], left {5, 39000})
10:20:12.232557 [b73db691] gettimeofday({1342063212, 232579}, NULL) = 0
10:20:12.232610 [b74101be] read(11, "\0\0@0", 4) = 4
10:20:12.232665 [b74101be] read(11, "\377SMB\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0D:d\0\341:"..., 16432) = 2896
10:20:12.232740 [b74101be] read(11, "\260.?lX\17\235\254\207\205h\213j\3L\7\2500\275 \213\361o\217\7\224.af#\241\33"..., 13536) = 1450
10:20:12.232811 [b74101be] read(11, "@W\257\270]P\3747z\365\"Mo\20\206 '\22G\347\3464\270?\364\25a\202\347\207#\24"..., 12086) = 1450
10:20:12.233142 [b74101be] read(11, 
----------------- end cut -----------------------------------

Comment 19 Dmitry Irtegov 2012-07-12 05:15:53 UTC

Another information: this is probably related to client hardware or NIC driver.   I tried to reproduce the issue booting client in virtual machine (I thoght it is simplest method I could get a complete tcpdump from outside of the boxes), and the issue is not reproduced: to this moment client managed to write ~2Gb of image file without hanging, this is much bigger that it could write on physical machine.

Difference is, physical client uses Realtek 8112L.  In virtual box I've found only emulated Intel NICs, I used Intel PRO/1000 MT desktop (82540EM).

Comment 20 Björn Jacke 2012-07-26 10:01:26 UTC

the test with the virtual nick is invalid because the vmware implementation is not "bug compatible" with the original intel things. feel free to reopen if you can show a network sniff from the non virtual setup from a trunking port or hub.

using intel cards with dos clients is not a good idea because intel does not fix their firmware even though they know the issue and they also don't add workarounds to the driver code like disabling the checksum offloading for the broken cards.

*** This bug has been marked as a duplicate of bug 5183 ***

Comment 21 Björn Jacke 2012-07-26 10:02:04 UTC


*** This bug has been marked as a duplicate of bug 5138 ***

Comment 22 Björn Jacke 2015-05-18 12:15:57 UTC

*** Bug 11275 has been marked as a duplicate of this bug. ***