Bug 644 - Very slow (~100Kb/s) gets on localhost when SO_RCVBUF set
Summary: Very slow (~100Kb/s) gets on localhost when SO_RCVBUF set
Status: RESOLVED INVALID
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: smbclient (show other bugs)
Version: 3.0.0
Hardware: All Linux
: P2 normal
Target Milestone: none
Assignee: Samba Bugzilla Account
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-10-17 15:28 UTC by Robert Hardy
Modified: 2005-11-14 09:29 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Robert Hardy 2003-10-17 15:28:27 UTC
Thanks to rsharpe for assistance in debugging.

In /etc/samba/smb.conf I set:
 socket options = SO_RCVBUF=8192 
Then I issue:
 smbclient -U nonrootuser //127.0.0.1/tmp
and then type:
 get 9MB_test.mov

It will transfer my 9MB test file in around 60seconds on a fast server with 0.00
load.

Note: put commands are unaffected even when different SO_SNDBUF options are set.
Note: This only seems to affect localhost.

I've tried this with several different machines/versions of samba. Problem
exists in at least 2.2.7x, 3.0.0, a few 3.0.0rcs and 3.0.1pre1.

Please note that every current Redhat 9 and Rawhide build of samba has
SO_RCVBUF=8192 in its socket options line. This affects a lot of servers.

My server is a heavily updated Redhat 9 box running Linux 2.4.22.

The same happens for most SO_RCVBUF settings: 65536, 8192. They were
particularlly horrible when SO_RCVBUF=1024 (not sure why that was even allowed
since it was below the minimum specified below...) 

My server has:
> cat /proc/sys/net/ipv4/tcp_rmem
4096    87380   174760

WORKAROUND: The only time a get on localhost works properly is when
SO_RCVBUF=87380 as per the default value above. I didn't test values above
87370. If you want to work around the problem comment out the SO_RCVBUF part of
your socket options line.

While SO_RCVBUF!=87380 a 0.2second delay can be seen on specific smb packets
going through localhost (as seen below.) FYI As SO_RCVBUF -> 1024 the same 0.2s
delay spread from 1 in 5 packets to 1 in 2 packets.
 
    No. Time        Delta Time  Source                Destination          
Protocol Info
    207 24.888784   0.209141    127.0.0.1             127.0.0.1             SMB
     Read AndX Response, FID: 0x102d, 16384 bytes
    208 24.888942   0.000158    127.0.0.1             127.0.0.1             TCP
     44064 > microsoft-ds [ACK] Seq=938496441 Ack=933909745 Win=12288 Len=0
    209 24.888993   0.000051    127.0.0.1             127.0.0.1             NBSS
    NBSS Continuation Message
    210 24.889042   0.000049    127.0.0.1             127.0.0.1             NBSS
    NBSS Continuation Message
    211 24.889162   0.000120    127.0.0.1             127.0.0.1             SMB
     Read AndX Request, FID: 0x102d, 15360 bytes at offset 500736
    212 24.889213   0.000051    127.0.0.1             127.0.0.1             TCP
     microsoft-ds > 44064 [ACK] Seq=933913904 Ack=938496500 Win=12002 Len=0
    213 25.098855   0.209642    127.0.0.1             127.0.0.1             SMB
     Read AndX Response, FID: 0x102d, 15360 bytes
    214 25.099009   0.000154    127.0.0.1             127.0.0.1             TCP
     44064 > microsoft-ds [ACK] Seq=938496500 Ack=933926192 Win=12288 Len=0
    215 25.099053   0.000044    127.0.0.1             127.0.0.1             NBSS
    NBSS Continuation Message
    216 25.099570   0.000517    127.0.0.1             127.0.0.1             SMB
     Read AndX Request, FID: 0x102d, 16384 bytes at offset 516096
    217 25.099673   0.000103    127.0.0.1             127.0.0.1             TCP
     microsoft-ds > 44064 [ACK] Seq=933929327 Ack=938496559 Win=12002 Len=0
    218 25.308861   0.209188    127.0.0.1             127.0.0.1             SMB
     Read AndX Response, FID: 0x102d, 16384 bytes
    219 25.309024   0.000163    127.0.0.1             127.0.0.1             TCP
     44064 > microsoft-ds [ACK] Seq=938496559 Ack=933941615 Win=12288 Len=0
    220 25.309078   0.000054    127.0.0.1             127.0.0.1             NBSS
    NBSS Continuation Message
    221 25.309129   0.000051    127.0.0.1             127.0.0.1             NBSS
    NBSS Continuation Message
    222 25.309250   0.000121    127.0.0.1             127.0.0.1             SMB
     Read AndX Request, FID: 0x102d, 16384 bytes at offset 532480
    223 25.309300   0.000050    127.0.0.1             127.0.0.1             TCP
     microsoft-ds > 44064 [ACK] Seq=933945774 Ack=938496618 Win=12002 Len=0
    224 25.518783   0.209483    127.0.0.1             127.0.0.1             SMB
     Read AndX Response, FID: 0x102d, 16384 bytes

A snipet of an strace smbclient -U user //server/tmp shows odd values (12284) on
the reads w/ SO_RCVBUF=8192 set: 
gettimeofday({1066420471, 315620}, NULL) = 0
send(3, "\0\0\0V\377SMB-\0\0\0\0\10\1\310\0\0\0\0\0\0\0\0\0\0\0"..., 90, 0) = 90
select(5, [3 4], NULL, NULL, {20, 0})   = 1 (in [3], left {20, 0})
read(3, "\0\0\0A", 4)                   = 4
read(3, "\377SMB-\0\0\0\0\210\1\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0"..., 65) = 65
open("test.mov", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0644) = 6
send(3, "\0\0\0H\377SMB2\0\0\0\0\10\1\310\0\0\0\0\0\0\0\0\0\0\0"..., 76, 0) = 76
select(5, [3 4], NULL, NULL, {20, 0})   = 1 (in [3], left {20, 0})
read(3, "\0\0\0\226", 4)                = 4
read(3, "\377SMB2\0\0\0\0\210A\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0\360"..., 150) = 150
brk(0)                                  = 0x8206000
brk(0x8216000)                          = 0x8216000
send(3, "\0\0\0007\377SMB.\0\0\0\0\10\1\310\0\0\0\0\0\0\0\0\0\0"..., 59, 0) = 59
select(5, [3 4], NULL, NULL, {20, 0})   = 1 (in [3], left {20, 0})
read(3, "\0\0@;", 4)                    = 4
read(3, "\377SMB.\0\0\0\0\210\1\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0"..., 16443) = 16443
send(3, "\0\0\0007\377SMB.\0\0\0\0\10\1\310\0\0\0\0\0\0\0\0\0\0"..., 59, 0) = 59
select(5, [3 4], NULL, NULL, {20, 0})   = 1 (in [3], left {19, 790000})
read(3, "\0\0@;", 4)                    = 4
read(3, "\377SMB.\0\0\0\0\210\1\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0"..., 16443) = 15353
read(3, "\311u\0(\300\6\222\203B^\1P\211\22\221\256\200E\212R.\213"..., 1090) = 1026
read(3, "\4\300\26\340\v}\1n\200\267\301\373\272\';\347\317\352"..., 64) = 64
send(3, "\0\0\0007\377SMB.\0\0\0\0\10\1\310\0\0\0\0\0\0\0\0\0\0"..., 59, 0) = 59
select(5, [3 4], NULL, NULL, {20, 0})   = 1 (in [3], left {19, 780000})
read(3, "\0\0@;", 4)                    = 4
read(3, "\377SMB.\0\0\0\0\210\1\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0"..., 16443) = 12284
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4159) = 4095
read(3, "^R\20\232\254\6@,\10\4\22166Yb\326l\203\1\2130YH\4\22\20"..., 64) = 64
send(3, "\0\0\0007\377SMB.\0\0\0\0\10\1\310\0\0\0\0\0\0\0\0\0\0"..., 59, 0) = 59
select(5, [3 4], NULL, NULL, {20, 0})   = 1 (in [3], left {19, 780000})
read(3, "\0\0<;", 4)                    = 4
read(3, "\377SMB.\0\0\0\0\210\1\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0"..., 15419) = 12284
read(3, "\307cc\27U\330\233X\2707\266\256\17\f\370\206\310\273{"..., 3135) = 3135
write(6, "\0\0001Zmoov\0\0001Rcmov\0\0\0\fdcomzlib\0\0001>"..., 64512) = 64512
send(3, "\0\0\0007\377SMB.\0\0\0\0\10\1\310\0\0\0\0\0\0\0\0\0\0"..., 59, 0) = 59
select(5, [3 4], NULL, NULL, {20, 0})   = 1 (in [3], left {19, 790000})
read(3, "\0\0@;", 4)                    = 4

This applies to many components not sure if smbclient is really the best choice.
Comment 1 John H Terpstra (mail address dead( 2003-10-18 13:47:43 UTC
I have documented this in the Performance Tuning chapter of the HOWTO.
Comment 2 Richard Sharpe 2003-10-18 15:48:24 UTC
I think this looks like a bug in Linux's TCP implementation. I have not seen it
in FreeBSD, although I have not tried it.

What it looks like is the SO_RCVBUF size is set lower than the MTU for the link,
and the TCP connection is not setting window size to reflect the lower SO_RCVBUF.

Also, this is not a very important case, in my view.
Comment 3 Richard Sharpe 2003-10-18 15:57:34 UTC
OK, having tried this on FreeBSD, it looks like FreeBSD has this problem as well
(at least 4.3 has).

Doing more investigation.
Comment 4 Robert Hardy 2003-10-18 16:46:29 UTC
I reduced Severity based on feedback. I'm still guessing at what is appropriate
here... It really depends if you need localhost performance which some do... I
also shorted the summary. I'm not allowed to reduce the priority. 
Comment 5 Gerald (Jerry) Carter (dead mail address) 2004-03-16 11:36:05 UTC
kernel bug and has been documented.  Closing out.
Comment 6 Gerald (Jerry) Carter (dead mail address) 2005-11-14 09:29:29 UTC
database cleanup