Bug 9422 - large read requests cause server to issue malformed reply
Summary: large read requests cause server to issue malformed reply
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.6
Classification: Unclassified
Component: File services (show other bugs)
Version: 3.6.8
Hardware: All All
: P5 regression
Target Milestone: ---
Assignee: Jeremy Allison
QA Contact: Samba QA Contact
URL:
Keywords:
: 9356 (view as bug list)
Depends on:
Blocks:
 
Reported: 2012-11-22 15:17 UTC by Jeff Layton
Modified: 2012-12-03 19:56 UTC (History)
3 users (show)

See Also:


Attachments
pcap file showing malformed ReadAndX reply (4.36 KB, application/vnd.tcpdump.pcap)
2012-11-22 15:17 UTC, Jeff Layton
no flags Details
log while running smbd with -d10 (294.60 KB, text/plain)
2012-11-22 16:27 UTC, Jeff Layton
no flags Details
Patch (1.43 KB, patch)
2012-11-22 20:46 UTC, Volker Lendecke
jlayton: review+
Details
git-am format patch for master and 4.0.0rc.next. (1.20 KB, patch)
2012-11-27 22:59 UTC, Jeremy Allison
vl: review+
jlayton: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jeff Layton 2012-11-22 15:17:06 UTC
Created attachment 8219 [details]
pcap file showing malformed ReadAndX reply

Fadeeva Marina reported the following bug to the linux kernel bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=50841

After reproducing it and looking at traces, I'm fairly certain that this is a regression in recent versions of Samba. Specifically, I was able to reproduce it against this version in Fedora: samba-3.6.8-95.fc17.1.x86_64.

The kernel negotiates a large rsize with the server. We then issue a read request with an odd size and the server sends back a malformed reply.

Attaching a capture that shows the problem -- the malformed frame is #20.

The TCP headers all look fine, but the SMB response just consists of 4 bytes of zeroes. Apparently this is a regression, but I'm not sure when it was introduced.
Comment 1 Jeff Layton 2012-11-22 15:25:44 UTC
Bisecting the bs= value, if I issue the dd command with bs=131012, then it works fine. Larger sizes seem to hit similar problems, but if I issue one much larger (bs=135108 or so) then it works fine. Odd...
Comment 2 Jeff Layton 2012-11-22 16:27:58 UTC
Created attachment 8220 [details]
log while running smbd with -d10
Comment 3 Jeff Layton 2012-11-22 16:31:55 UTC
Excerpt from the log at around the time the read request came in:

is_locked: optimisation - exclusive oplock on file dd_file
strict_lock_default: flavour = POSIX_LOCK brl start=0 len=131013 unlocked for fnum 10909 file dd_file
streams_xattr_pread: offset=0, size=131013
read_file (dd_file): pos = 0, size = 131013, returned 131013
send_file_readX fnum=10909 max=131013 nread=131013
read_fd_with_timeout: blocking read. EOF from client.
receive_smb_raw_talloc failed for client 192.168.1.22 read error = NT_STATUS_END_OF_FILE.
setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
Security token: (NULL)
UNIX token of user 0
Primary group is 0 and contains 0 supplementary groups
change_to_root_user: now uid=(0,0) gid=(0,0)

...the NT_STATUS_END_OF_FILE error looks suspect...
Comment 4 Volker Lendecke 2012-11-22 20:46:52 UTC
Created attachment 8221 [details]
Patch

Can you try the attached patch?
Comment 5 Jeff Layton 2012-11-23 02:02:06 UTC
Thanks Volker! Yes, that patch does seem to fix it...
Comment 6 Jeff Layton 2012-11-25 11:47:04 UTC
Comment on attachment 8221 [details]
Patch

I don't know the code well enough to properly review it, but I did test it and it fixed the reported bug.
Comment 7 Jeremy Allison 2012-11-27 19:02:10 UTC
This is a blocker bug for 3.6.x. I think we also need a variant of this fix in master and 4.0.0.

Jeremy.
Comment 8 Björn Jacke 2012-11-27 22:23:46 UTC
(In reply to comment #7)
> This is a blocker bug for 3.6.x. I think we also need a variant of this fix in
> master and 4.0.0.

from bad release date experiences we agreed that there are no blocker bugs in release branches any more. If the patch is there in time it will come in, if not it will come with the next release :-)
Comment 9 Jeremy Allison 2012-11-27 22:29:08 UTC
Well this one is a data corruption regression, so it's kind of bad. Plus we have a patch.

That's why I called it a blocker :-).
Comment 10 Jeremy Allison 2012-11-27 22:59:26 UTC
Created attachment 8233 [details]
git-am format patch for master and 4.0.0rc.next.

Forward port to master and 4.0.0rc.next.
Comment 11 Jeremy Allison 2012-11-27 23:02:22 UTC
Re-assigning to Karolin to get :

https://bugzilla.samba.org/attachment.cgi?id=8221

added to 3.6.next. Once Jeff or Volker have +1'ed the patch for master/4.0.0rc.next we can put it there too.

Jeremy.
Comment 12 Jeff Layton 2012-11-28 10:58:12 UTC
Comment on attachment 8233 [details]
git-am format patch for master and 4.0.0rc.next.

Ok, I think I sort of understand the patch happening. The code was artificially masking off the bits above 0x1FFFF in places where it should not have. This seems like it'll do the right thing then.

That said, this code is really hard to follow with so many layers of macros that add pointless indirection.

smb_len, smb_setlen, and their corresponding _large variants seem like pointless indirection into other macros...
Comment 13 Karolin Seeger 2012-11-29 08:06:06 UTC
Pushed to v3-6-test and autobuild-v4-0-test.
Comment 14 Karolin Seeger 2012-11-29 10:11:17 UTC
(In reply to comment #13)
> Pushed to v3-6-test and autobuild-v4-0-test.

Pushed to v4-0-test.

Seems like it has not been pushed to master yet.
Jeremy, would you like to push it to master?
The bug report can be closed afterwards.

Thanks!
Comment 15 Karolin Seeger 2012-11-30 08:03:08 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > Pushed to v3-6-test and autobuild-v4-0-test.
> 
> Pushed to v4-0-test.
> 
> Seems like it has not been pushed to master yet.
> Jeremy, would you like to push it to master?
> The bug report can be closed afterwards.
> 
> Thanks!

Seeing patch in master (d5693d99b).
Closing out bug report.

Thanks!
Comment 16 Jeremy Allison 2012-12-03 19:56:43 UTC
*** Bug 9356 has been marked as a duplicate of this bug. ***