The Samba-Bugzilla – Bug 7233
print fails with jobs >4GB from Win7 clients
Last modified: 2010-03-30 17:30:02 UTC
Printing from jobs with sizes above 4GB failes with the newest Win7 client. The spool file grows up to 4.1GB and stops growing afterwards. No errors are reported by the windows client and the windows spooler shows full progress. log log level 2
the samba server reports the following.
[2010/03/10 12:06:04, 1] smbd/service.c:1063(make_connection_snum)
w7-dgraf (::ffff:192.168.102.97) connect to service Hold initially as user guest (uid=2009, gid=7000) (pid 20847)
[2010/03/10 12:19:17, 1] smbd/service.c:1240(close_cnum)
w7-dgraf (::ffff:192.168.102.97) closed connection to service Hold
[2010/03/10 12:19:53, 0] lib/util_sock.c:539(read_fd_with_timeout)
[2010/03/10 12:19:53, 0] lib/util_sock.c:1491(get_peer_addr_internal)
getpeername failed. Error was Transport endpoint is not connected
read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
Ups -- this is a cut and paste artefact.
Forgot to mention that the Win7 client is 64bit
I'll need more than a log level2 I'm afraid. Can you get me a wireshark trace of the Win7 client sending a <4Gb file to a Windows printserver, plus a wireshark trace of the same thing against smbd ? I don't need the full log, just the part around the 4gb mark.
Created attachment 5487 [details]
Proposed patch for 3.4.x
Oooh. I think this may be our fault and an obvious flaw.... (sorry).
Can you try this patch. It should apply cleanly to 3.4.x. If you can confirm it works I'll get this into all current branches for the next release.
(In reply to comment #3)
That didn't seem to do the trick. I'll generate a more verbose log. It'd take a while until I find the time to install wireshark on my winbox. Is the log of use without the wireshark protocol ?
So long as it's a debug level 10 around the place where the Win7 client cancels the job, yes.
Created attachment 5500 [details]
level 10 log for the >4GB bug
This log doesn't seem to cover the case I'm interested in. It stops way before the 0xFFFFFFFF offset position which is the interesting problem.
How are you able to reproduce creating a 4+Gb print job ? If you can let me know how you do this in a way I can repeat I should be able to track down and fix this.
The whole log is much larger. I just posted the part around the point where the spool file stopped growing. Didn't know how large an attachment can be. I could post the whole log (356M) or place it on my web server. Or you may tell me which values to look for so I could cut out the interesting part.
The 7.1 GB job is the output of a pdf which is rather small in file size (4MB) but large in pages with one bitmap object referenced on every second page. Unfortunately I'm not allowed to hand out the pdf. The printer driver also seems to be involved in this problem, since not all printer drivers produce this error although our printer driver works fine when connected to a win server. I'll try to give you a setup in which to reproduce this error.
I've placed a 1.5 M pdf file on http://www.oncaphillis.net/luke.pdf that generates >5GB PS when using Acobat Reader 9 (be patient it's a DSL up channel).
The printer drive we are using is the bizhub PRO 1050 3.0.28 which can be found under
So now I actually did a WireShark trace which is quite large although I've started at around 3.8 GB, If someone could tell me how to remove the smb.file_data field from the WriteRequest package it would be significantly larger.
Created attachment 5525 [details]
Wireshark screenshot for two SMB packages on the 4GB border
So I started to work with wireshark and found out that there is indeed an File Offset overflow. See screenshot. Makes me wonder how 4GB file transfer should work at all when the offset value is 4 byte only.
Thanks - that's *great* information. What I expected. The reason it works with Windows is that Windows is ignoring the file offset when it's writing to a print file. I'll make us do the same.
Give me a little time to work on this.
That's great news
Created attachment 5527 [details]
Patch for 3.5.x and master
If running 3.5.x, please try this patch. Should fix it. I'll post a follow-up patch for 3.4.x.
Created attachment 5528 [details]
Patch for 3.4.x.
Unfortunately that didn't do the trick. The spool file of a patched version 3.4.7stops growing exactly at 0x100000000 and I couldn't even get the unpatched version 3.5.1 up and running on my up-to-date fedora 12 it aborts with
#0 /home/kloska/smb351/sbin/smbd(log_stack_trace+0x2d) [0x4e4aff]
#1 /home/kloska/smb351/sbin/smbd(smb_panic+0x78) [0x4e4c4c]
#2 /home/kloska/smb351/sbin/smbd [0x4d4282]
#4 /lib/libc.so.6(abort+0x17a) [0xc8104a]
#5 /usr/lib/libtalloc.so.1 [0xbc931c]
#6 /usr/lib/libtalloc.so.1(talloc_free+0x21b) [0xbcdfcb]
#7 /usr/lib/libwbclient.so.0(wbcFreeMemory+0x21) [0xb06b21]
#8 /usr/lib/libwbclient.so.0(wbcSidToGid+0xde) [0xb0943e]
#9 /home/kloska/smb351/sbin/smbd(winbind_sid_to_gid+0x39) [0x4b7a94]
#10 /home/kloska/smb351/sbin/smbd(sid_to_gid+0x117) [0x4959f0]
#11 /home/kloska/smb351/sbin/smbd(create_local_nt_token+0x321) [0x53d9bd]
#12 /home/kloska/smb351/sbin/smbd(get_root_nt_token+0x153) [0x53df7d]
#13 /home/kloska/smb351/sbin/smbd(svcctl_init_keys+0x2e) [0x3f5118]
#14 /home/kloska/smb351/sbin/smbd(registry_init_full+0x8b) [0x8177cb]
#15 /home/kloska/smb351/sbin/smbd(main+0xa50) [0x8469d1]
#16 /lib/libc.so.6(__libc_start_main+0xe6) [0xc6aa86]
#17 /home/kloska/smb351/sbin/smbd [0x244421]
Please replace libwbclient with the just compiled version.
Are you also running with the patch in:
I think you'll need both for 3.4.x.
Oh - shame !! Installing at a non standard position and not setting LD_LIBRARY_PATH. You may now officially call me an idiot. The patched 3.5.1 seems to handle files above 4GB. I didn't apply the previous patch to the 3.4.6 -- so I'll give it a try tomorrow.
Created attachment 5534 [details]
git-am format patch for 3.5.2.
Fix tested and passed by reporter.
Created attachment 5535 [details]
git-am patch for 3.5.2.
Contains both 66b7fcc8ef6b43c219dcae43a68d861cf71b80fa and 59545276f2bd6df8ee5e75e96e707fd86ff44350 from master.
Replaces previous version that didn't contain both fixes.
Created attachment 5536 [details]
git-am patch for 3.4.x.
Fix for 3.4.x.
Re-assigning to Karolin for inclusion once reviewed.
unfortunately applying both patches to 3.4.x does not solve the problem.
Created attachment 5542 [details]
Fixed git-am patch for 3.4.8.
Sebastian, here is a corrected fix for 3.4.8, which should apply cleanly on top of the 3.4.7 code.
Sorry for the mistake in the previous patch, there was a chunk missing that didn't correctly set the offset in the SMBwrite() reply call. This patch should work for 3.4.x. Please test and confirm and thanks a *lot* for testing these fixes.
Still not fixed. Samba 3.4.7 succeeds in breaking the 4GB barrier, but it seems to corrupt the stream since the PS interpreter chokes about offending commands
that seem to be raw data. Looking at the amount of pages that it prints it looks like its around the 2GB area. Could be a faulty offset calculation with a signed number.
For 3.5.x I only checked that it succeeded in transferring more than 4GB, so I can not confirm the 3.5.x fix too. I try the 3.5.x version and come back to you.
Ok, I'm putting this on hold until we fix the source of the file corruption. I'm going to revert the "always write at EOF offset" changes to master until we know what's wrong. Can you test the patch on 3.5.x please ? I'm guessing it will also fail due to the 2GB corruption but it would help a lot to know.
What would be really helpful is a wireshark trace - only capturing the packet SMB headers so it's not too huge - that allows me to track the offsets + write_amount sent on each print write. That way I can determine the algorithm that Windows uses when the offset wraps at 32-bits.
I had a look at the 3.5.x patch and it seems to work. No data stream corruption At least none that killed the PS interpreter. I'm just pulling the faulty spool file generated under 3.4.x and compare them with the version from 3.5.x to see where in the stream the difference appears. A new filtered wire shark protocol might take some time, but the screen shot I've uploaded gives an overview. These are to packages exactly on the 4GB border. Before that the client transferred data chunks of 16583 bytes and places the last chunk exactly on the 4GB limit. There was nothing in the flags or somewhere else which (as far as my minimal knowledge about the SMB protocol goes) distinguished the packages before.
Ok I'm really puzzled. The 3.4.x patch is functionally identical to the 3.5.2 patch. Are you sure you applied the 3.4.x patch here ?:
All writes to the print spool file should be being done at EOF with this patch (which is what the 3.5.2 patch does). Any chance you can help me track down the difference in the 3.5.x and 3.4.x versions because right now I just can't see it ?
Ok -- I've double-checked and -- yes -- I did apply the right patch. I dived a little bit into the 3.4.7 code and found out that the windows client IS actually seeking within the print stream. That seems to happen more often if a file to print is already available e.g. printing via
copy file \\server\printer
but it also happens from the print spooler. You may easily validate it by printing out the startpos in reply_write. I don't have the slightest idea why it seemed to work with the 3.5.x patch. I attach a patch against a clean 3.4.7 which also includes your stuff. I've only worked on reply_write not reply_writeraw. The basic idea is that although the client may seek around it may only do that within the current 4GB chunk. So we check via fstat64 how large the current file is and interprede startpos as an offset to the current 4GB chunk. This is all very dirty code since I'm not aware about the samba coding style. But you may use it for tseting. It would be nice if someone else could validate this approach.
Created attachment 5558 [details]
patch against samba 3.4.7 that tries to solve the 4GB problem via offset calculation relative to the current 4GB chunk
right now I've only tested it with Vista and successfully transferred 15GB -- no corruption at all. I'll have a look at Win7 tomorrow
Hmmm. This is a really interesting fix.... I'll look into generating a new patch containing this. Thanks a *lot* for your help !
Created attachment 5563 [details]
git-am fix for 3.5.2.
Latest version of this patch for 3.5.x. Sebastian can you please test ? I'll also create a 3.4.x version as well.
Thanks for all the help on this.
Created attachment 5564 [details]
git-am fix for 3.4.x.
Latest version of this patch for 3.4.x. Sebastian can you please test ?
Thanks for all the help on this.
FYI. I have an idea on how to create an smbtorture test so we can probe Windows behavior on this.
both patches seem to work find
Ok, pushed to master. I have a smbtorture test that does not reproduce this with a regular <4gb file - not a print spool file. So I think it's correct to do this only with print files.
I'm not going to recommend this for 3.4.x or 3.5.x at the moment. I want to let it cook a bit in master before recommending applying these patches. Hope that's ok for you guys (you can always hand patch for the time being).