I have Samba 4.5.7 running on Ubuntu 14.04 x64 print servers, with Windows 7 and Windows 10 being their printing clients. There seem to be an issue with Windows spooling a print job to Samba. The actual spooling process is extremely slow: 50kbps or so. Due to that, 10Mb PostScript job takes 5+ minutes to print. At the same time, if I simply copy a 10Mb file from Windows to Samba's print$ share, it gets copied right away, instantly. Debugging it with log level 10 I noticed that api_rpcTNP complains about every segment of the spooled document: rpc input buffer underflow (parse error?) Looking at the log, I see that Samba is receiving the file in 256Kb segments, and complains about each segment. For some reason, first segment is only 78Kb. For my 15Mb testing print job it made ~60 segments, and there is parse error after each one of them. According to the source code, the error would appear when pkt->frag_length is less than p->in_data.data.length. I'm not a developer and not sure, but it feels like Windows is passing wrong size of the packet fragment? My smb.conf limits max protocol version to SMB2. Here is the smallest log excerpt that I can paste: [2017/03/22 11:57:03.536591, 5, pid=21933, effective(54604, 5000), real(54604, 0), class=rpc_srv] ../source3/rpc_server/srv_pipe.c:1404(api_pipe_request) Requested spoolss rpc service [2017/03/22 11:57:03.536607, 4, pid=21933, effective(54604, 5000), real(54604, 0), class=rpc_srv] ../source3/rpc_server/srv_pipe.c:1429(api_rpcTNP) api_rpcTNP: spoolss op 0x13 - api_rpcTNP: rpc command: SPOOLSS_WRITEPRINTER [2017/03/22 11:57:03.536627, 6, pid=21933, effective(54604, 5000), real(54604, 0), class=rpc_srv] ../source3/rpc_server/srv_pipe.c:1469(api_rpcTNP) api_rpc_cmds[19].fn == 0x7f8d3e1f00f0 [2017/03/22 11:57:03.536679, 1, pid=21933, effective(54604, 5000), real(54604, 0)] ../librpc/ndr/ndr.c:450(ndr_print_function_debug) spoolss_WritePrinter: struct spoolss_WritePrinter in: struct spoolss_WritePrinter handle : * handle: struct policy_handle handle_type : 0x00000000 (0) uuid : 00000010-0000-0000-d258-fdc8ad550000 data : DATA_BLOB length=75949 [0000] 25 21 50 53 2D 41 64 6F 62 65 2D 33 2E 30 0A 25 %!PS-Ado be-3.0.% [0010] 63 75 70 73 4A 6F 62 54 69 63 6B 65 74 3A 20 6A cupsJobT icket: j [0020] 6F 62 2D 68 6F 6C 64 2D 75 6E 74 69 6C 3D 6E 6F ob-hold- until=no <snip 0x1280 lines> [12870] 31 20 5D 20 64 65 66 20 0D 0A 25 25 45 6E 64 53 1 ] def ..%%EndS [12880] 65 74 75 70 0D 0A 0D 0A 75 73 65 72 64 69 63 74 etup.... userdict [12890] 20 62 65 67 69 6E 20 2F 65 68 73 61 76 65 20 73 begin / ehsave s [128A0] 61 76 65 20 64 65 66 20 65 6E 64 0D 0A ave def end.. _data_size : 0x000128ad (75949) [2017/03/22 11:57:03.734747, 6, pid=21933, effective(54604, 5000), real(54604, 0), class=rpc_srv] ../source3/rpc_server/rpc_handles.c:339(find_policy_by_hnd_internal) Found policy hnd[1] [0000] 00 00 00 00 10 00 00 00 00 00 00 00 D2 58 FD C8 ........ .....X.. [0010] AD 55 00 00 .U.. [2017/03/22 11:57:03.734802, 6, pid=21933, effective(54604, 5000), real(54604, 0), class=rpc_srv] ../source3/rpc_server/rpc_handles.c:339(find_policy_by_hnd_internal) Found policy hnd[1] [0000] 00 00 00 00 10 00 00 00 00 00 00 00 D2 58 FD C8 ........ .....X.. [0010] AD 55 00 00 .U.. [2017/03/22 11:57:03.734859, 4, pid=21933, effective(54604, 5000), real(54604, 0), class=rpc_srv] ../source3/rpc_server/spoolss/srv_spoolss_nt.c:490(get_printer_snum) short name:dalek-color [2017/03/22 11:57:03.734890, 10, pid=21933, effective(54604, 5000), real(54604, 0)] ../source3/printing/printing.c:469(print_job_find) print_job_find: looking up job 1 for share dalek-color [2017/03/22 11:57:03.734930, 10, pid=21933, effective(54604, 5000), real(54604, 0)] ../source3/printing/printing.c:497(print_job_find) print_job_find: returning system job -1 for jobid 1. [2017/03/22 11:57:03.735092, 5, pid=21933, effective(54604, 5000), real(54604, 0)] ../source3/printing/notify.c:327(send_spoolss_notify2_msg) send_spoolss_notify2_msg: appending message 0x01/0x16 for printer dalek-color to notify_queue_head [2017/03/22 11:57:03.735136, 1, pid=21933, effective(54604, 5000), real(54604, 0)] ../librpc/ndr/ndr.c:450(ndr_print_function_debug) spoolss_WritePrinter: struct spoolss_WritePrinter out: struct spoolss_WritePrinter num_written : * num_written : 0x000128ad (75949) result : WERR_OK [2017/03/22 11:57:03.735233, 5, pid=21933, effective(54604, 5000), real(54604, 0), class=rpc_srv] ../source3/rpc_server/srv_pipe.c:1496(api_rpcTNP) api_rpcTNP: called spoolss successfully [2017/03/22 11:57:03.735261, 10, pid=21933, effective(54604, 5000), real(54604, 0), class=rpc_srv] ../source3/rpc_server/srv_pipe.c:1501(api_rpcTNP) api_rpcTNP: rpc input buffer underflow (parse error?) [2017/03/22 11:57:03.735278, 10, pid=21933, effective(54604, 5000), real(54604, 0)] ../lib/util/util.c:555(dump_data) [0000] 76 0D 0A 63 76 69 7B 6D 6F 76 65 74 6F 20 73 68 v..cvi{m oveto sh [0010] 6F 77 7D 72 65 70 65 61 74 20 73 68 6F 77 70 61 ow}repea t showpa [0020] 67 65 20 71 75 69 74 7D 7B 63 6C 65 61 72 74 6F ge quit} {clearto <snip 0x11A0 lines> [11A00] 31 20 73 65 74 6C 69 6E 65 6A 6F 69 6E 0D 0A 2F 1 setlin ejoin../ [11A10] 6D 79 73 65 74 75 70 20 5B 20 30 20 2D 37 32 20 mysetup [ 0 -72 [11A20] 36 30 30 20 56 20 2D 37 32 20 36 30 30 20 56 20 600 V -7 2 600 V [11A30] 30 20 36 30 30 2E 30 30 30 39 31 20 37 38 30 2E 0 600.00 091 780. [11A40] 30 30 30 39 31 20 5D 20 64 65 66 20 0D 0A 25 25 00091 ] def ..%% [11A50] 45 6E 64 53 65 74 75 70 0D 0A 0D 0A 75 73 65 72 EndSetup ....user [11A60] 64 69 63 74 20 62 65 67 69 6E 20 2F 65 68 73 61 dict beg in /ehsa [11A70] 76 65 20 73 61 76 65 20 64 65 66 20 65 6E 64 0D ve save def end. [11A80] 0A 00 00 00 AD 28 01 00 .....(.. [2017/03/22 11:57:03.887327, 4, pid=21933, effective(54604, 5000), real(54604, 0)] ../source3/smbd/sec_ctx.c:439(pop_sec_ctx) pop_sec_ctx (54604, 5000) - sec_ctx_stack_ndx = 0 [2017/03/22 11:57:03.887375, 1, pid=21933, effective(54604, 5000), real(54604, 0)] ../librpc/ndr/ndr.c:413(ndr_print_debug) &r: struct ncacn_packet rpc_vers : 0x05 (5) rpc_vers_minor : 0x00 (0) ptype : DCERPC_PKT_RESPONSE (2) pfc_flags : 0x03 (3) 1: DCERPC_PFC_FLAG_FIRST 1: DCERPC_PFC_FLAG_LAST 0: DCERPC_PFC_FLAG_PENDING_CANCEL_OR_HDR_SIGNING 0: DCERPC_PFC_FLAG_CONC_MPX 0: DCERPC_PFC_FLAG_DID_NOT_EXECUTE 0: DCERPC_PFC_FLAG_MAYBE 0: DCERPC_PFC_FLAG_OBJECT_UUID drep: ARRAY(4) [0] : 0x10 (16) [1] : 0x00 (0) [2] : 0x00 (0) [3] : 0x00 (0) frag_length : 0x0020 (32) auth_length : 0x0000 (0) call_id : 0x00000005 (5) u : union dcerpc_payload(case 2) response: struct dcerpc_response alloc_hint : 0x00000008 (8) context_id : 0x0000 (0) cancel_count : 0x00 (0) _pad : DATA_BLOB length=0 stub_and_verifier : DATA_BLOB length=8 [0000] AD 28 01 00 00 00 00 00 .(...... [2017/03/22 11:57:03.887566, 10, pid=21933, effective(54604, 5000), real(54604, 0)] ../source3/rpc_server/rpc_server.c:509(named_pipe_packet_process) Sending 1 fragments in a total of 8 bytes I tested this also with Samba 4.3.12, and with Windows 7 and Windows 10, no difference. I hope you can reproduce it in your environment or maybe advise on which logs I should submit and how. Files of log level 10 size are huge. testparm -sv # Global parameters [global] bind interfaces only = No config backend = file dos charset = CP850 enable core files = Yes interfaces = multicast dns register = Yes netbios aliases = netbios name = PRINT-SANE2-HOT netbios scope = realm = AD.CORP.ACME.COM server services = s3fs, rpc, nbt, wrepl, ldap, cldap, kdc, drepl, winbindd, ntp_signd, kcc, dnsupdate, dns server string = Samba 4.5.6-Debian share backend = classic unix charset = UTF-8 workgroup = ACME browse list = Yes domain master = No enhanced browsing = Yes lm announce = Auto lm interval = 60 local master = No os level = 20 preferred master = Auto allow dns updates = secure only dns forwarder = dns update command = /usr/sbin/samba_dnsupdate machine password timeout = 604800 nsupdate command = /usr/bin/nsupdate -g rndc command = /usr/sbin/rndc spn update command = /usr/sbin/samba_spnupdate mangle prefix = 1 mangling method = hash2 max stat cache size = 256 stat cache = Yes client ldap sasl wrapping = sign ldap admin dn = ldap connection timeout = 2 ldap delete dn = No ldap deref = auto ldap follow referral = Auto ldap group suffix = ldap idmap suffix = ldap machine suffix = ldap page size = 1000 ldap passwd sync = no ldap replication sleep = 1000 ldap server require strong auth = allow_sasl_over_tls ldap ssl = start tls ldap ssl ads = No ldap suffix = ldap timeout = 15 ldap user suffix = lock spin time = 200 oplock break wait time = 0 smb2 leases = Yes debug class = No debug hires timestamp = Yes debug pid = No debug prefix timestamp = No debug uid = No ldap debug level = 0 ldap debug threshold = 10 log file = logging = log level = 2 max log size = 20000 syslog = 1 syslog only = No timestamp logs = Yes abort shutdown script = add group script = add machine script = add user script = add user to group script = allow nt4 crypto = No delete group script = delete user from group script = delete user script = domain logons = No enable privileges = Yes init logon delay = 100 init logon delayed hosts = logon drive = logon home = \\%N\%U logon path = \\%N\%U\profile logon script = reject md5 clients = No set primary group script = shutdown script = add share command = afs token lifetime = 604800 afs username map = allow insecure wide links = No async smb echo handler = No auto services = cache directory = /var/cache/samba change notify = Yes change share command = cluster addresses = clustering = No config file = ctdbd socket = ctdb locktime warn threshold = 0 ctdb timeout = 0 default service = delete share command = homedir map = auto.home kernel change notify = Yes lock directory = /var/run/samba log writeable files on exit = No message command = nbt client socket address = 0.0.0.0 ncalrpc dir = /var/run/samba/ncalrpc NIS homedir = No nmbd bind explicit broadcast = Yes panic action = perfcount module = pid directory = /var/run/samba registry shares = No remote announce = remote browse sync = reset on zero vc = No smbd profiling level = off state directory = /var/lib/samba usershare allow guests = No usershare max shares = 100 usershare owner only = Yes usershare path = /var/lib/samba/usershares usershare prefix allow list = usershare prefix deny list = usershare template share = utmp = No utmp directory = wtmp directory = addport command = addprinter command = cups connection timeout = 30 cups encrypt = No cups server = localhost:631 deleteprinter command = disable spoolss = No enumports command = iprint server = load printers = Yes lpq cache time = 30 os2 driver map = printcap cache time = 60 printcap name = cups show add printer wizard = Yes cldap port = 389 client ipc max protocol = default client ipc min protocol = default client max protocol = default client min protocol = CORE client use spnego = Yes dcerpc endpoint servers = epmapper, wkssvc, rpcecho, samr, netlogon, lsarpc, drsuapi, dssetup, unixinfo, browser, eventlog6, backupkey, dnsserver defer sharing violations = Yes dgram port = 138 disable netbios = Yes enable asu support = No eventlog list = large readwrite = Yes max mux = 50 max ttl = 259200 max wins ttl = 518400 max xmit = 16644 min receivefile size = 0 min wins ttl = 21600 name resolve order = lmhosts wins host bcast nbt port = 137 nt pipe support = Yes nt status support = Yes read raw = Yes rpc big endian = No server max protocol = SMB2 server min protocol = LANMAN1 server multi channel support = No smb2 max credits = 8192 smb2 max read = 8388608 smb2 max trans = 8388608 smb2 max write = 8388608 smb ports = 445 139 svcctl list = time server = No unicode = Yes unix extensions = No use spnego = Yes web port = 901 write raw = Yes algorithmic rid base = 1000 allow dcerpc auth level connect = No allow trusted domains = Yes auth methods = check password script = client ipc signing = default client lanman auth = No client NTLMv2 auth = Yes client plaintext auth = No client schannel = Auto client signing = if_required client use spnego principal = No dedicated keytab file = encrypt passwords = Yes guest account = nobody kerberos method = default kpasswd port = 464 krb5 port = 88 lanman auth = No log nt token command = map to guest = Bad Uid map untrusted to domain = No ntlm auth = No ntp signd socket directory = /var/lib/samba/ntp_signd null passwords = No obey pam restrictions = No old password allowed period = 60 pam password change = No passdb backend = tdbsam passdb expand explicit = No passwd chat = *new*password* %n\n *new*password* %n\n *changed* passwd chat debug = No passwd chat timeout = 2 passwd program = password hash gpg key ids = password server = * preload modules = private dir = /var/lib/samba/private raw NTLMv2 auth = No rename user script = restrict anonymous = 2 root directory = samba kcc command = /usr/sbin/samba_kcc security = ADS server role = auto server schannel = Auto server signing = if_required smb passwd file = /etc/samba/smbpasswd tls cafile = tls/ca.pem tls certfile = tls/cert.pem tls crlfile = tls dh params file = tls enabled = Yes tls keyfile = tls/key.pem tls priority = NORMAL:-VERS-SSL3.0 tls verify peer = as_strict_as_possible unix password sync = No username level = 0 username map = username map cache time = 0 username map script = aio max threads = 100 deadtime = 15 getwd cache = Yes hostname lookups = No keepalive = 300 max disk size = 0 max open files = 16424 max smbd processes = 0 name cache timeout = 660 socket options = TCP_NODELAY use mmap = Yes get quota command = host msdfs = Yes set quota command = create krb5 conf = Yes idmap backend = tdb idmap cache time = 604800 idmap gid = idmap negative cache time = 120 idmap uid = include system krb5 conf = Yes neutralize nt4 emulation = No reject md5 servers = No require strong key = Yes template homedir = /home/%D/%U template shell = /bin/false winbind cache time = 300 winbindd privileged socket directory = /var/lib/samba/winbindd_privileged winbindd socket directory = /var/run/samba/winbindd winbind enum groups = No winbind enum users = No winbind expand groups = 0 winbind max clients = 200 winbind max domain connections = 1 winbind nested groups = Yes winbind normalize names = No winbind nss info = template winbind offline logon = No winbind reconnect delay = 30 winbind refresh tickets = No winbind request timeout = 60 winbind rpc only = No winbind sealed pipes = Yes winbind separator = \ winbind trusted domains only = No winbind use default domain = No dns proxy = Yes wins hook = wins proxy = No wins server = wins support = No spoolssd:prefork_child_min_life = 120 spoolssd:prefork_min_children = 5 rpc_daemon:spoolssd = fork rpc_server:spoolss = external spoolss: architecture = Windows x64 idmap config * : backend = tdb comment = path = administrative share = No browseable = Yes case sensitive = Auto default case = lower delete veto files = No hide dot files = Yes hide files = hide special files = No hide unreadable = No hide unwriteable files = No mangled names = Yes mangling char = ~ map archive = Yes map hidden = No map readonly = yes map system = No preserve case = Yes short preserve case = Yes store dos attributes = No veto files = veto oplock files = blocking locks = Yes csc policy = manual fake oplocks = Yes kernel oplocks = No kernel share modes = Yes level2 oplocks = Yes locking = Yes oplock contention limit = 2 oplocks = Yes posix locking = Yes strict locking = Auto afs share = No available = Yes copy = delete readonly = No dfree cache time = 0 dfree command = directory name cache size = 100 dmapi support = No dont descend = dos filemode = No dos filetime resolution = No dos filetimes = Yes fake directory create times = No follow symlinks = Yes fstype = NTFS include = /etc/samba/print.conf magic output = magic script = postexec = preexec = preexec close = No root postexec = root preexec = /usr/bin/renice +18 -p %d root preexec close = No spotlight = No volume = wide links = No cups options = raw default devmode = Yes force printername = Yes lppause command = lpq command = %p lpresume command = lprm command = max print jobs = 1000 max reported print jobs = 0 printable = No print command = printer name = printing = cups printjob username = %U print notify backchannel = No queuepause command = queueresume command = use client driver = No acl allow execute always = No acl check permissions = Yes acl map full control = Yes durable handles = Yes ea support = No map acl inherit = No nt acl support = Yes profile acls = No access based share enum = No acl group control = No admin users = "@domain admins" create mask = 0744 directory mask = 0755 force create mode = 0000 force directory mode = 0000 force group = force unknown acl user = No force user = guest ok = No guest only = No hosts allow = hosts deny = inherit acls = Yes inherit owner = No inherit permissions = No invalid users = read list = read only = Yes smb encrypt = default valid users = write list = aio read size = 0 aio write behind = aio write size = 0 allocation roundup size = 1048576 block size = 1024 max connections = 0 min print space = 0 strict allocate = No strict rename = No strict sync = No sync always = No use sendfile = No write cache size = 0 msdfs proxy = msdfs root = No msdfs shuffle referrals = No ntvfs handler = unixuid, default vfs objects = [printers] comment = All Printers path = /var/spool/samba browseable = No printable = Yes admin users = "@domain admins" guest ok = Yes [print$] comment = Printer Drivers path = /var/lib/samba/printers admin users = "@domain admins" create mask = 02777 directory mask = 02777 force create mode = 02775 force directory mode = 02775 write list = "@domain admins" [admin$] comment = Admin Share path = /vol/admin browseable = No include = /etc/samba/usershare.conf guest ok = Yes read only = No
Sorry about the slow response here, Alex. Please provide a network trace captured across the print attempt, and also a separate capture across the manual print$ I/O.
Created attachment 13151 [details] server-machine.dump.zip Here is a TCP dump collected on the server (zipped). I press "Print" at about 15:53, the resulting 7Mb print job finishes spooling by 15:57. Please note lots of tiny fragments. For comparison, here is a dump of 8Mb file copy operation to the same server, under print$\x64 share. It takes 2 seconds. I can't attach it due to 5Mb attach limit, so here is a Drive link: https://docs.google.com/uc?export=download&id=0B-vV7Qx5rjpELW5hRnVwRDdEbjQ
Thanks. Will hopefully get a chance to look at this next week.
(In reply to Alex K from comment #2) > Created attachment 13151 [details] > server-machine.dump.zip > > Here is a TCP dump collected on the server (zipped). I press "Print" at > about 15:53, the resulting 7Mb print job finishes spooling by 15:57. > > Please note lots of tiny fragments. > > For comparison, here is a dump of 8Mb file copy operation to the same > server, under print$\x64 share. It takes 2 seconds. > > I can't attach it due to 5Mb attach limit, so here is a Drive link: > https://docs.google.com/uc?export=download&id=0B-vV7Qx5rjpELW5hRnVwRDdEbjQ Both network traces appear to only capture client->server traffic, so it's difficult to work out what's going on.... One thing that does stand out is, as you've pointed out, under spoolss the print job is being transferred as a batch of 4K requests, whereas the regular copy uses (perhaps pipelined) 64K SMB2 I/Os. Aside from that there appear to be some delays around the RpcRemoteFindFirstPrinterChangeNotificationEx requests. You're (gladly) running with "print notify backchannel" disabled, which should ensure that these requests are handled quickly on the server, but perhaps it triggers a slow error path on the client. Could you please provide captures which include both client->server *and* server->client traffic, this will make it a lot easier to see where the high latencies are occurring.
Created attachment 13181 [details] printing.dump.zip Sorry, don't know how I didn't notice that. Here is a dump of a full session. Same file printed, same hosts.
Created attachment 13182 [details] copying.dump.zip Here is a dump of a copy process. I'm copying a 2Mb file so that I could attach it to Bugzilla. It was over in less than 2 seconds.
Created attachment 13184 [details] request latency stats across print job
Created attachment 13185 [details] request latency stats across copy
At a high level, the request latency stats appear pretty consistent. There don't appear to be any huge outliers for the print job. What does stand out is the sheer number of write requests (1940) for the print job. Of those, I count 1884 write requests of size 4280 (includes DCE/RPC + spoolss overhead), which would imply that the client is transferring ~7MB to the print server. @Andreas / Günther: are your aware of any mechanisms that might be available to tell a spoolss client to use larger print job I/Os?
Created attachment 13186 [details] request latency stats across copy (fixed)
Oh hang on a minute. If it's transferring the data using DCE/RPC PDU's then 4280 bytes is the max PDU size DCE will use: In the Samba source we have: #define TSTREAM_SMBXCLI_NP_MAX_BUF_SIZE 4280 So this is a Windows client restriction - the only way to fix this would be for it to not transfer the print data over DCE/RPC. I don't know how to make it do that.
I've found some old discussion about this on the jcifs mailing list, but I can't tell if it is related at all or can be used at all. https://lists.samba.org/archive/jcifs/2010-April/009202.html , this quote in particular: > I've dug in and it's looks to me that there is a significant performance win to using AndX blocks to send large chunks of data. A printer opened via OpenPrinterEx() from a Windows system subsequently writes data in 64KB chunks using "Write AndX". I believe this is a performance win because the remote system itself is allowed to fill up its own local buffer and then do a "slow" transaction with a larger chunk of data. The amount of data going over the wire is the same, but since heavy processing is done only after 64KB is sent (vs. about 4KB with normal Write) there is a net improvement.