Bug 9942 - Printers cannot be exported if CUPS server has more than a 1000 of them
Summary: Printers cannot be exported if CUPS server has more than a 1000 of them
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.6
Classification: Unclassified
Component: Printing (show other bugs)
Version: unspecified
Hardware: All Linux
: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-06-10 19:47 UTC by Alex K
Modified: 2014-03-25 09:23 UTC (History)
3 users (show)

See Also:


Attachments
smb.conf (1003 bytes, text/plain)
2013-06-10 19:48 UTC, Alex K
no flags Details
log.smbd (74 bytes, text/plain)
2013-06-10 20:00 UTC, Alex K
no flags Details
Test patch. (4.07 KB, patch)
2014-02-11 00:40 UTC, Jeremy Allison
no flags Details
git-am fix for 4.1.next and 4.0.next (4.75 KB, patch)
2014-02-18 20:15 UTC, Jeremy Allison
asn: review+
Details
fix for 4.1.next, cherry-picked from master (4.72 KB, patch)
2014-02-24 19:22 UTC, David Disseldorp
jra: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Alex K 2013-06-10 19:47:51 UTC
This was discovered on Ubuntu Linux x64 12.04 with Samba 3.6.15 and CUPS 1.5.4, but is easily reproducible on Ubuntu 12.10 with pre-installed CUPS 1.6.1 and Samba 3.6.6.

Basically, when there is less than a 1000 printers, cupsaddsmb works normally. When more than 1000 printers is on CUPS server, Samba fails to export new ones with WERR_SEM_TIMEOUT. 

I cannot figure out precisely the number of printers after which Samba starts to fail, but it definitely happens with 2000 printers. 

How to reproduce on Ubuntu 12.10 with default settings:
1. Enable CUPS printing support in /etc/samba/smb.conf
2. Add Windows drivers to CUPS
3. Add 2000 printers to CUPS:
for i in {1..2000}; do lpadmin -p test${i} -v socket://test -m lsb/usr/ghostscript/model/pxlcolor.ppd -L "test printer $i" -u allow:all -E; done
4. Wait 60s for Samba to re-read the list of printers from CUPS
5. Export one of the printers to Samba:
cupsaddsmb -v -h localhost test777

Samba will fail to export it with WERR_SEM_TIMEOUT error. Here is what I see on the screen:
<snip>
Running timed event "tevent_req_timedout" 0x7f4c3caef290
Running timed event "tevent_req_timedout" 0x7f4c3caef7e0
tstream_cli_np_destructor: cli_close failed on pipe \spoolss. Error was NT_STATUS_IO_TIMEOUT
cli_api_pipe failed: NT_STATUS_IO_TIMEOUT
result was WERR_SEM_TIMEOUT
Running timed event "tevent_req_timedout" 0x7f4c3caf0c10

Unable to install Windows 2000 printer driver files (1).
alex@samba-test:~$

Removing half of the printers makes cupsaddsmb work again.

Attached are log.smbd with log level = 10 and my smb.conf
When recording the log, I started Samba, and then tried to do cupsaddsmb -v -h localhost test777.
Comment 1 Alex K 2013-06-10 19:48:09 UTC
Created attachment 8957 [details]
smb.conf
Comment 2 Alex K 2013-06-10 20:00:02 UTC
Created attachment 8958 [details]
log.smbd

Log file did'nt attach the first time, re-adding it as URL. Now it begins at the moment when cupsaddsmb command started.
Comment 3 Alex K 2013-10-09 20:14:30 UTC
Tested on CUPS 1.6.4 and Samba 3.6.19, still the same behavior. 
Are any additional logs needed?
Comment 4 Alex K 2013-11-04 21:14:37 UTC
Here is a link to OVA virtual appliance with Ubuntu 13.04 on-board that reproduces this behavior (1.5Gb, for debug purposes only): 

https://docs.google.com/uc?export=download&id=0B-vV7Qx5rjpEckZvNEhnZHN2eFU

You should be able to open it with VirtualBox, VMWare, or any other software that supports open standard virtual appliances.

1. Login with username alex, password samba. 
2. Wait until cupsd finishes starting. You can check its status by using top: cupsd utilizes 100% CPU for 5-10 minutes, then returns back to normal load levels. Or, enumerae
3. Restart samba to make sure it has actual list of printers: 
sudo restart smbd
4. Try to export a printer with 
cupsaddsmb -v -h localhost test1001
(password is samba)
5. Notice slowness and WERR_SEM_TIMEOUT errors 
6. Delete half of the printers with 
for i in {1001..2000}; do lpadmin -x test1001; done
7. Restart samba with sudo restart smbd
8. Try exporting a printer again:
cupsaddsmb -v -h localhost test111
9. Exporting works this time and is significantly faster.
Comment 5 Alex K 2013-11-04 21:16:27 UTC
Step 2 should read:

2. Wait until cupsd finishes starting. You can check its status by using top:
cupsd utilizes 100% CPU for 5-10 minutes, then returns back to normal load
levels. Or, enumerate printers with lpstat -a. As soon as you're able to enumerate them, CUPS is up.
Comment 6 Jeremy Allison 2014-02-10 22:53:43 UTC
So it's actually an indexing (or lack of index :-) problem.

The new driver installs correctly, then sends the smbd a message
MSG_PRINTER_DRVUPGRADE telling it we've updated a driver.

Unfortunately the smbd then iterates over *all* printers,
looking for ones that are using this new driver. And it
does this every time a new driver is added.. :-(.

The "what driver is this printer using" operation is
very expensive, as it requires reading the entire printer
meta-info from the registry (enumerating all keys).

Might need a cache. I'll check into if this got fixed
in 4.0.x... :-).
Comment 7 Jeremy Allison 2014-02-11 00:40:28 UTC
Created attachment 9674 [details]
Test patch.

Ok, this compiles but isn't tested (I'm hoping you can do this :-).

What it does is forwards the MSG_PRINTER_DRVUPGRADE message from the smbd that is updating the driver to the background lpq daemon, allowing it to be bogged down doing the driver updates and not the smbd the user is connected to.

If you could compile this in and test it I'd appreciate it !

Cheers,

Jeremy.
Comment 8 Alex K 2014-02-11 17:05:11 UTC
I confirm that the patch works for me, I was able to successfully export 5 printers out of 2000, and then use them over SMB with Windows.
Thanks!
Comment 9 Jeremy Allison 2014-02-11 17:06:41 UTC
Woo Hoo ! Great news. Now I'll try and get this into 4.1.next, 4.0.next.

3.6.x is out of maintenance now (other than security fixes).

Thanks a *LOT* for the tests !

Jeremy.
Comment 10 Jeremy Allison 2014-02-18 20:15:04 UTC
Created attachment 9692 [details]
git-am fix for 4.1.next and 4.0.next

Patch that was applied to master. Cherry-picks cleanly into 4.1.next, 4.0.next.

Jeremy.
Comment 11 Andreas Schneider 2014-02-19 08:12:51 UTC
Karolin, please apply the patches to the relevant branches. Thanks!
Comment 12 Karolin Seeger 2014-02-21 10:27:32 UTC
(In reply to comment #11)
> Karolin, please apply the patches to the relevant branches. Thanks!

Pushed to autobuild-v4-0-test.

Patch does not apply on current v4-1-test (librpc/nbt: increase MAX_COMPONENTS limit for nbt_names.):

user@host:/data/git/samba/v4-1-test$ git am 0001-s3-printing-Fix-problem-with-server-taking-too-long-.patch
Applying: s3: printing: Fix problem with server taking too long to respond to a MSG_PRINTER_DRVUPGRADE message.
error: patch failed: source3/printing/nt_printing.c:75
error: source3/printing/nt_printing.c: patch does not apply
error: patch failed: source3/printing/queue_process.c:31
error: source3/printing/queue_process.c: patch does not apply
Patch failed at 0001 s3: printing: Fix problem with server taking too long to respond to a MSG_PRINTER_DRVUPGRADE message.

Re-assigning to Jeremy.
Comment 13 David Disseldorp 2014-02-24 19:20:37 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > Karolin, please apply the patches to the relevant branches. Thanks!
> 
> Pushed to autobuild-v4-0-test.
> 
> Patch does not apply on current v4-1-test (librpc/nbt: increase MAX_COMPONENTS
> limit for nbt_names.):
...

The patch appears to be missing context from d34cd6d0 (printing: Create default architecture directories on init.).

A straight cherry-pick of what's in master looks to be suitable for v4-1-test. Patch to follow...
Comment 14 David Disseldorp 2014-02-24 19:22:03 UTC
Created attachment 9717 [details]
fix for 4.1.next, cherry-picked from master
Comment 15 Jeremy Allison 2014-02-24 22:15:32 UTC
Comment on attachment 9717 [details]
fix for 4.1.next, cherry-picked from master

LGTM. Thanks David.
Comment 16 Jeremy Allison 2014-02-24 22:15:53 UTC
Re-assigning to Karolin for inclusion into 4.1.next.
Comment 17 Karolin Seeger 2014-03-10 14:30:41 UTC
Pushed to autobuild-v4-1-test.
Comment 18 Karolin Seeger 2014-03-25 09:23:10 UTC
Pushed to v4-1-test.
Closing out bug report.

Thanks!