If you try to list the print jobs queued on a server, you don't see any. For example, if you send a job to a paused queue, the job may be listed immediately after submission, but it disappears shortly afterwards (possibly only after refreshing the view). I have reproduced this on Linux and Solaris, with and without CUPS turned on, using an XP client and also using the smbclient queue command.
Here is a patch for fixing this bug:

Index: printing/printing.c
===================================================================
RCS file: /cvsroot/samba/source/printing/printing.c,v
retrieving revision 1.139.2.38
diff -u -r1.139.2.38 printing.c
--- printing/printing.c 12 Nov 2003 01:51:10 -0000 1.139.2.38
+++ printing/printing.c 13 Nov 2003 14:12:59 -0000
@@ -850,7 +850,7 @@
 	size_t i;
 	uint qcount;
 
-	if (max_reported_jobs < pts->qcount)
+	if (max_reported_jobs && max_reported_jobs < pts->qcount)
 		pts->qcount = max_reported_jobs;
 
 	qcount = pts->qcount;
@@ -2147,7 +2147,7 @@
 	len = 0;
 	for( i = 0; i < qcount; i++) {
 		uint32 qjob, qsize, qpage_count, qstatus, qpriority, qtime;
-		len += tdb_unpack(data.dptr + 4 + len, data.dsize - len, NULL, "ddddddff",
+		len += tdb_unpack(data.dptr + 4 + len, data.dsize - len, "ddddddff",
 				&qjob,
 				&qsize,
 				&qpage_count,
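To illustrate what the first hunk changes, here is a minimal standalone sketch. It assumes that a limit of 0 means "no limit on reported jobs", which is what the added `max_reported_jobs &&` test implies. The function names are hypothetical and are not part of printing.c.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the patched test: a limit of 0 is
 * assumed to mean "no limit", so the queue count is left alone. */
size_t clamp_reported_jobs(size_t qcount, size_t max_reported_jobs)
{
	if (max_reported_jobs != 0 && max_reported_jobs < qcount)
		return max_reported_jobs;
	return qcount;
}

/* The unpatched test, kept for comparison: with a limit of 0 every
 * non-empty queue is truncated to 0 entries. */
size_t clamp_reported_jobs_unpatched(size_t qcount, size_t max_reported_jobs)
{
	if (max_reported_jobs < qcount)
		return max_reported_jobs;
	return qcount;
}
```

With a limit of 0 and three jobs queued, the unpatched variant returns 0 while the patched one returns 3, which would match the symptom of a queue that always looks empty.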
Looks right. Thanks very much for the patch. I'll get this in today.
I have now tried the patch, and the net effect is to crash the smbd process handling the connection when I try to view the queue.

[2003/11/18 14:34:33, 0] lib/fault.c:fault_report(36)
  ===============================================================
[2003/11/18 14:34:37, 0] lib/fault.c:fault_report(37)
  INTERNAL ERROR: Signal 11 in pid 1147 (3.0.0)
  Please read the appendix Bugs of the Samba HOWTO collection
[2003/11/18 14:34:39, 0] lib/fault.c:fault_report(39)
  ===============================================================
[2003/11/18 14:34:40, 0] lib/util.c:smb_panic(1400)
  PANIC: internal error

This is true on plain 3.0.1pre3 as well. If I remove the second part of the patch, it doesn't crash, but I still don't see any print jobs.
Can you get me a backtrace? I can't reproduce this.
I haven't succeeded in getting a core yet or any other debugging, not least because it effectively freezes my machine (a Solaris 9 Ultra 5) until the process crashes. I suspect some sort of memory leak, or at least a massive increase in memory usage.
One thing I have just noticed while reproducing the problem: it only crashes if there is something in the queue, e.g. the queue is disabled at the server and you have just sent a job.
After some more consideration, the second part of the patch is definitely correct. Also, I have been able to enumerate queue listings using both the lprng and cups printing backends. I'm guessing this is a Solaris-only bug. Can you send me your smb.conf and the output from lpq (with 1 or 2 jobs in the queue)? Thanks.
A sample lpstat output:

dcm234-55297 dcl0bjc@wycliffe 488004 Nov 26 12:11

smb.conf file (actually from testparm, names/addresses obscured):

[global]
	workgroup = XXXXXX
	server string = Samba Server
	encrypt passwords = No
	log level = 1
	log file = /opt/samba/3.0.1pre3/var/log.%m
	lpq cache time = 30
	printcap name = /opt/samba/3.0.1pre3/lib/printcap
	dns proxy = No
	wins server = 333.333.333.333
	printer admin = root, xxxxxxx, xxxxxxx
	hosts allow = .xxx.xxx.xxx, 127., 444.444.444.444

[printers]
	comment = All Printers
	path = /tmp/samba
	printable = Yes
	browseable = No

[print$]
	path = /opt/samba/print
	write list = root, xxxxxxx, xxxxxxx
Actually, further testing suggests it only crashes if exactly one job is in the queue (this is using smbclient's queue command), though no jobs are shown if there are 2 or more jobs queuing.
Your lpstat output looks different from what is in the lpq_parse.c comments for SYSV printing:

/*********************************************************
here is an example of "lpstat -o dcslw" output under sysv

dcslw-896 tridge 4712 Dec 20 10:30:30 on dcslw
dcslw-897 tridge 4712 Dec 20 10:30:30 being held
*********************************************************/
BTW, can you get a crash using something other than smbclient?
The basic format is the same, though some of the spacing and date formats are different. Also, reporting of print queues worked successfully with earlier versions of Samba. I am pretty sure I have checked the crash from Windows, but it will be another week before I can double-check.
The crash still happens in 3.0.1rc1, from smbclient or directly from Windows. With the --enable-developer --enable-debug flags I get slightly different logging:

[2003/12/10 10:03:52, 0] lib/fault.c:fault_report(36)
  ===============================================================
[2003/12/10 10:03:55, 0] lib/fault.c:fault_report(37)
  INTERNAL ERROR: Signal 11 in pid 24828 (3.0.1rc1)
  Please read the appendix Bugs of the Samba HOWTO collection
[2003/12/10 10:03:56, 0] lib/fault.c:fault_report(39)
  ===============================================================
[2003/12/10 10:03:56, 0] lib/util.c:smb_panic(1383)
  smb_panic: clobber_region() last called from [lp_servicenumber(4052)]
[2003/12/10 10:03:58, 0] lib/util.c:smb_panic(1400)
  PANIC: internal error

I also tried hacking the lpq command by using echos to give the results specified in lpq_parse.c; 2 lines returns nothing, one line crashes.
3.0.1 seems to have stopped crashing, though I still don't get anything when I list print queues.
I tried the patch at http://lists.samba.org/archive/samba-technical/2003-December/033475.html. I had trouble patching the middle file (though the changes there look to be irrelevant to the Solaris case anyway), but the bits I did apply successfully (including the patches to the other two files) seem to fix the problem.
I have duplicated this problem on HP-UX v11.11 with LPRng 3.8.21 (running the newly released Samba 3.0.1; bugzilla would not allow me to change the Samba version above). I get no output when viewing queues, although the logs say things like this:

[2003/12/26 17:03:16, 3] printing/print_generic.c:print_run_command(84)
  Running the command `/opt/LPRng/bin/lpq -Pcpnoff' gave 0
[2003/12/26 17:03:16, 3] printing/printing.c:print_queue_update(1015)
  1 job in queue for cpnoff

...clearly showing 1 job was found. Occasionally Windows will display the job for a second or less, but normally the queue looks empty (even though there are several jobs in it).
Thanks to Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de> for finding the bug. This is a byte order problem in printing.c (which explains why people saw it on sparc Solaris and HP-UX).

----------------------------------
(mail from Juergen):

I had problems similar to those in Bug #660 with printing on Solaris:

- uname -a
  SunOS bseis 5.8 Generic_108528-22 sun4u sparc SUNW,Sun-Fire-280R

- smbd -V
  Version 3.0.1

- From a level 5 log:

[2004/01/14 11:19:30, 5] ../source/printing/printing.c:get_stored_queue_info(2141)
  get_stored_queue_info: qcount = 16777216, extra_count = 0
[2004/01/14 11:19:36, 0] ../source/lib/fault.c:fault_report(36)
  ===============================================================
[2004/01/14 11:19:36, 0] ../source/lib/fault.c:fault_report(37)
  INTERNAL ERROR: Signal 11 in pid 1357 (3.0.1)
  Please read the appendix Bugs of the Samba HOWTO collection
[2004/01/14 11:19:36, 0] ../source/lib/fault.c:fault_report(39)
  ===============================================================
[2004/01/14 11:19:36, 0] ../source/lib/util.c:smb_panic(1400)
  PANIC: internal error

16777216 == 0x1000000, so it is a byte order problem.

This patch gets rid of the problem. Someone with more knowledge should look for other places where data is copied from a tdb directly instead of going through tdb_unpack().

--- source/printing/printing.c.DIST 2003-12-04 22:38:38.000000000 +0100
+++ source/printing/printing.c 2004-01-14 11:19:26.096621000 +0100
@@ -2126,8 +2126,6 @@
 
 	data = tdb_fetch(pdb->tdb, key);
 
-	if (data.dptr == NULL || data.dsize < 4)
-		qcount = 0;
-	else
-		memcpy(&qcount, data.dptr, 4);
+	if (data.dptr && data.dsize >= sizeof(qcount))
+		len += tdb_unpack(data.dptr + len, data.dsize - len, "d", &qcount);
 
 	/* Get the changed jobs list. */
@@ -2149,8 +2147,7 @@
 
 	/* Retrieve the linearised queue data. */
 
-	len = 0;
 	for( i = 0; i < qcount; i++) {
 		uint32 qjob, qsize, qpage_count, qstatus, qpriority, qtime;
-		len += tdb_unpack(data.dptr + 4 + len, data.dsize - len, "ddddddff",
+		len += tdb_unpack(data.dptr + len, data.dsize - len, "ddddddff",
 				&qjob,
 				&qsize,
----------------------------------

Note that I've also found several more memcpy()'s in printing.c that will need to be fixed before 3.0.2rc1.
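For anyone following along, here is a small standalone sketch of why a raw memcpy() turns a count of 1 into 16777216. It assumes the record stores the count least-significant byte first, which is what the value in the log suggests. The helper names are hypothetical and do not exist in Samba; the big-endian read is emulated with shifts so the sketch behaves the same on any host.

```c
#include <stdint.h>

/* Store a 32-bit value least-significant byte first (the assumed
 * on-disk layout of the queue count). */
void pack_le32(uint8_t *buf, uint32_t v)
{
	buf[0] = (uint8_t)(v & 0xff);
	buf[1] = (uint8_t)((v >> 8) & 0xff);
	buf[2] = (uint8_t)((v >> 16) & 0xff);
	buf[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Portable decode: assembles the value byte by byte, so the result
 * does not depend on the byte order of the host CPU. */
uint32_t unpack_le32(const uint8_t *buf)
{
	return (uint32_t)buf[0]
	     | ((uint32_t)buf[1] << 8)
	     | ((uint32_t)buf[2] << 16)
	     | ((uint32_t)buf[3] << 24);
}

/* What memcpy(&qcount, buf, 4) effectively yields on a big-endian
 * host such as sparc: the first byte becomes the most significant. */
uint32_t read_as_big_endian(const uint8_t *buf)
{
	return ((uint32_t)buf[0] << 24)
	     | ((uint32_t)buf[1] << 16)
	     | ((uint32_t)buf[2] << 8)
	     | (uint32_t)buf[3];
}
```

Packing a count of 1 and reading it back with unpack_le32() gives 1 everywhere, while read_as_big_endian() gives 0x01000000, i.e. the 16777216 seen in the level 5 log. A loop bounded by that value would run far past the end of the record.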
Changes checked into 3.0 and HEAD. Should be fixed in 3.0.2rc1.
Sorry for the spam; cleaning up the database to prevent unnecessary reopens of bugs.