Bug 660 - viewing jobs in printer queues broken (segfault on solaris 9)
Status: CLOSED FIXED
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: Printing
Version: 3.0.0
Hardware: Other other
Importance: P3 normal
Target Milestone: none
Assignee: Gerald (Jerry) Carter (dead mail address)
QA Contact:
URL:
Keywords:
Depends on:
Blocks: 807
Reported: 2003-10-21 06:30 UTC by Michael Young
Modified: 2005-08-24 10:19 UTC
Description Michael Young 2003-10-21 06:30:37 UTC
If you try to list the print jobs queued on a server, you don't see
any.  For example, if you send a job to a paused queue, the job may be listed
immediately after submission, but will disappear shortly afterwards (maybe
after refreshing the view). I have reproduced this on Linux and Solaris, with
and without CUPS turned on, using an XP client and also using the smbclient
queue command.
Comment 1 SATOH Fumiyasu 2003-11-13 06:16:28 UTC
Here is a patch for fixing this bug:

Index: printing/printing.c
===================================================================
RCS file: /cvsroot/samba/source/printing/printing.c,v
retrieving revision 1.139.2.38
diff -u -r1.139.2.38 printing.c
--- printing/printing.c	12 Nov 2003 01:51:10 -0000	1.139.2.38
+++ printing/printing.c	13 Nov 2003 14:12:59 -0000
@@ -850,7 +850,7 @@
 	size_t i;
 	uint qcount;
 
-	if (max_reported_jobs < pts->qcount)
+	if (max_reported_jobs && max_reported_jobs < pts->qcount)
 		pts->qcount = max_reported_jobs;
 	qcount = pts->qcount;
 
@@ -2147,7 +2147,7 @@
 	len = 0;
 	for( i  = 0; i < qcount; i++) {
 		uint32 qjob, qsize, qpage_count, qstatus, qpriority, qtime;
-		len += tdb_unpack(data.dptr + 4 + len, data.dsize - len, NULL, "ddddddff",
+		len += tdb_unpack(data.dptr + 4 + len, data.dsize - len, "ddddddff",
 				&qjob,
 				&qsize,
 				&qpage_count,
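
A minimal sketch of what the first hunk changes, assuming (as the smb.conf documentation for "max reported jobs" states) that a value of 0 means "no limit". The function names below are hypothetical and exist only for this illustration; they are not taken from printing.c.

```c
#include <stddef.h>

/* Pre-patch check: a limit of 0 is treated as an ordinary upper bound,
 * so every non-empty queue is truncated to zero reported jobs. */
size_t clamp_qcount_old(size_t qcount, size_t max_reported_jobs)
{
	if (max_reported_jobs < qcount)
		qcount = max_reported_jobs;
	return qcount;
}

/* Patched check: a limit of 0 means "no limit", so the clamp is
 * skipped and the real job count is kept. */
size_t clamp_qcount_new(size_t qcount, size_t max_reported_jobs)
{
	if (max_reported_jobs && max_reported_jobs < qcount)
		qcount = max_reported_jobs;
	return qcount;
}
```

With the limit left at 0 and three jobs queued, the old check reports 0 jobs and the patched check reports 3. A non-zero limit behaves the same in both versions.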
Comment 2 Gerald (Jerry) Carter (dead mail address) 2003-11-13 06:44:13 UTC
Looks right.  Thanks very much for the patch.
I'll get this in today.
Comment 3 Michael Young 2003-11-18 09:56:58 UTC
I have now tried the patch, and the net effect is to crash the smbd process
handling the connection when I try to view the queue.
[2003/11/18 14:34:33, 0] lib/fault.c:fault_report(36)
  ===============================================================
[2003/11/18 14:34:37, 0] lib/fault.c:fault_report(37)
  INTERNAL ERROR: Signal 11 in pid 1147 (3.0.0)
  Please read the appendix Bugs of the Samba HOWTO collection
[2003/11/18 14:34:39, 0] lib/fault.c:fault_report(39)
  ===============================================================
[2003/11/18 14:34:40, 0] lib/util.c:smb_panic(1400)
  PANIC: internal error
This is true on plain 3.0.1pre3 as well. If I remove the second part of the
patch, it doesn't crash, but I still don't see any print jobs.
Comment 4 Gerald (Jerry) Carter (dead mail address) 2003-11-21 21:46:03 UTC
Can you get me a backtrace?  I can't reproduce
this.
Comment 5 Michael Young 2003-11-24 02:59:27 UTC
I haven't succeeded in getting a core yet or any other debugging, not least
because it effectively freezes my machine (a solaris 9 Ultra 5) until the
process crashes. I suspect some sort of memory leak, or at least a massive
increase in memory usage.
Comment 6 Michael Young 2003-11-24 09:02:03 UTC
One thing I have just noticed while reproducing the problem: it only crashes if
there is something in the queue, e.g. the queue is disabled at the server and you
have just sent a job.
Comment 7 Gerald (Jerry) Carter (dead mail address) 2003-11-25 13:55:47 UTC
After some more consideration, the second part of 
the patch is definitely correct.  Also, I have been able
to enumerate queue listings using both the lprng and cups 
printing backends.  I'm guessing this is a Solaris-only bug.

Can you send me your smb.conf and the output from lpq 
(with 1 or 2 jobs in the queue)?  Thanks.
Comment 8 Michael Young 2003-11-26 04:43:10 UTC
a sample lpstat output
dcm234-55297           dcl0bjc@wycliffe  488004   Nov 26 12:11

smb.conf file (actually from testparm, names/addresses obscured)
[global]
	workgroup = XXXXXX
	server string = Samba Server
	encrypt passwords = No
	log level = 1
	log file = /opt/samba/3.0.1pre3/var/log.%m
	lpq cache time = 30
	printcap name = /opt/samba/3.0.1pre3/lib/printcap
	dns proxy = No
	wins server = 333.333.333.333
	printer admin = root, xxxxxxx, xxxxxxx
	hosts allow = .xxx.xxx.xxx, 127., 444.444.444.444

[printers]
	comment = All Printers
	path = /tmp/samba
	printable = Yes
	browseable = No

[print$]
	path = /opt/samba/print
	write list = root, xxxxxxx, xxxxxxx
Comment 9 Michael Young 2003-11-26 05:26:01 UTC
Actually, further testing suggests it only crashes if exactly one job is in the
queue (this is using smbclient's queue command), though no jobs are shown if
there are 2 or more jobs queuing.
Comment 10 Gerald (Jerry) Carter (dead mail address) 2003-11-29 20:12:31 UTC
Your lpstat output looks different from what is in the lpq_parse.c 
comments for SYSV printing:

/*********************************************************
here is an example of "lpstat -o dcslw" output under sysv

dcslw-896  tridge  4712   Dec 20 10:30:30 on dcslw
dcslw-897  tridge  4712   Dec 20 10:30:30 being held
*********************************************************/
Comment 11 Gerald (Jerry) Carter (dead mail address) 2003-11-29 20:15:21 UTC
btw....can you get a crash using something other than smbclient ?
Comment 12 Michael Young 2003-11-30 01:16:02 UTC
The basic format is the same, though some of the spacing/date formats are
different. Also reporting of print queues worked successfully with earlier
versions of samba.

I am pretty sure I have checked the crash from windows, but it will be another
week before I can double check.
Comment 13 Michael Young 2003-12-10 02:50:42 UTC
The crash still happens in 3.0.1rc1, from smbclient or direct from windows.
With the --enable-developer --enable-debug flags I get slightly different logging,
[2003/12/10 10:03:52, 0] lib/fault.c:fault_report(36)
  ===============================================================
[2003/12/10 10:03:55, 0] lib/fault.c:fault_report(37)
  INTERNAL ERROR: Signal 11 in pid 24828 (3.0.1rc1)
  Please read the appendix Bugs of the Samba HOWTO collection
[2003/12/10 10:03:56, 0] lib/fault.c:fault_report(39)
  ===============================================================
[2003/12/10 10:03:56, 0] lib/util.c:smb_panic(1383)
  smb_panic: clobber_region() last called from [lp_servicenumber(4052)]
[2003/12/10 10:03:58, 0] lib/util.c:smb_panic(1400)
  PANIC: internal error

I also tried hacking the lpq command by using echos to give the results
specified in lpq_parse.c; 2 lines returns nothing, one line crashes.
Comment 14 Michael Young 2003-12-17 03:49:26 UTC
3.0.1 seems to have stopped crashing, though I still don't get anything when I
list print queues.
Comment 15 Michael Young 2003-12-18 03:42:10 UTC
I tried the patch at
http://lists.samba.org/archive/samba-technical/2003-December/033475.html
I had trouble with patching the middle file (though the changes there look to be
irrelevant to the solaris case anyway), but the bits I did apply successfully
(including the patches to the other two files) seem to fix the problem.
Comment 16 Ryan Novosielski (mail bounces back) 2003-12-26 14:12:12 UTC
I have duplicated this problem on HP-UX v11.11 with LPRng 3.8.21 (running the
newly released Samba 3.0.1; bugzilla would not allow me to change the Samba
version above). I get no output when viewing queues, although the logs say
things like this:

[2003/12/26 17:03:16, 3] printing/print_generic.c:print_run_command(84)
  Running the command `/opt/LPRng/bin/lpq -Pcpnoff' gave 0
[2003/12/26 17:03:16, 3] printing/printing.c:print_queue_update(1015)
  1 job in queue for cpnoff

...clearly showing 1 job was found. Occasionally Windows will display the job
for a second or less, but normally the queue looks empty (even though there are
several jobs in it).
Comment 17 Gerald (Jerry) Carter (dead mail address) 2004-01-14 09:25:02 UTC
Thanks to Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de> 
for finding the bug.  This is a byte order problem in printing.c
(which explains why people saw it on sparc solaris and HP-UX).

----------------------------------
(mail from Juergen):

I had problems similar to those in Bug #660 with printing on Solaris:

- uname -a
  SunOS bseis 5.8 Generic_108528-22 sun4u sparc SUNW,Sun-Fire-280R

- smbd -V
  Version 3.0.1

- From a level 5 Log:
  [2004/01/14 11:19:30, 5] ../source/printing/printing.c:get_stored_queue_info(2141)
    get_stored_queue_info: qcount = 16777216, extra_count = 0
  [2004/01/14 11:19:36, 0] ../source/lib/fault.c:fault_report(36)
    ===============================================================
  [2004/01/14 11:19:36, 0] ../source/lib/fault.c:fault_report(37)
    INTERNAL ERROR: Signal 11 in pid 1357 (3.0.1)
    Please read the appendix Bugs of the Samba HOWTO collection
  [2004/01/14 11:19:36, 0] ../source/lib/fault.c:fault_report(39)
    ===============================================================
  [2004/01/14 11:19:36, 0] ../source/lib/util.c:smb_panic(1400)
    PANIC: internal error

! 16777216 == 0x01000000, so it is a byte order problem.

This patch gets rid of this problem. Someone with more knowledge should look for
other places where data is copied straight out of a tdb instead of being read
with tdb_unpack().

--- source/printing/printing.c.DIST	2003-12-04 22:38:38.000000000 +0100
+++ source/printing/printing.c	2004-01-14 11:19:26.096621000 +0100
@@ -2126,8 +2126,6 @@
 	data = tdb_fetch(pdb->tdb, key);
 
-	if (data.dptr == NULL || data.dsize < 4)
-		qcount = 0;
-	else
-		memcpy(&qcount, data.dptr, 4);
+	if (data.dptr && data.dsize >= sizeof(qcount))
+		len += tdb_unpack(data.dptr + len, data.dsize - len, "d", &qcount);
 
 	/* Get the changed jobs list. */
@@ -2149,8 +2147,7 @@
 
 	/* Retrieve the linearised queue data. */
-	len = 0;
 	for( i  = 0; i < qcount; i++) {
 		uint32 qjob, qsize, qpage_count, qstatus, qpriority, qtime;
-		len += tdb_unpack(data.dptr + 4 + len, data.dsize - len, "ddddddff",
+		len += tdb_unpack(data.dptr + len, data.dsize - len, "ddddddff",
 				&qjob,
 				&qsize,
----------------------------------

Note that I've also found several more memcpy()'s in printing.c that 
will need to be fixed before 3.0.2rc1.
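
A small sketch of why one queued job can show up as qcount = 16777216 in the log above. It assumes the count is stored in the tdb record as a 4-byte little-endian integer (which is how the "d" format is understood to be packed; treat that as an assumption here). The helper names are hypothetical and are not Samba functions.

```c
#include <stdint.h>
#include <string.h>

/* What the pre-patch code did: copy the raw bytes into a host integer.
 * The result depends on the byte order of the machine running smbd. */
uint32_t read_count_memcpy(const unsigned char *buf)
{
	uint32_t v;
	memcpy(&v, buf, sizeof(v));
	return v;
}

/* Byte-order-independent decode of a little-endian 32-bit value,
 * which is the kind of decoding an unpack routine has to perform. */
uint32_t decode_le32(const unsigned char *buf)
{
	return  (uint32_t)buf[0]        | ((uint32_t)buf[1] << 8) |
	       ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
}

/* The value a big-endian host (e.g. SPARC) sees when it interprets
 * the same four bytes in its native order. */
uint32_t decode_be32(const unsigned char *buf)
{
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	       ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

For the stored bytes 01 00 00 00, decode_le32() gives 1 while decode_be32() gives 16777216 (0x01000000), matching the logged qcount. read_count_memcpy() returns whichever of the two corresponds to the host, which is why the problem only appeared on big-endian systems.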
Comment 18 Gerald (Jerry) Carter (dead mail address) 2004-01-14 11:10:51 UTC
Changes checked into 3.0 and HEAD.  Should be fixed in 3.0.2rc1
Comment 19 Gerald (Jerry) Carter (dead mail address) 2005-08-24 10:19:48 UTC
Sorry for the spam; cleaning up the database to prevent unnecessary reopens of bugs.