Bug 9112 - smbd.log is flooded by 'printer_list_get_printer: Failed to fetch record!'
Summary: smbd.log is flooded by 'printer_list_get_printer: Failed to fetch record!'
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.6
Classification: Unclassified
Component: Printing (show other bugs)
Version: 3.6.7
Hardware: x64 Linux
: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
: 9152 (view as bug list)
Depends on:
Blocks:
 
Reported: 2012-08-22 13:35 UTC by Marc Muehlfeld
Modified: 2012-10-12 07:37 UTC (History)
4 users (show)

See Also:


Attachments
Level 10 debug log (689.75 KB, application/x-bzip2)
2012-08-22 13:35 UTC, Marc Muehlfeld
no flags Details
gdb output (8.12 KB, text/plain)
2012-08-27 13:37 UTC, Marc Muehlfeld
no flags Details
v3-6-test patch (964 bytes, patch)
2012-08-28 12:19 UTC, Andreas Schneider
ddiss: review+
asn: review? (gd)
Details
Log level 10 message showing printer_list_get_printer problem (71.12 KB, text/plain)
2012-09-15 00:34 UTC, Joseph Tam
no flags Details
Bump printer_list_get_printer message to log level 3 (292 bytes, patch)
2012-09-19 22:34 UTC, Joseph Tam
no flags Details
Increase debug level of debug message. (976 bytes, patch)
2012-09-20 08:25 UTC, Andreas Schneider
gd: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Marc Muehlfeld 2012-08-22 13:35:54 UTC
Created attachment 7808 [details]
Level 10 debug log

We had running a printserver on 3.5.16 without problems. Then I switched to 3.6.7 (also happens with 3.6.6) and since that, the smbd.log is flooded with messages
[2012/08/22 15:24:44.017670,  1] printing/printer_list.c:94(printer_list_get_printer)
  Failed to fetch record!
This message is logged every minute.


When I start smbd, then the log is flooded with over 100 of the following:
[2012/08/22 15:21:24.057353,  0] rpc_client/cli_winreg_spoolss.c:241(winreg_printer_openkey)
  winreg_printer_openkey: Could not open HKLM hive: NT_STATUS_RPC_PROCNUM_OUT_OF_RANGE


I don't know if this matters: The s3 server is running in a test environment connected to an s4 beta6 DC (we plan to migrate to s4 in 3 weeks).


I attached a level 10 debug log of the first start of 3.6.7 (with the lock directory from 3.5.16).


Please let me know if you need any further files/debug logs.
Comment 1 Marc Muehlfeld 2012-08-23 18:50:34 UTC
I saw, that it's not just the smbd.log. The message

[2012/08/23 16:07:33.258953,  1] printing/printer_list.c:94(printer_list_get_printer)
  Failed to fetch record!

is also logged every minute in every machine logfile (log file = /var/log/samba/%m.log). What causes really logfile flooding (i have "log level = 1" in my smb.conf).
Comment 2 Marc Muehlfeld 2012-08-26 19:15:29 UTC
The "Failed to fetch record!" messages every minute in every machine logfile also appears in our production environment (s3 only) after I upgraded our 3.5.17 file-/printserver to 3.6.7.
Comment 3 Andreas Schneider 2012-08-27 09:14:35 UTC
You don't use CUPS right?
Comment 4 Marc Muehlfeld 2012-08-27 10:05:03 UTC
We use LPRng 3.8.Arc4 (self compiled) on Scientific Linux 6.2 (RHEL clone).
Comment 5 Andreas Schneider 2012-08-27 12:35:40 UTC
Could you attach gdb, set a breakpoint in printer_list_get_printer() an get a backtrace. I think we call pcap_printername_ok() at a wrong location.
Comment 6 Marc Muehlfeld 2012-08-27 12:51:31 UTC
Can you please tell me, how I can create it for you? Sorry. I'm not a developer.
Comment 7 Andreas Schneider 2012-08-27 13:08:31 UTC
If you have 'debug pid = yes' in your smb.conf file, you will find the pid of the process in the log file.

So search for the

printing/printer_list.c:94(printer_list_get_printer)
  Failed to fetch record!

and in the line above you should see the pid. Then do

gdb -p <pid>
break printer_list_get_printer
continue

As soon as printer_list_get_printer() is called it will stop, then type

backtrace full

and post the output here. You should do this with debug symbols :)

quit

to leave gdb
Comment 8 Marc Muehlfeld 2012-08-27 13:37:34 UTC
Created attachment 7836 [details]
gdb output

I recompiled with --enable-developer and --enable-debug and followed your steps. Hope this configure options bring the debug symbols.

Find the gdb output attached.
Comment 9 Andreas Schneider 2012-08-28 10:37:06 UTC
This isn't the output I wanted. After attaching gdb and setting the breakpoint (was setting the breakpoint successfull?) did you type 'continue'?

When you attach gdb the program is halted. You need to type continue start it resumes and can hit the breakpoint.

You can also find me in #samba-technical on freenode as gladiac if you need more help.
Comment 10 Andreas Schneider 2012-08-28 12:19:02 UTC
Created attachment 7838 [details]
v3-6-test patch
Comment 11 David Disseldorp 2012-08-29 10:22:39 UTC
Comment on attachment 7838 [details]
v3-6-test patch

Looks good Andreas.
Comment 12 Andreas Schneider 2012-08-29 10:37:01 UTC
Karolin, could you please add this to the next 3.6 release? Thanks!
Comment 13 Andreas Schneider 2012-08-30 07:26:52 UTC
Oh, we did further debugging in a query and Marc tested the patch and confirmed that it is working.
Comment 14 Karolin Seeger 2012-08-31 06:50:28 UTC
Pushed to v3-6-test.
Closing out bug report.

Thanks!
Comment 15 Andreas Schneider 2012-09-11 12:50:07 UTC
*** Bug 9152 has been marked as a duplicate of this bug. ***
Comment 16 Joseph Tam 2012-09-15 00:30:58 UTC
I've applied the patch and it has not seemed to work.  I still get reams of "Failed to fetch record!" messages.  I've supplied log level 10  entries between successive "Failed to fetch record!" log entries so hopefully that will give you enough context to figure out the problem.
Comment 17 Joseph Tam 2012-09-15 00:34:19 UTC
Created attachment 7897 [details]
Log level 10 message showing printer_list_get_printer problem
Comment 18 Andreas Schneider 2012-09-16 18:10:00 UTC
The message is not a bug! We should probably increase just the debug level for that message.
Comment 19 Joseph Tam 2012-09-17 04:37:34 UTC
Thanks, that would work for me.  The sheer volume of these log messages would mislead you into believing something is very wrong.  What are these log entries trying tell to me anyways?
Comment 20 Andreas Schneider 2012-09-17 05:38:31 UTC
It tells you that the printcap cache is empty.
Comment 21 Joseph Tam 2012-09-19 22:34:52 UTC
Created attachment 7909 [details]
Bump printer_list_get_printer message to log level 3


A trivial patch to implement the above suggestion (I don't know what the appropriate log level should be, so I chose 3).  This certainly has calmed down samba syslog entries as this hourly histogram shows

09/18 05:00 [  961] xx
09/18 06:00 [  961] xx
09/18 07:00 [ 2755] xxxxxx
09/18 08:00 [ 1154] xx
09/18 09:00 [ 8720] xxxxxxxxxxxxxxxxxxx
09/18 10:00 [12251] xxxxxxxxxxxxxxxxxxxxxxxxxx
09/18 11:00 [ 3187] xxxxxx
09/18 12:00 [ 9049] xxxxxxxxxxxxxxxxxxx
09/18 13:00 [ 6830] xxxxxxxxxxxxxx
09/18 14:00 [ 6403] xxxxxxxxxxxxx
09/18 15:00 [12913] xxxxxxxxxxxxxxxxxxxxxxxxxxxx
09/18 16:00 [ 5270] xxxxxxxxxxx
09/18 17:00 [ 1149] xx
09/18 18:00 [ 1689] xxx
09/18 19:00 [ 1713] xxx
09/18 20:00 [  822] x
09/18 21:00 [  642] x
09/18 22:00 [ 3423] xxxxxxx
09/18 23:00 [  887] x
09/19 00:00 [    0] 
<patched during the following hour>
09/19 01:00 [   21] 
09/19 02:00 [    3] 
09/19 03:00 [    0] 
09/19 04:00 [    0] 
09/19 05:00 [    0]
Comment 22 Andreas Schneider 2012-09-20 08:25:07 UTC
Created attachment 7910 [details]
Increase debug level of debug message.

Followup patch to silence the log by increasing the debug level of the message telling you that the database doesn't contain a printer.
Comment 23 Guenther Deschner 2012-09-20 10:07:42 UTC
Comment on attachment 7910 [details]
Increase debug level of debug message.

looks good.
Comment 24 Andreas Schneider 2012-10-11 12:54:48 UTC
Karolin, could you please add the last patch which increases the debuglevel too. Thanks!
Comment 25 Karolin Seeger 2012-10-12 07:37:07 UTC
Pushed to v3-6-test and cherry-picked from master to autobuild-v4-0-test.
Closing out bug report.

Thanks!