Bug 9126 - Samba 3.6.5 smbd crash on AIX if any printers are defined
Status: NEW
Product: Samba 3.6
Classification: Unclassified
Component: Printing
Version: 3.6.5
Hardware: All AIX
Importance: P5 normal
Target Milestone: ---
Assigned To: printing-maintainers
QA Contact: Samba QA Contact
Depends on:
Blocks:
Reported: 2012-08-29 18:21 UTC by Ben Lentz
Modified: 2012-10-12 14:48 UTC (History)

Attachments
the full backtrace (134.48 KB, text/plain)
2012-08-30 04:58 UTC, Ben Lentz
the full backtrace (1.09 KB, text/plain)
2012-08-30 05:05 UTC, Ben Lentz

Description Ben Lentz 2012-08-29 18:21:05 UTC
We are using samba version 3.6.5 compiled using gcc on the AIX platform.

We're finding that, on an AIX system (AIX version 6.1, Technology Level 6, Service Pack 8), the smbd daemon crashes if any printers are defined at the operating system level.

If we remove all printers from the AIX system, smbd starts and runs fine.

As soon as we define one printer, smbd crashes and dumps core.

Here's smbd -FSd 10 output:

regdb_close: decrementing refcount (3->2)
     winreg_CloseKey: struct winreg_CloseKey
        out: struct winreg_CloseKey
            handle                   : *
                handle: struct policy_handle
                    handle_type              : 0x00000000 (0)
                    uuid                     : 00000000-0000-0000-0000-000000000000
            result                   : WERR_OK
reloading printcap cache
Locking key 5052494E5445524C4953
Allocated locked data 0x20364ed8
Unlocking key 5052494E5445524C4953
reloading aix printcap cache
===============================================================
INTERNAL ERROR: Signal 11 in pid 17891328 (3.6.5)
Please read the Trouble-Shooting section of the Samba3-HOWTO

From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
===============================================================
PANIC (pid 17891328): internal error
unable to produce a stack trace on this platform
dumping core in /opt/local/samba/var/log/cores/smbd

Here's a dbx stack trace:

$ sudo dbx /opt/local/samba/sbin/smbd /opt/local/samba/var/log/cores/smbd/core
Type 'help' for help.
[using memory image in /opt/local/samba/var/log/cores/smbd/core]
reading symbolic information ...warning: no source compiled with -g


IOT/Abort trap in pthread_kill at 0xd0507980 ($t1)
0xd0507980 (pthread_kill+0xa0) 80410014         lwz   r2,0x14(r1)
(dbx) where
pthread_kill(??, ??) at 0xd0507980
_p_raise(??) at 0xd0506de8
raise.raise(??) at 0xd0137280
abort() at 0xd01c3dc4
dump_core() at 0x1075cfbc
smb_panic(??) at 0x10008c64
sig_fault(??) at 0x1075d58c
strlen() at 0xd011d800
tdb_pack_va(??, ??, ??, ??) at 0x100a5984
tdb_pack(??, ??, ??) at 0x100a5b10
printer_list_set_printer(??, ??, ??, ??, ??) at 0x1051a740
pcap_cache_add(??, ??, ??) at 0x105196bc
aix_cache_reload() at 0x1051af78
pcap_cache_reload(??, ??, ??) at 0x1051941c
main(??, ??) at 0x10002468
(dbx)

Are there any quick fixes or workarounds for this bug? We do not use any of the Samba printing functionality (we use only the file server); however, we need to be able to define and use AIX print queues in order for our system to be functional.
Comment 1 Ben Lentz 2012-08-29 18:53:34 UTC
According to the man page for smb.conf, in the "printcap name" section:

                  Note
                  Under AIX the default printcap name is /etc/qconfig.
                  Samba will assume the file is in AIX qconfig format
                  if the string qconfig appears in the printcap
                  filename.

If I take one of my problematic systems and set:

printcap name = /etc/qconfig

... it will crash.

If I put in no "printcap name" line at all on my AIX system

... it will crash.

If I put in:

printcap name = /dev/null

... it runs fine.

If I put in:

printcap name = lpstat

... it runs fine.

So... I'm guessing there is a parsing bug in samba 3.6.5 on AIX where the /etc/qconfig file causes smbd to crash.
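
The man page rule quoted above amounts to a substring test on the printcap name, which would explain why /dev/null and lpstat avoid the AIX code path while /etc/qconfig (and the AIX default) do not. A minimal sketch in C, assuming a hypothetical helper looks_like_qconfig(); the actual check inside Samba may differ in detail:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Samba's code: returns true when the
 * configured printcap name would be treated as an AIX qconfig file
 * under the rule quoted from the man page (the string "qconfig"
 * appears somewhere in the file name). */
static bool looks_like_qconfig(const char *printcap_name)
{
    if (printcap_name == NULL) {
        return false;
    }
    return strstr(printcap_name, "qconfig") != NULL;
}
```

With this rule, "/etc/qconfig" matches, while "/dev/null" and "lpstat" do not, which lines up with the four test results listed above.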
Comment 2 Ben Lentz 2012-08-29 21:59:01 UTC
It seems to be crashing in the printer_list_set_printer() function in ./printing/printer_list.c, around the call to tdb_pack. Here is the code with a debug printf added, followed by the resulting output:

########################

        time_64 = last_refresh;
        time_l = time_64 & 0xFFFFFFFFL;
        time_h = time_64 >> 32;
        len = tdb_pack(NULL, 0, PL_DATA_FORMAT, time_h, time_l, name, str, str2);

printf("a\n");
        data.dptr = talloc_array(key, uint8_t, len);
        if (!data.dptr) {
                DEBUG(0, ("Failed to allocate tdb data buffer!\n"));
                status = NT_STATUS_NO_MEMORY;
                goto done;
        }
        data.dsize = len;

########################

Allocated locked data 0x200d8848
Unlocking key 5052494E5445524C4953542F474C4F42414C2F4C4153545F5245465245534800
reloading aix printcap cache
case 0
name: poascii_dev
case 1
stuff
things
pcap_cache_add(poascii_dev, NULL, NULL)
tdb_pack_va(ddPPP, 0) -> 22
a
===============================================================
INTERNAL ERROR: Signal 11 in pid 237576 (3.6.5)
Please read the Trouble-Shooting section of the Samba3-HOWTO

From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
===============================================================
PANIC (pid 237576): internal error
unable to produce a stack trace on this platform
dumping core in /opt/local/samba/var/log/cores/smbd
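
The "tdb_pack_va(ddPPP, 0) -> 22" line in the output above is consistent with a sizing pass in which both optional strings were empty rather than NULL: two 4-byte integers (8), "poascii_dev" plus its terminator (12), and two empty strings (1 each). The sketch below reproduces that arithmetic. It is a simplified stand-in, not Samba's tdb_pack_va(), and it assumes that 'd' packs as 4 bytes and 'P' as a NUL-terminated string; the name packed_len() is made up for illustration.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the sizing pass of a tdb_pack-style packer.
 * Assumptions: 'd' is a 4-byte integer and 'P' is a NUL-terminated
 * string stored with its terminator. Any other format letter makes
 * the function return 0. */
static size_t packed_len(const char *fmt, ...)
{
    va_list ap;
    size_t len = 0;

    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        switch (*fmt) {
        case 'd':
            (void)va_arg(ap, unsigned int);
            len += 4;
            break;
        case 'P': {
            /* The caller must pass a valid string here:
             * strlen() on a NULL pointer is undefined. */
            const char *s = va_arg(ap, char *);
            len += strlen(s) + 1;
            break;
        }
        default:
            va_end(ap);
            return 0;
        }
    }
    va_end(ap);
    return len;
}
```

Calling packed_len("ddPPP", 0u, 0u, "poascii_dev", "", "") gives 22, matching the logged value.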
Comment 3 David Disseldorp 2012-08-29 22:35:24 UTC
Reminds me of https://bugzilla.samba.org/show_bug.cgi?id=8762 , but your version should be fixed.
Please install debug symbols and attach a full backtrace. I'd like to see the values of name, str and str2.
Comment 4 Ben Lentz 2012-08-30 04:17:45 UTC
What do you mean by "install debug symbols" and "full backtrace"? dbx - where is the only way I've ever done a backtrace on the AIX platform, and unlike Linux systems, we don't have a debug package since this was compiled from source.

I'm going to assume you want it compiled with -g and rebuild.
Comment 5 Ben Lentz 2012-08-30 04:38:31 UTC
Here's the same dbx output, with AIX fullcore enabled, no core ulimits for the root user, and with CC="gcc -g" at compile-time.

$ sudo dbx /opt/local/samba/sbin/smbd /opt/local/samba/var/log/cores/smbd/core
Type 'help' for help.
[using memory image in /opt/local/samba/var/log/cores/smbd/core]
reading symbolic information ...

IOT/Abort trap in pthread_kill at 0xd01250d4 ($t1)
0xd01250d4 (pthread_kill+0x88) 80410014         lwz   r2,0x14(r1)
(dbx) where
pthread_kill(??, ??) at 0xd01250d4
_p_raise(??) at 0xd0124b44
raise.raise(??) at 0xd038c1e8
abort.abort() at 0xd0401210
dump_core() at 0x1075d3b0
smb_panic(??) at 0x10008c8c
sig_fault(??) at 0x1075d980
noname.strlen() at 0xd0371400
tdb_pack_va(??, ??, ??, ??) at 0x100a5cac
tdb_pack(??, ??, ??) at 0x100a5e38
printer_list_set_printer(??, ??, ??, ??, ??) at 0x1051aa74
pcap_cache_add(??, ??, ??) at 0x105199e4
aix_cache_reload() at 0x1051b42c
pcap_cache_reload(??, ??, ??) at 0x10519744
main(??, ??) at 0x10002490
Comment 6 Ben Lentz 2012-08-30 04:46:21 UTC
Variables name, str and str2 are assigned "poascii_dev", NULL, and NULL respectively.
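
Since the fault is raised inside strlen() and two of the three string arguments arrive as NULL, one thing worth ruling out is a NULL (or otherwise invalid) pointer reaching the packer: calling strlen() on a NULL pointer is undefined behaviour in C. Below is a minimal sketch of the usual guard, using hypothetical helper names that are not part of Samba. Whether this is the actual cause here is not established.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map a NULL optional field to an empty string so
 * that later strlen()-based code never sees a NULL pointer. */
static const char *str_or_empty(const char *s)
{
    return (s != NULL) ? s : "";
}

/* Bytes needed to store one NUL-terminated field, NULL-safe. */
static size_t field_len(const char *s)
{
    return strlen(str_or_empty(s)) + 1;
}
```

A guard like this only covers NULL; it cannot detect a non-NULL pointer that refers to freed or uninitialised memory.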
Comment 7 Ben Lentz 2012-08-30 04:58:15 UTC
Created attachment 7849 [details]
the full backtrace
Comment 8 Ben Lentz 2012-08-30 05:05:34 UTC
Created attachment 7850 [details]
the full backtrace
Comment 9 Andreas Schneider 2012-10-11 12:44:11 UTC
Well, location and comment are set to NULL, but in printer_list_set_printer() the location is pointing to invalid memory. This looks like memory corruption. Is valgrind available for AIX?
Comment 10 Ben Lentz 2012-10-12 01:34:41 UTC
According to http://valgrind.org/info/platforms.html...

"There are many platforms not mentioned here. Some are of little interest (eg. SPARC/*, */AIX)."

... I'm guessing it's not?
Comment 11 Andreas Schneider 2012-10-12 06:38:41 UTC
Is there another memchecker available for AIX?
Comment 12 Andreas Schneider 2012-10-12 06:46:34 UTC
What does /etc/qconfig look like?
Comment 13 Ben Lentz 2012-10-12 14:48:17 UTC
I'm sorry, I don't know what a memchecker is :-( My role has typically been as a sysadmin, not a developer.

The /etc/qconfig file on the affected system is mostly comments (lines prefixed with '*') but the one printer we have defined looks like this:

poascii_dev:
        device = @optmsfaxdev01v
        up = TRUE
        host = optmsfaxdev01v
        s_statfilter = /usr/lib/lpd/aixshort
        l_statfilter = /usr/lib/lpd/aixlong
        rq = LawsonPO
@optmsfaxdev01v:
        backend = /usr/lib/lpd/rembak

HTH
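
For anyone trying to reproduce this without an AIX box, the layout described above (comment lines starting with '*', unindented stanza headers ending in ':', indented attribute lines) can be scanned with a few lines of C. This is a hypothetical sketch based only on the sample shown, not Samba's AIX printcap parser; the function name and the 64-byte name limit are assumptions, and it lists every stanza header without distinguishing queue stanzas from device stanzas.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not Samba's AIX parser. Scans qconfig-style text
 * and copies stanza header names (unindented lines ending in ':') into
 * names[]. Lines starting with '*' are treated as comments. Returns the
 * number of names stored (at most max_names). */
static size_t qconfig_stanza_names(const char *text,
                                   char names[][64], size_t max_names)
{
    size_t count = 0;

    while (*text != '\0' && count < max_names) {
        const char *eol = strchr(text, '\n');
        size_t len = (eol != NULL) ? (size_t)(eol - text) : strlen(text);

        /* Trim trailing whitespace (including '\r'). */
        while (len > 0 && isspace((unsigned char)text[len - 1])) {
            len--;
        }

        /* A header is unindented, is not a comment, ends in ':' and
         * has a name short enough to fit with its terminator. */
        if (len > 1 && text[0] != '*' &&
            !isspace((unsigned char)text[0]) &&
            text[len - 1] == ':' && len - 1 < 64) {
            memcpy(names[count], text, len - 1);
            names[count][len - 1] = '\0';
            count++;
        }

        if (eol == NULL) {
            break;
        }
        text = eol + 1;
    }
    return count;
}
```

Run against the stanza shown above, this yields two names: poascii_dev and @optmsfaxdev01v.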