Bug 6696 - smbd 3.3.7 crashes (signal 11) in dns_register_smbd_reply
Status: RESOLVED FIXED
Product: Samba 3.3
Classification: Unclassified
Component: File services
Version: 3.3.7
Hardware: Other Linux
Importance: P3 regression
Assigned To: Jeremy Allison
QA Contact: Samba QA Contact
Duplicates: 6734

Reported: 2009-09-06 23:07 UTC by Timothy Miller
Modified: 2010-01-11 05:35 UTC (History)

Attachments
Patch in git-format-patch form (737 bytes, patch)
2009-09-07 05:05 UTC, Volker Lendecke
obnox: review+
Second part of fix for 3.3.x. (1.08 KB, patch)
2009-09-08 19:25 UTC, Jeremy Allison
vl: review+

Description Timothy Miller 2009-09-06 23:07:14 UTC
I believe I have found a bug in smbd.  I've never experienced this before, nor has anyone else, but looking at the source code (see below), I'm really surprised this hasn't arisen before.

Since I'm using this on Gentoo, I've filed a similar report there:
http://bugs.gentoo.org/show_bug.cgi?id=283919

I've also been discussing this on the mailing list:
http://lists.samba.org/archive/samba/2009-September/150351.html


smbd crashes with signal 11 the moment a client attempts to connect.  Here's the relevant bit of the log:

[2009/09/06 22:24:44,  0] smbd/server.c:main(1274)
 smbd version 3.3.7 started.
 Copyright Andrew Tridgell and the Samba Team 1992-2009
[2009/09/06 22:24:44,  0] printing/print_cups.c:cups_connect(103)
 Unable to connect to CUPS server /var/run/cups/cups.sock:631 - No such file or directory
[2009/09/06 22:24:44,  0] printing/print_cups.c:cups_connect(103)
 Unable to connect to CUPS server /var/run/cups/cups.sock:631 - No such file or directory
[2009/09/06 22:26:09,  0] smbd/server.c:main(1274)
 smbd version 3.3.7 started.
 Copyright Andrew Tridgell and the Samba Team 1992-2009
[2009/09/06 22:26:09,  0] printing/print_cups.c:cups_connect(103)
 Unable to connect to CUPS server /var/run/cups/cups.sock:631 - No such file or directory
[2009/09/06 22:26:09,  0] printing/print_cups.c:cups_connect(103)
 Unable to connect to CUPS server /var/run/cups/cups.sock:631 - No such file or directory
[2009/09/06 22:26:43,  0] lib/fault.c:fault_report(40)
 ===============================================================
[2009/09/06 22:26:43,  0] lib/fault.c:fault_report(41)
 INTERNAL ERROR: Signal 11 in pid 16066 (3.3.7)
 Please read the Trouble-Shooting section of the Samba3-HOWTO
[2009/09/06 22:26:43,  0] lib/fault.c:fault_report(43)

 From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
[2009/09/06 22:26:43,  0] lib/fault.c:fault_report(44)
 ===============================================================
[2009/09/06 22:26:43,  0] lib/util.c:smb_panic(1673)
 PANIC (pid 16066): internal error
[2009/09/06 22:26:43,  0] lib/util.c:log_stack_trace(1777)
 BACKTRACE: 8 stack frames:
  #0 /usr/sbin/smbd(log_stack_trace+0x1c) [0x7f4fdfff6b10]
  #1 /usr/sbin/smbd(smb_panic+0x5b) [0x7f4fdfff6c1d]
  #2 /usr/sbin/smbd [0x7f4fdffe3e71]
  #3 /lib/libpthread.so.0 [0x7f4fde09bef0]
  #4 /usr/sbin/smbd(dns_register_smbd_reply+0x1c) [0x7f4fdfe59e3b]
  #5 /usr/sbin/smbd(main+0x16e8) [0x7f4fe01f05cc]
  #6 /lib/libc.so.6(__libc_start_main+0xe6) [0x7f4fdca49a26]
  #7 /usr/sbin/smbd [0x7f4fdfde1339]
[2009/09/06 22:26:43,  0] lib/fault.c:dump_core(231)
 dumping core in /var/log/samba/cores/smbd


Note that I fixed the CUPS issue, but that didn't help.


Using gdb, this is the stack trace:

#0  0x00007f4fdca5d645 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
       in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) where
#0  0x00007f4fdca5d645 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f4fdca5eb63 in abort () at abort.c:88
#2  0x00007f4fdffe38db in dump_core () at lib/fault.c:242
#3  0x00007f4fdfff6d3b in smb_panic (why=<value optimized out>) at lib/util.c:1689
#4  0x00007f4fdffe3e71 in sig_fault (sig=11) at lib/fault.c:46
#5  <signal handler called>
#6  dns_register_smbd_reply (dns_state=0x0, lfds=0x7ffff4963ed0, timeout=0x7ffff4964060) at smbd/dnsregister.c:171
#7  0x00007f4fe01f05cc in main (argc=<value optimized out>, argv=<value optimized out>) at smbd/server.c:689


To make a long story short, in server.c, there's this code:

static bool open_sockets_smbd(bool is_daemon, bool interactive, const char *smb_ports)
{
...
       struct dns_reg_state * dns_reg = NULL;

... nothing that modifies dns_reg ...

               /* process pending nDNS responses */
               if (dns_register_smbd_reply(dns_reg, &r_fds, &idle_timeout)) {
                       --num;
               }
...
}


Then the function dns_register_smbd_reply (dnsregister.c) blindly dereferences
the first argument, which the gdb trace above shows is NULL (dns_state=0x0):

bool dns_register_smbd_reply(struct dns_reg_state *dns_state,
               fd_set *lfds, struct timeval *timeout)
{
       int mdnsd_conn_fd = -1;

       if (dns_state->srv_ref == NULL) {
               return false;
       }
...
}


I was hoping someone could help me figure out why this is happening now and didn't before, so I can work around it.  

Thanks.
Comment 1 Timothy Miller 2009-09-06 23:49:38 UTC
This "patch" solves the problem:

bool dns_register_smbd_reply(struct dns_reg_state *dns_state,
                fd_set *lfds, struct timeval *timeout)
{
        int mdnsd_conn_fd = -1;

+        if (!dns_state) return false;
        if (dns_state->srv_ref == NULL) {
                return false;
        }
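
A minimal, self-contained sketch of the same guard, for anyone who wants to try the pattern outside of smbd. This is not the Samba source: struct dns_reg_state is reduced to the one field the check reads, and reply_pending and count_pending are hypothetical names standing in for the real function, whose fd_set and timeout arguments are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Samba's struct dns_reg_state; only the
 * field that the guard inspects is modelled here. */
struct dns_reg_state {
    void *srv_ref; /* NULL until a service registration exists */
};

/* Simplified reply handler (assumed name). The pointer itself is
 * checked before any member is read. */
bool reply_pending(const struct dns_reg_state *dns_state)
{
    if (dns_state == NULL) {
        return false;
    }
    if (dns_state->srv_ref == NULL) {
        return false;
    }
    return true; /* placeholder for "a response was processed" */
}

/* Exercises three cases: no state at all, state without a
 * registration, and state with a registration. Returns how many of
 * them reported a pending reply. */
int count_pending(void)
{
    int dummy = 0; /* stands in for a live service reference */
    struct dns_reg_state unregistered = { NULL };
    struct dns_reg_state registered = { &dummy };
    int hits = 0;

    /* The case from the backtrace: dns_state=0x0 must not crash. */
    assert(!reply_pending(NULL));

    hits += reply_pending(NULL) ? 1 : 0;
    hits += reply_pending(&unregistered) ? 1 : 0;
    hits += reply_pending(&registered) ? 1 : 0;
    return hits;
}
```

Calling count_pending() from a small main() should return 1: only the state with a non-NULL srv_ref counts as pending, and the NULL pointer is rejected instead of being dereferenced.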

Comment 2 Volker Lendecke 2009-09-07 05:05:07 UTC
Created attachment 4657 [details]
Patch in git-format-patch form

Thanks a lot for that patch! I never got around to diagnosing this; I always recommend using the native avahi support.

Volker
Comment 3 Michael Adam 2009-09-07 05:11:23 UTC
Comment on attachment 4657 [details]
Patch in git-format-patch form

The patch is obviously correct. => ACK
Comment 4 Michael Adam 2009-09-07 05:15:14 UTC
Karolin, please pick this for the next 3.3 bugfix release.
Cheers - Michael
Comment 5 Timothy Miller 2009-09-07 13:13:25 UTC
BTW, should we try to figure out why samba, for me, suddenly started taking this code path when apparently it never had before?  This could uncover another bug.
Comment 6 Karolin Seeger 2009-09-08 02:55:17 UTC
Pushed to v3-3-test and v3-4-test (will be included in 3.4.1).
Comment 7 Karolin Seeger 2009-09-08 02:56:58 UTC
(In reply to comment #5)
> BTW, should we try to figure out why samba, for me, suddenly started taking
> this code path when apparently it never had before?  This could uncover another
> bug.

Jeremy, should anyone investigate?
If not, the bug report can be closed as the patch has been pushed.

Thanks!
Comment 8 Jeremy Allison 2009-09-08 18:54:07 UTC
The patch is correct but essentially disables the mDNS code. There is a missing chunk of code from 3.2.x that should be in the 3.3.x main loop.
I'll add this code back in.
Jeremy.
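
A rough illustration of the point above, not the actual 3.2.x code (which this report does not show). If nothing in the loop ever populates the state pointer, a NULL guard makes the reply handler return false on every pass, so mDNS replies are never processed. All names here (ensure_registered, reply_pending, run_loop) are hypothetical, and the registration step is a toy that only sets a placeholder reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Samba's struct dns_reg_state. */
struct dns_reg_state {
    void *srv_ref; /* NULL until a service registration exists */
};

/* Toy registration step (assumed name): allocates the state on first
 * use and attaches a placeholder service reference. */
bool ensure_registered(struct dns_reg_state **state)
{
    static int service_handle; /* placeholder, not a real mDNS handle */

    if (*state == NULL) {
        *state = calloc(1, sizeof(**state));
        if (*state == NULL) {
            return false;
        }
    }
    if ((*state)->srv_ref == NULL) {
        (*state)->srv_ref = &service_handle;
    }
    return true;
}

/* Guarded reply handler: true only when live state exists. */
bool reply_pending(const struct dns_reg_state *state)
{
    return state != NULL && state->srv_ref != NULL;
}

/* Runs a toy main loop and counts the passes on which the reply
 * handler had live state to work with. */
int run_loop(int iterations, bool with_registration)
{
    struct dns_reg_state *dns_reg = NULL;
    int handled = 0;

    assert(iterations >= 0);
    for (int i = 0; i < iterations; i++) {
        if (with_registration) {
            ensure_registered(&dns_reg);
        }
        if (reply_pending(dns_reg)) {
            handled++;
        }
    }
    free(dns_reg); /* free(NULL) is a no-op */
    return handled;
}
```

With this sketch, run_loop(3, false) returns 0 (the guard-only situation: no crash, but nothing is ever handled), while run_loop(3, true) returns 3 once a registration step runs before the reply check.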
Comment 9 Jeremy Allison 2009-09-08 19:25:18 UTC
Created attachment 4665 [details]
Second part of fix for 3.3.x.

Fix in git-format-patch format.
Jeremy.
Comment 10 Timothy Miller 2009-09-08 20:05:47 UTC
(In reply to comment #8)
> The patch is correct but essentially disables the mDNS code. There is a missing
> chunk of code from 3.2.x that should be in the 3.3.x main loop.
> I'll add this code back in.
> Jeremy.
> 

Does this partially answer why I'm the only one to experience this problem?
Comment 11 Björn Jacke 2009-09-17 12:36:43 UTC
*** Bug 6734 has been marked as a duplicate of this bug. ***
Comment 12 Karolin Seeger 2009-09-29 03:10:59 UTC
Is there a chance to get the second patch into 3.3.8, too?
Comment 13 Volker Lendecke 2009-09-29 06:56:37 UTC
Well, I don't really know. I don't have the environment at hand to actually test this. Someone from Apple (James?) needs to ack it, everybody else should go with the native avahi interface.

Volker
Comment 14 Alexey Shildyakov 2009-10-05 08:26:07 UTC
Still present in Samba 3.3.8.
Comment 15 Karolin Seeger 2009-10-08 04:17:34 UTC
What shall we do with this one?
Lowering severity?
Comment 16 Volker Lendecke 2009-10-08 04:36:37 UTC
Well, I'd say just put it in. It can't get worse than it is now. If Apple does not react, there's not much we can do.

Volker
Comment 17 Volker Lendecke 2009-10-08 05:44:55 UTC
Comment on attachment 4665 [details]
Second part of fix for 3.3.x.

Just put it in.

Volker
Comment 18 Karolin Seeger 2010-01-11 05:35:23 UTC
Pushed both patches to v3-3-test. Will be included in Samba 3.3.10.
Closing out bug report.

Please reopen, if it's still an issue.

Thanks!