Bug 5267 - nmbd shuts down when network interfaces go down
Summary: nmbd shuts down when network interfaces go down
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: nmbd
Version: 3.0.28
Hardware: x86 Linux
Importance: P3 normal
Target Milestone: none
Assignee: Jeremy Allison
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-02-17 06:31 UTC by Sam Morris
Modified: 2008-08-18 08:19 UTC

See Also:


Attachments
Patch for 3.2 (3.88 KB, patch)
2008-03-06 18:34 UTC, Jeremy Allison
Patch for 3.0.28a (8.39 KB, patch)
2008-03-06 18:53 UTC, Jeremy Allison
Patch for 3.0.28a (3.78 KB, patch)
2008-03-06 18:57 UTC, Jeremy Allison

Description Sam Morris 2008-02-17 06:31:16 UTC
[forwarded from http://bugs.debian.org/433449]

My system's network connection is a wireless one, and whenever I bring
the interface down to change networks, or get the hardware's driver back
into an unconfused state (it's not the most stable piece of coding),
nmbd exits.

[2007/07/17 09:00:19, 0] nmbd/nmbd.c:reload_interfaces(229)
  reload_interfaces: No subnets to listen to. Shutting down...

It should just hang around, listening on localhost and/or 0.0.0.0 (all
interfaces) until another network interface comes back up. Currently, I
have to start it manually whenever it shuts down.
Comment 1 Jeremy Allison 2008-03-06 18:34:41 UTC
Created attachment 3166
Patch for 3.2

Here's a patch for 3.2. Can you test this - I'll also back-port for 3.0.28a. Please let me know if this works for you. Thanks.
Jeremy.
Comment 2 Jeremy Allison 2008-03-06 18:53:49 UTC
Created attachment 3167
Patch for 3.0.28a

Patch for 3.0.28a - please test !
Thanks,
Jeremy.
Comment 3 Jeremy Allison 2008-03-06 18:57:55 UTC
Created attachment 3168
Patch for 3.0.28a

Correct patch for 3.0.28a (nmbd change only).
Jeremy.
Comment 4 Jeremy Allison 2008-03-06 19:45:49 UTC
Should be fixed for 3.0.28a and above (git-pushed).
Jeremy.
Comment 5 Steve Langasek 2008-03-16 17:35:07 UTC
Hi Jeremy,

Unfortunately I don't think this issue is completely resolved, and in fact this patch introduces a regression in 3.0.28a.  If you use an 'interfaces' line in smb.conf, the interface goes away, and you kill -HUP nmbd to get it to reopen its logfiles, nmbd now goes into an infinite loop, spewing thousands of lines per second to the log:

[2008/03/16 15:33:16, 0] nmbd/nmbd.c:reload_interfaces(239)
  reload_interfaces: No subnets to listen to. Waiting..
[2008/03/16 15:33:16, 0] nmbd/nmbd.c:reload_interfaces(239)
  reload_interfaces: No subnets to listen to. Waiting..
[2008/03/16 15:33:16, 0] nmbd/nmbd.c:reload_interfaces(239)
  reload_interfaces: No subnets to listen to. Waiting..

and the process becomes unkillable with SIGTERM or SIGHUP, and never notices when the interface is back up.
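
For illustration only, here is a minimal sketch of a "no subnets" wait loop that avoids both symptoms described above: it pauses between polls instead of spinning, and it checks a termination flag on every pass so SIGTERM is still honoured. This is not the nmbd code or either of the attached patches. Every name in it (`wait_for_subnets`, `subnet_probe_fn`, `fake_probe`, `got_sig_term`, `install_term_handler`) is hypothetical, and `fake_probe` merely stands in for whatever interface scan the daemon really performs.

```c
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>

/* Set asynchronously by the SIGTERM handler, read by the wait loop. */
volatile sig_atomic_t got_sig_term = 0;

void sig_term_handler(int signum)
{
    (void)signum;
    got_sig_term = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_term_handler(void)
{
    return signal(SIGTERM, sig_term_handler) == SIG_ERR ? -1 : 0;
}

/* Hypothetical probe: returns the number of usable subnets found. */
typedef int (*subnet_probe_fn)(void);

/*
 * Poll until probe() finds at least one subnet.
 * Returns true when a subnet is available, false when termination was
 * requested or max_polls attempts were used up (0 means no limit).
 */
bool wait_for_subnets(subnet_probe_fn probe,
                      unsigned int interval_secs,
                      unsigned int max_polls)
{
    for (unsigned int polls = 0; max_polls == 0 || polls < max_polls; polls++) {
        if (got_sig_term) {
            return false;          /* stay killable while waiting */
        }
        if (probe() > 0) {
            return true;           /* an interface came back */
        }
        if (interval_secs > 0) {
            sleep(interval_secs);  /* rate-limit polling and logging */
        }
    }
    return false;
}

/* Stand-in probe for experiments: reports "down" for the next N calls. */
int fake_down_polls = 0;

int fake_probe(void)
{
    if (fake_down_polls > 0) {
        fake_down_polls--;
        return 0;
    }
    return 1;
}
```

A caller would run `install_term_handler()` once at startup and then, for example, `wait_for_subnets(fake_probe, 5, 0)` to poll every five seconds with no attempt limit. Because `sleep()` returns early when a handled signal is delivered, the flag is re-checked promptly rather than after the full interval.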
Comment 6 Jeremy Allison 2008-03-17 13:39:01 UTC
Hmmm. That's an unusual set of conditions to satisfy. I did test that this code correctly keeps nmbd running when the interfaces go away and come back, so if it's only a problem with doing SIGHUP when it's in the "no interfaces" state, that's not such a big deal (although annoying) and we'll just fix it for 3.2 official. Can you get me a little more info on when it gets into this state?
Jeremy.
Comment 7 Steve Langasek 2008-03-25 12:31:30 UTC
I'll grant you that it's not the usual circumstances, but it's also not a contrived scenario.

- Debian/Ubuntu ship a logrotate script for log.smbd and log.nmbd, because samba's built-in log rotation capabilities are inconsistent with the rest of the system (they support rotating by size, not by age AFAIK, and don't support keeping more than one old log)
- network connectivity can fail from time to time, sometimes spontaneously from the POV of the end user; or in the case of mobile devices, the user may have roamed off the network at the time the logrotate script fires
- then the user just has to have set the 'interfaces' line, and BAM!
Comment 8 Jeremy Allison 2008-03-25 12:44:54 UTC
No problem, I'm not refusing a fix, just saying it's not a panic/security fix situation. I'll fix this once I'm back at work (dealing with pneumonia at the moment).
Jeremy.
Comment 9 Jeremy Allison 2008-03-27 15:31:52 UTC
I can't reproduce the problem you are claiming. I've got the latest 3.0.x git code (same as 3.0.28a in nmbd), have brought up nmbd with a line :

interfaces = eth1 

on my laptop with wireless. I then start nmbd and then bring down the interface using ifconfig eth1 down. After 30 seconds nmbd goes into the state printing :

WARNING: no network interfaces found.

It prints out this message into the log file every 5 seconds until the interface comes back up.

I need to know *exactly* how you are reproducing your problem, and I also want you to use clean source code (ie. no debian patches).

Jeremy.
Comment 10 Jeremy Allison 2008-03-27 16:24:42 UTC
I reproduced and fixed the not terminating problem (pushed into 3.2 and 3.0.x git), but I still can't reproduce the loop you reported.
Jeremy.
Comment 11 Steve Langasek 2008-06-19 23:08:41 UTC
I'm not now able to reproduce the earlier looping problem here (sorry - no idea what would have changed), so I'm happy to consider this closed.
Comment 12 Jeremy Allison 2008-06-20 11:43:41 UTC
Submitter reports fixed.
Jeremy.
Comment 13 Steve Langasek 2008-07-23 14:18:28 UTC
The loop has happened to me again:

[2008/07/23 11:57:42, 2] nmbd/nmbd.c:reload_interfaces(193)
  reload_interfaces: Ignoring loopback interface 127.0.0.1
[2008/07/23 11:57:42, 0] nmbd/nmbd.c:reload_interfaces(239)
  reload_interfaces: No subnets to listen to. Waiting..
[2008/07/23 11:57:42, 2] nmbd/nmbd.c:reload_interfaces(193)
  reload_interfaces: Ignoring loopback interface 127.0.0.1
[2008/07/23 11:57:42, 0] nmbd/nmbd.c:reload_interfaces(239)
  reload_interfaces: No subnets to listen to. Waiting..
[2008/07/23 11:57:42, 2] nmbd/nmbd.c:reload_interfaces(193)
  reload_interfaces: Ignoring loopback interface 127.0.0.1
[2008/07/23 11:57:42, 0] nmbd/nmbd.c:reload_interfaces(239)
  reload_interfaces: No subnets to listen to. Waiting..
[2008/07/23 11:57:42, 2] nmbd/nmbd.c:reload_interfaces(193)
  reload_interfaces: Ignoring loopback interface 127.0.0.1
[2008/07/23 11:57:42, 0] nmbd/nmbd.c:reload_interfaces(239)

No idea how to reproduce this. :/

I currently have the following config options set:

        interfaces = lo, eth1
        dns proxy = No
        wins server = eth1:64.22.192.146, eth1:64.22.192.146

I of course have no idea when this started, since the log quickly overflows, so in the time since I started writing this report, /var/log/samba/log.nmbd.old has  been overwritten several times over.

This is using the Ubuntu Samba package, not the pristine upstream sources; however, the only patches applied to nmbd are to do with file paths, all of which have been included upstream in 3.2 with the exception of a path change in nmbd_serverlistdb.c.

Some info from attaching with gdb:

(gdb) bt
#0  0x0000000000495b37 in iface_count () at lib/interface.c:298
#1  0x0000000000427e0b in reload_interfaces (t=<value optimized out>)
    at nmbd/nmbd.c:248
#2  0x0000000000428ac0 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at nmbd/nmbd.c:584
(gdb) info locals
ret = <value optimized out>
i = (struct interface *) 0x7bc600
(gdb) print *i
$3 = {next = 0x0, prev = 0x0, ip = {s_addr = 16777343}, bcast = {
    s_addr = 4294967167}, nmask = {s_addr = 255}}
(gdb) fin
(gdb) bt
#0  0x0000000000427e0d in reload_interfaces (t=<value optimized out>)
    at nmbd/nmbd.c:248
#1  0x0000000000428ac0 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at nmbd/nmbd.c:584
(gdb) info locals
saved_handler = (void (*)(int)) 0x428e20 <sig_term>
n = <value optimized out>
subrec = (struct subnet_record *) 0x0
lastt = 1216795692
__FUNCTION__ = "reload_interfaces"
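
As a reading aid for the gdb output above: the `s_addr` fields are raw integers, and on a little-endian x86 host the first octet of the address sits in the low byte of the integer. The sketch below decodes them under that assumption; the helper name `format_le_s_addr` is made up for this example and is not part of Samba.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format an in_addr.s_addr value as gdb prints it on a little-endian host.
 * The address is stored in network byte order, so the first octet is the
 * low byte of the integer that gdb shows.
 */
void format_le_s_addr(uint32_t s_addr, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%u.%u.%u.%u",
             (unsigned)(s_addr & 0xff),
             (unsigned)((s_addr >> 8) & 0xff),
             (unsigned)((s_addr >> 16) & 0xff),
             (unsigned)((s_addr >> 24) & 0xff));
}
```

Decoded this way, `ip` (16777343) is 127.0.0.1, `bcast` (4294967167) is 127.255.255.255 and `nmask` (255) is 255.0.0.0. That matches the "Ignoring loopback interface 127.0.0.1" log lines, and with `next = 0x0` it suggests the loopback entry was the only interface in the list at the time.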
Comment 14 Ted Percival 2008-08-18 08:19:48 UTC
I opened bug #5697 to discuss the infinite loop that the fix for this bug apparently introduced. In particular I think it might happen when an interface is up but has no IPv4 address, as tends to be the case when using NetworkManager/wpasupplicant.