Bug 13620 - Network Meltdown after Samba 4.9.0 Upgrade
Summary: Network Meltdown after Samba 4.9.0 Upgrade
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.9.0
Hardware: All All
: P5 regression (vote)
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-15 13:57 UTC by Reuben Farrelly
Modified: 2018-09-24 07:25 UTC (History)
5 users (show)

See Also:


Attachments
proposed patch for master (1.20 KB, patch)
2018-09-15 14:38 UTC, Andrew Bartlett
no flags Details
git-am fix for master. (1.19 KB, patch)
2018-09-17 19:54 UTC, Jeremy Allison
jra: review? (abartlet)
asn: review+
Details
patch cherry-picked from master (1.32 KB, patch)
2018-09-19 23:58 UTC, Andrew Bartlett
jra: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Reuben Farrelly 2018-09-15 13:57:22 UTC
This bug report is to track the fix for an issue I reported on the Samba mailing list:

https://lists.samba.org/archive/samba/2018-September/218233.html

In short:

===

Last night I attempted to upgrade from Samba 4.8.5 to 4.9.0, with disastrous results.  Upon starting Samba 4.9.0 my entire network came to a screaming halt a few seconds later, and upon shutting Samba down it came back to life again.

Just to be sure this wasn't a coincidence, I then started Samba again. Once again all connectivity stopped, but came back as soon as I was able to shut down Samba.

Network switches were all logging that they were shutting down physical ports due excessive numbers of broadcast packets being seen, and a Wireshark capture from my PC verified that indeed there really was a broadcast storm happening that was triggering this.

The capture showed that upon startup Samba 4.9.0 was sending thousands and thousands of broadcast packets onto the wire in very quick succession.  Wireshark counted around 6500 broadcasts in about 300ms. The packets are all Host Announcement packets sent from the IPv4 address of the host to the broadcast address of the subnet the Samba is on.

Upon reverting back to 4.8.5 with no other config changes, everything is back to normal again. 

===

This bug appears to be fixed with the following commit:


From 6a324eb8bdf4bb2d661aaf60efb270c55562f62f Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet@samba.org>
Date: Sat, 15 Sep 2018 05:42:11 -0700
Subject: [PATCH] Revert "s3:nmbd: Fix possible integer overflow"

This reverts commit 3a383038ee7f74e5a9d2326a761b27950a14eb83.
---
 source3/nmbd/nmbd_sendannounce.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/source3/nmbd/nmbd_sendannounce.c b/source3/nmbd/nmbd_sendannounce.c
index 44d67e7..1d557c4 100644
--- a/source3/nmbd/nmbd_sendannounce.c
+++ b/source3/nmbd/nmbd_sendannounce.c
@@ -288,10 +288,8 @@ void announce_my_server_names(time_t t)
 			}
 
 			/* Announce every minute at first then progress to every 12 mins */
-			if (t > work->lastannounce_time &&
-			    (t - work->lastannounce_time) < work->announce_interval) {
+			if ((t - work->lastannounce_time) < work->announce_interval)
 				continue;
-			}
 
 			if (work->announce_interval < (CHECK_TIME_MAX_HOST_ANNCE * 60))
 				work->announce_interval += 60;
-- 

After applying this patch above, Samba functions normally again.
Comment 1 Andrew Bartlett 2018-09-15 14:02:48 UTC
Merge request (and CI) here: 

https://gitlab.com/samba-team/samba/merge_requests/72

Are you happy to be credited in the commit message?

Thanks,

Andrew Bartlett
Comment 2 Andrew Bartlett 2018-09-15 14:03:45 UTC
Also, does it only happen there there is more than one NIC connected?
Comment 3 Reuben Farrelly 2018-09-15 14:11:11 UTC
Yes - happy to be credited (thanks).

Unsure if it happens if there is more than one NIC.  The second NIC is used in a back-to-back link to another server which is offline almost all the time including around the time of this upgrade.  So at least from an IP perspective the NIC is there but there are no hosts seen on the segment.

If it is important I'll rebuild without the patch and test this scenario out (please let me know).
Comment 4 Andrew Bartlett 2018-09-15 14:37:56 UTC
OK, that should be OK.

I'll leave that up to those who know more about this layer to dig into.
Comment 5 Andrew Bartlett 2018-09-15 14:38:29 UTC
Created attachment 14489 [details]
proposed patch for master
Comment 6 Jeremy Allison 2018-09-17 19:54:40 UTC
Created attachment 14491 [details]
git-am fix for master.

I think this is the correct fix.
Comment 7 Karolin Seeger 2018-09-19 08:09:37 UTC
As the patch has been reviewed by Andreas, I can push it to autobuild, right?
Or should we wait for more feedback?
Can I go ahead with the bugfix release?
Comment 8 Reuben Farrelly 2018-09-19 12:42:52 UTC
This latest patch in attachment 14491 [details] works for me - I've just tested it now and I'm not seeing any sign of the original problem.
Comment 9 Andrew Bartlett 2018-09-19 13:30:11 UTC
(In reply to Karolin Seeger from comment #7)
Yes.

(In reply to Reuben Farrelly from comment #8)
Thanks.  That was all we were waiting for.

Andrew Bartlett
Comment 10 Andrew Bartlett 2018-09-19 23:58:05 UTC
Created attachment 14494 [details]
patch cherry-picked from master
Comment 11 Jeremy Allison 2018-09-20 00:30:04 UTC
Karolin please apply to 4.9.next.

Thanks !

Jeremy.
Comment 12 Karolin Seeger 2018-09-20 07:21:58 UTC
(In reply to Jeremy Allison from comment #11)
Pushed to autobuild-v4-9-test.
Comment 13 Karolin Seeger 2018-09-24 07:25:25 UTC
(In reply to Karolin Seeger from comment #12)
Pushed to v4-9-test.
Closing out bug report.

Thanks!