Bug 15115 - Multichannel channel failing causes server to get wedged for remaining channels
Summary: Multichannel channel failing causes server to get wedged for remaining channels
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.15.6
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Stefan Metzmacher
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-07-01 23:32 UTC by suinn
Modified: 2024-04-04 16:24 UTC

See Also:


Attachments

Description suinn 2022-07-01 23:32:29 UTC
In earlier versions of Samba, when multichannel was still experimental, a channel failure during writes would cause a 10-second delay before the remaining channels continued to respond.

In the latest Samba, with official multichannel support, the server gets wedged and the remaining channels stop responding forever.

Configuration:
QNAP TVS-x72XT, 4 GB RAM, four Samsung 860 EVO 1 TB drives as RAID0, Firmware 4.4.2.1320, SMB Server 8 MB max IO.

Steps To Reproduce:
1. Connect to the server and establish 2+ channels.
2. Start a file transfer to the server (i.e. constant writes).
3. While the writes are occurring, pull the Ethernet cable or the Ethernet dongle out of the client computer.
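
To put a number on the stall while running the steps above, the client can time each write as it completes. The following is a minimal sketch, not part of the original test: it assumes the share is mounted on the client at a hypothetical path (`/mnt/share`), and the function names `timed_writes` and `find_stalls` are made up for illustration. If the server wedges for good, the write call simply never returns, so this only measures stalls that eventually recover.

```python
import os
import time


def timed_writes(path, chunk_size=1 << 20, count=100):
    """Write `count` chunks to `path` and return the completion time of each write."""
    chunk = b"\0" * chunk_size
    stamps = []
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data out before timing
            stamps.append(time.monotonic())
    return stamps


def find_stalls(stamps, threshold=1.0):
    """Return (write_index, gap_seconds) for every gap between writes above `threshold`."""
    return [
        (i, later - earlier)
        for i, (earlier, later) in enumerate(zip(stamps, stamps[1:]), start=1)
        if later - earlier > threshold
    ]


if __name__ == "__main__":
    # Hypothetical mount point of the multichannel share; adjust to your setup.
    target = "/mnt/share/stall_test.bin"
    for index, gap in find_stalls(timed_writes(target, count=2000)):
        print(f"write {index} completed {gap:.1f} s after the previous one")
```

Pull the cable while the script is running; any gap it reports corresponds to the period during which the remaining channels were not responding.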

Results:
In earlier Samba, the remaining channels would stall for around 10 seconds and then the server would start responding on them again. With version 4.15.6, the server stops responding on all channels forever.

Expected Results:
Remaining channels should continue to respond without any delay when another channel goes down. With a read test instead of a write test, the server already behaves this way.

In the attached packet trace at 2013763, one channel is disabled and you can see the 10-second delay before the remaining channel continues to respond. In the new Samba, all channels just stop responding forever.
Comment 1 suinn 2022-07-01 23:35:05 UTC
Well, bugzilla crashes when I upload the 1.59 GB packet trace, so no trace for you.
Comment 2 Jeremy Allison 2022-07-02 01:20:08 UTC
One for metze to comment on I think.
Comment 3 suinn 2022-09-30 16:55:53 UTC
Updated to the latest QuTS hero version h5.0.0.2120 on my TVS-h1288x
smbd (samba daemon) Version 4.13.17

Retried my multichannel test and now it's behaving as expected. As one channel goes down, the other channels continue to work. As channels are added, the other channels continue to work.

So it seems something got fixed and it's fine now. This bug can be closed.
Comment 4 Stefan Metzmacher 2022-09-30 18:59:30 UTC
(In reply to suinn from comment #3)

Didn't you have the problems with 4.15.6?

4.13.17 is the old version again, which didn't have the problem before...
Comment 5 suinn 2022-09-30 20:49:55 UTC
I would have to ask a QNAP engineer whether they can build a newer Samba for their NAS box for me to try. Will query them.
Comment 6 suinn 2022-10-18 01:33:46 UTC
Ok, Jones got me a new build to try on my QNAP box.

[/root] # /usr/local/samba/sbin/smbd -V
Version 4.15.10

With that version installed on my QNAP and 3 channels of 1 GbE, after pulling an Ethernet cable the remaining channels continue just fine with no delays. Plugging the cable back in also works correctly with no delays.

Looks like it's fixed.
Comment 7 Jones Syue 2022-10-28 02:41:27 UTC
> Looks like it's fixed.

Sounds great! Thank you Brad and Metze :)