Bug 14430 - smbd-notifyd O(n*n) performance issue with n watches registered for the same folder
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Other
Version: 4.11.9
Hardware: All All
Importance: P5 critical
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-02 22:22 UTC by YOUZHONG YANG
Modified: 2020-09-16 06:38 UTC (History)
CC List: 4 users

See Also:


Attachments
patch file based on Samba 4.11.9 (2.20 KB, text/plain)
2020-07-02 22:22 UTC, YOUZHONG YANG
pcap file showing many change notification requests on the same folder (2.72 MB, application/vnd.tcpdump.pcap)
2020-07-07 02:45 UTC, YOUZHONG YANG
Patch to open many notifies on a single connection and directory (5.94 KB, patch)
2020-07-07 12:32 UTC, Volker Lendecke
Fixed torture patch (9.02 KB, patch)
2020-07-07 12:35 UTC, Volker Lendecke
Windows C program for reproducing the issue (3.15 KB, text/plain)
2020-07-07 15:42 UTC, YOUZHONG YANG

Description YOUZHONG YANG 2020-07-02 22:22:58 UTC
Created attachment 16109
patch file based on Samba 4.11.9

If a Windows client sends change notification requests n times for the same folder, smbd-notifyd triggers at least n*n messages to the requesting smbd process when something happens to the folder.
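
The quadratic growth can be illustrated with a toy model. This is only a sketch under stated assumptions, not Samba's notifyd code and not the attached patch: it assumes each of the n requests is stored as its own watch entry, and that a change makes every stored entry produce one message per pending request. The function names and the path are hypothetical.

```python
def messages_without_coalescing(n: int, path: str = "/share/folder") -> int:
    """Assumed model: n duplicate watch entries, each fanning out to all n requests."""
    watch_entries = [path] * n      # one stored entry per request, duplicates kept
    pending_requests = n            # requests waiting in the single smbd process
    sent = 0
    for _entry in watch_entries:    # the change matches every stored entry
        sent += pending_requests    # and each match reports to every pending request
    return sent


def messages_with_coalescing(n: int, path: str = "/share/folder") -> int:
    """Same assumed model, but identical watches collapse into a single entry."""
    watch_entries = {path for _ in range(n)}  # a set keeps one entry per distinct path
    pending_requests = n
    return len(watch_entries) * pending_requests


if __name__ == "__main__":
    for n in (10, 100, 3000):
        print(n, messages_without_coalescing(n), messages_with_coalescing(n))
```

In this model the only difference between the two functions is whether duplicate registrations for the same folder are kept, which is what turns linear fan-out into quadratic fan-out.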

We have hit this issue several times, each time with the smbd-notifyd process using a huge amount of memory; in one case it reached 250 GiB. We eventually reproduced the issue with the help of the end user who had experienced it.

I understand it is unusual for a client to send change notification requests for the same folder again and again, but a Node.js component named "webpack-dev-server" actually does it.

I have come up with a potential fix for the issue and verified that it fixes our problem. Please see the attached diff file.
Comment 1 Jeremy Allison 2020-07-07 00:58:18 UTC
Can you clarify exactly what the client is doing to trigger this?

My understanding is that the client, on the same connection, is repeatedly issuing identical change-notify requests on an open handle?

Is that correct? It would be good to see a Wireshark trace of this so we can create a regression test to reproduce it.

Thanks!
Comment 2 YOUZHONG YANG 2020-07-07 02:45:42 UTC
Created attachment 16115
pcap file showing many change notification requests on the same folder
Comment 3 YOUZHONG YANG 2020-07-07 02:50:49 UTC
(In reply to Jeremy Allison from comment #1)
I have attached a pcap file showing a Windows client sending 3000+ change notification requests on the same folder. I did not trigger any change on that folder, as I didn't want to receive 9,000,000 responses.
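
The 9,000,000 figure is the n*n lower bound from the description evaluated at n = 3000. A quick check, with a hypothetical helper name (the capture holds "3000+" requests, so the real count would be somewhat higher):

```python
def min_messages(n_requests: int) -> int:
    """Lower bound from the report: at least n*n messages for n requests."""
    return n_requests * n_requests


print(f"{min_messages(3000):,}")  # prints 9,000,000
```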

This issue has been reproduced on both Linux and illumos systems.

I also have a Windows C program that reproduces the issue reliably; please let me know if it is needed.

Thanks!
Comment 4 Jeremy Allison 2020-07-07 04:39:26 UTC
The Windows C program would be really helpful too, thanks!
Comment 5 Volker Lendecke 2020-07-07 12:32:27 UTC
Created attachment 16116
Patch to open many notifies on a single connection and directory

Attached is a work-in-progress patch that implements the important bits of the problematic network trace.

smbtorture3 //127.0.0.1/tmp -Uuser%pass notify-bench4 -o 2000

opens 2000 handles and notifies and then goes to sleep. It is only a first step towards further analysis, which I don't have time for at the moment. This is just a snapshot that might help others take a closer look before I can return to this case.
Comment 6 Volker Lendecke 2020-07-07 12:35:27 UTC
Created attachment 16117
Fixed torture patch

Oops, wrong patch posted. I had uncommitted changes in my working tree that did not make it into attachment 16116. Sorry for the hiccup.
Comment 7 YOUZHONG YANG 2020-07-07 15:42:27 UTC
Created attachment 16118
Windows C program for reproducing the issue

The requested C program is attached. Thanks.