Bug 15202 - writev epoll_wait cpu-spinning in the LDAP server (and maybe other places)
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: AD: LDB/DSDB/SAMDB
Version: 4.17.0
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Jule Anger
QA Contact: Samba QA Contact
URL: https://gitlab.com/samba-team/samba/-...
Keywords:
Depends on:
Blocks:
 
Reported: 2022-10-12 15:23 UTC by Stefan Metzmacher
Modified: 2022-10-31 21:04 UTC

Attachments
Patch for Samba 4.16 (40.30 KB, patch)
2022-10-20 03:50 UTC, Andrew Bartlett
metze: review+
Patch for Samba 4.17 (40.38 KB, patch)
2022-10-20 03:51 UTC, Andrew Bartlett
metze: review+
Patch backported to Samba 4.12 (40.45 KB, patch)
2022-10-27 22:37 UTC, Andrew Bartlett
abartlet: ci-passed+

Description Stefan Metzmacher 2022-10-12 15:23:21 UTC
From https://lists.samba.org/archive/samba-technical/2022-September/137647.html:

I've been trying to chase down the CPU spins reported by our users in
the writev() codepath from our LDAP server.

A private mail with the strace output shows the sockets are in
CLOSE_WAIT state, returning EAGAIN over and over (after a call to
epoll() each time).  That alone would be enough to keep things
spinning.

The connections are being shut down by the peer, but our LDAP server
won't be triggering a read any time soon, because it is waiting to
flush the response out.

Technically even after our server OS has got the FIN, there is
potentially data in the read buffer (so a read() might not return 0
anyway), but perhaps most of the time that would be 0.
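
To illustrate the point about buffered data (this is not taken from the Samba patches), here is a minimal Python sketch. It assumes a local socketpair() is an acceptable stand-in for the LDAP TCP connection, and that shutdown(SHUT_WR) on the client side approximates the FIN. The variable names are made up for the example.

```python
import socket

# socketpair() stands in for the client/server TCP connection (assumption).
server, client = socket.socketpair()

# The client sends something and then stops writing, roughly what a FIN does.
client.sendall(b"pending request")
client.shutdown(socket.SHUT_WR)

# The server still has to drain the buffered bytes before it sees EOF.
first = server.recv(4096)   # buffered data comes back first
second = server.recv(4096)  # an empty result marks end-of-stream

server.close()
client.close()
```

So a single read() returning non-zero does not prove the peer is still there; only the read that follows the buffered data reports the end of the stream.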

So how can we detect this?  Can we at least put a timeout on a writev()
call via tsocket et al?  If so, how do we do that?
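
One possible way to approach both questions, sketched in Python rather than in tsocket/tevent terms: wait for writability with a timeout and ask the kernel to report a peer hangup at the same time. This is only an illustration under stated assumptions. It is Linux-only (POLLRDHUP), it uses a socketpair() in place of the real TCP connection, and wait_for_write() and fill_send_buffer() are hypothetical helper names, not Samba APIs. It does not describe how the bug was actually fixed.

```python
import select
import socket


def wait_for_write(sock, timeout_ms):
    """Return "writable", "peer_closed" or "timeout" for a pending write."""
    poller = select.poll()
    # POLLRDHUP (Linux-only) reports that the peer shut down its write side.
    poller.register(sock, select.POLLOUT | select.POLLRDHUP)
    events = poller.poll(timeout_ms)
    if not events:
        return "timeout"
    _, revents = events[0]
    # Check hangup and error conditions before trusting POLLOUT.
    if revents & (select.POLLRDHUP | select.POLLHUP | select.POLLERR):
        return "peer_closed"
    return "writable"


def fill_send_buffer(sock):
    """Write until the kernel reports EAGAIN, i.e. the send buffer is full."""
    sock.setblocking(False)
    try:
        while True:
            sock.send(b"x" * 65536)
    except BlockingIOError:
        pass


server, client = socket.socketpair()
fill_send_buffer(server)                 # the client never reads
before = wait_for_write(server, 100)     # nothing can be flushed yet
client.shutdown(socket.SHUT_WR)          # the client goes away
after = wait_for_write(server, 100)      # the hangup is now visible

server.close()
client.close()
```

With the send buffer full and the client silent, the first wait should run into the timeout; once the client has shut down, the second wait should report the hangup instead of waiting for a flush that will never complete.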

Mailing list threads: 

https://lists.samba.org/archive/samba/2022-September/241869.html
https://lists.samba.org/archive/samba/2022-September/241873.html

Andrew Bartlett
Comment 1 Samba QA Contact 2022-10-19 17:14:04 UTC
This bug was referenced in samba master:

f0fb8b9508346aed50528216fd959a9b1a941409
9950efd83e1a4b5e711f1d36fefa8a5d5e8b2410
29a65da63d730ecead1e7d4a81a76dd1c8c179ea
4c7e2b9b60de5d02bb3f69effe7eddbf466a6155
e232ba946f00aac39d67197d9939bc923814479c
eb2f3526032803f34c88ef1619a832a741f71910
Comment 2 Andrew Bartlett 2022-10-20 03:50:13 UTC
Created attachment 17589
Patch for Samba 4.16
Comment 3 Andrew Bartlett 2022-10-20 03:51:26 UTC
Created attachment 17590
Patch for Samba 4.17
Comment 4 Andrew Bartlett 2022-10-27 22:37:11 UTC
Created attachment 17605
Patch backported to Samba 4.12

The 4.12 patches are included in this tree (so they were tested together with the other patches in this tree):
https://gitlab.com/catalyst-samba/samba/-/releases/catalyst-4.12-backports-2022-10
Comment 5 Jule Anger 2022-10-31 09:02:10 UTC
Pushed to autobuild-v4-{17,16}-test.
Comment 6 Samba QA Contact 2022-10-31 10:09:12 UTC
This bug was referenced in samba v4-17-test:

dcac415e9493fe14eb0972ac0c97f66b02a232d0
8a4ef3d92e7df83245a76a2396ee328a940a1cf2
5c051d3806521e2e25a2a8a1e459d1d69722c96f
419986dcc0bc850e82f1d0229fbe57a3be8bb59e
b615bf4333a1a1a3c80bd93a186f1a137c8b13dc
743a56e5ccf358deb7b7093c55ea796e7000de3f
Comment 7 Samba QA Contact 2022-10-31 15:32:12 UTC
This bug was referenced in samba v4-16-test:

c805ccba33985ca07da63b4be3affafb495e13a1
119bf609985a873346891c5ca55e69178d712eb0
d8d5146d1679383747ad0533759f97020b78221e
aeb7dd2ca89e7a010baaf3a5da17eaa466ace06e
bc16a8abe3f1446a0da7e672cdba469fcc8ef96a
f7a84cffe9d9c61df7a7c5dd94e9caf3d18d9b3c
Comment 8 Jule Anger 2022-10-31 21:04:20 UTC
Closing out bug report.

Thanks!