Bug 14640 - socket_wrapper 1.3.3 should be backported in order to fix deadlocks in the tfork test
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Test infrastructure
Version: 4.14.0rc2
Hardware: All All
Importance: P5 critical
Target Milestone: 4.14
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-02-16 16:32 UTC by Stefan Metzmacher
Modified: 2021-05-11 09:49 UTC
CC List: 2 users

See Also:


Attachments
Patch for v4-14-test (62.38 KB, patch)
2021-02-16 16:32 UTC, Stefan Metzmacher
no flags
Patch for v4-13-test (62.38 KB, patch)
2021-02-16 16:32 UTC, Stefan Metzmacher
no flags
Patches for v4-12-test (89.70 KB, patch)
2021-02-16 16:33 UTC, Stefan Metzmacher
no flags
Patches for v4-14-test (76.71 KB, patch)
2021-03-23 11:19 UTC, Stefan Metzmacher
asn: review+
Patches for v4-13-test (76.71 KB, text/plain)
2021-03-23 11:32 UTC, Stefan Metzmacher
asn: review+

Description Stefan Metzmacher 2021-02-16 16:32:18 UTC
Created attachment 16454 [details]
Patch for v4-14-test

From time to time we see deadlocks on socket_reset_mutex in combination
with forking.

These problems should be fixed in socket_wrapper 1.3.2.
Comment 1 Stefan Metzmacher 2021-02-16 16:32:44 UTC
Created attachment 16455 [details]
Patch for v4-13-test
Comment 2 Stefan Metzmacher 2021-02-16 16:33:13 UTC
Created attachment 16456 [details]
Patches for v4-12-test
Comment 3 Stefan Metzmacher 2021-02-17 11:58:34 UTC
We'll need socket_wrapper 1.3.3
Comment 4 Stefan Metzmacher 2021-02-17 14:38:12 UTC
The problem with 1.3.2 is this:

   #7 abort + 0x12b [ip=0x7f14fb670859] [sp=0x7fffd08856f0]
   #8 _swrap_mutex_lock + 0x102 [ip=0x7f14fc207a7d] [sp=0x7fffd0885820]
   #9 swrap_sendmsg_before + 0xd0 [ip=0x7f14fc212f0e] [sp=0x7fffd0885880]
   #10 swrap_write + 0x129 [ip=0x7f14fc214ca6] [sp=0x7fffd0885920]
   #11 write + 0x3b [ip=0x7f14fc214d8c] [sp=0x7fffd0885a50]
   #12 swrap_pcap_dump_packet + 0xc5 [ip=0x7f14fc20ca19] [sp=0x7fffd0885a90]
   #13 swrap_accept + 0x821 [ip=0x7f14fc20d9e2] [sp=0x7fffd0885b00]
   #14 accept + 0x3d [ip=0x7f14fc20db26] [sp=0x7fffd0886050]
   #15 prefork_listen_accept_handler + 0x1c0 [ip=0x7f14fbc4e06f] [sp=0x7fffd0886090]
   #16 tevent_common_invoke_fd_handler + 0x118 [ip=0x7f14fbcc3219] [sp=0x7fffd0886180]
   #17 epoll_event_loop + 0x3a9 [ip=0x7f14fbccf785] [sp=0x7fffd08861d0]
   #18 epoll_event_loop_once + 0x13c [ip=0x7f14fbccfe9f] [sp=0x7fffd0886230]
   #19 std_event_loop_once + 0x6f [ip=0x7f14fbccc0da] [sp=0x7fffd0886280]
   #20 _tevent_loop_once + 0x126 [ip=0x7f14fbcc20cd] [sp=0x7fffd08862c0]

It happens with a stale fd that was closed via __close_nocancel() in nss_host.
socket() is a weak symbol in libc.so.6, so swrap_socket can be injected
into the resolver code in libc.so.6. The socket is then closed with
__close_nocancel(), which is not a weak symbol in libc.so.6, so it's not
possible to catch the close of the fd and it remains stale in the
socket_wrapper table.
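
The stale-entry mechanism described above can be illustrated with a toy model. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the two arrays stand in for the kernel's fd table and the wrapper's tracking table, and none of it is socket_wrapper's real data structure or API. It shows that when a close bypasses the interposed close(), the tracking entry survives, and the next descriptor that reuses the same number is mistaken for a wrapped socket. That would be consistent with the trace above, where write() called from swrap_pcap_dump_packet ends up in swrap_write and swrap_sendmsg_before.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_FDS 16

/* Hypothetical stand-ins, not socket_wrapper's real tables. */
static bool toy_fd_open[TOY_MAX_FDS];    /* what the "kernel" considers open */
static bool toy_fd_tracked[TOY_MAX_FDS]; /* what the wrapper believes is a wrapped socket */

/* Like the kernel: hand out the lowest free descriptor number, or -1. */
static int toy_alloc_fd(void)
{
	for (int fd = 0; fd < TOY_MAX_FDS; fd++) {
		if (!toy_fd_open[fd]) {
			toy_fd_open[fd] = true;
			return fd;
		}
	}
	return -1;
}

static bool toy_fd_valid(int fd)
{
	return fd >= 0 && fd < TOY_MAX_FDS && toy_fd_open[fd];
}

/* Interposed socket(): allocates an fd and records it in the wrapper table. */
int toy_wrapped_socket(void)
{
	int fd = toy_alloc_fd();
	if (fd >= 0) {
		toy_fd_tracked[fd] = true;
	}
	return fd;
}

/* Interposed close(): releases the fd and forgets the tracking entry. */
int toy_wrapped_close(int fd)
{
	if (!toy_fd_valid(fd)) {
		return -1;
	}
	toy_fd_tracked[fd] = false;
	toy_fd_open[fd] = false;
	return 0;
}

/* Models a libc-internal close that cannot be interposed:
 * the fd is released, but the wrapper table is never told. */
int toy_internal_close(int fd)
{
	if (!toy_fd_valid(fd)) {
		return -1;
	}
	toy_fd_open[fd] = false;
	return 0;
}

/* A descriptor opened outside the wrapper, e.g. a regular file. */
int toy_plain_open(void)
{
	return toy_alloc_fd();
}

bool toy_is_tracked(int fd)
{
	return fd >= 0 && fd < TOY_MAX_FDS && toy_fd_tracked[fd];
}

/* Returns true when a plain fd reuses the number of a stale entry
 * and is therefore misidentified as a wrapped socket. */
bool toy_stale_entry_demo(void)
{
	int s = toy_wrapped_socket();
	assert(s >= 0);
	toy_internal_close(s);    /* wrapper never sees this close */
	int f = toy_plain_open(); /* may reuse the number s */
	return f == s && toy_is_tracked(f);
}
```

With a freshly initialised table, toy_stale_entry_demo() reports the misidentification, while the same sequence using toy_wrapped_close() leaves no stale entry behind.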
Comment 5 Samba QA Contact 2021-03-17 23:54:08 UTC
This bug was referenced in samba master:

10c198827d977e07b411897556578d3aedce2184
Comment 6 Stefan Metzmacher 2021-03-23 11:19:57 UTC
Created attachment 16564 [details]
Patches for v4-14-test
Comment 7 Stefan Metzmacher 2021-03-23 11:32:30 UTC
Created attachment 16565 [details]
Patches for v4-13-test
Comment 8 Andreas Schneider 2021-03-30 08:45:03 UTC
Karolin, could you please apply the patches to the relevant branches? Thanks!
Comment 9 Karolin Seeger 2021-03-31 09:03:28 UTC
Pushed to autobuild-v4-{14,13}-test.
Comment 10 Samba QA Contact 2021-03-31 10:14:20 UTC
This bug was referenced in samba v4-13-test:

f2be1673edee566088df92e2b9ecbe1678293780
Comment 11 Samba QA Contact 2021-03-31 11:11:04 UTC
This bug was referenced in samba v4-14-test:

a0862d6d6dee5f21bebf8987e3e7a21a42198b3b
Comment 12 Karolin Seeger 2021-04-01 10:27:31 UTC
Pushed to both branches.
Closing out bug report.

Thanks!
Comment 13 Samba QA Contact 2021-04-20 10:10:40 UTC
This bug was referenced in samba v4-14-stable (Release samba-4.14.3):

a0862d6d6dee5f21bebf8987e3e7a21a42198b3b
Comment 14 Samba QA Contact 2021-05-11 09:49:13 UTC
This bug was referenced in samba v4-13-stable:

f2be1673edee566088df92e2b9ecbe1678293780