14640 – socket_wrapper 1.3.3 should be backported in order to fix deadlocks in the tfork test

Status:	RESOLVED FIXED

Alias:	None

Product:	Samba 4.1 and newer
Classification:	Unclassified
Component:	Test infrastructure
Version:	4.14.0rc2
Hardware:	All All

Importance:	P5 critical
Target Milestone:	4.14
Assignee:	Karolin Seeger
QA Contact:	Samba QA Contact

Reported:	2021-02-16 16:32 UTC by Stefan Metzmacher
Modified:	2021-05-11 09:49 UTC
CC List:	2 users

Attachments
Patch for v4-14-test (62.38 KB, patch) 2021-02-16 16:32 UTC, Stefan Metzmacher	no flags
Patch for v4-13-test (62.38 KB, patch) 2021-02-16 16:32 UTC, Stefan Metzmacher	no flags
Patches for v4-12-test (89.70 KB, patch) 2021-02-16 16:33 UTC, Stefan Metzmacher	no flags
Patches for v4-14-test (76.71 KB, patch) 2021-03-23 11:19 UTC, Stefan Metzmacher	asn: review+
Patches for v4-13-test (76.71 KB, text/plain) 2021-03-23 11:32 UTC, Stefan Metzmacher	asn: review+

Description Stefan Metzmacher 2021-02-16 16:32:18 UTC

Created attachment 16454
Patch for v4-14-test

From time to time we see deadlocks on socket_reset_mutex in combination
with forking.

These problems should be fixed in socket_wrapper 1.3.2.

Comment 1 Stefan Metzmacher 2021-02-16 16:32:44 UTC

Created attachment 16455
Patch for v4-13-test

Comment 2 Stefan Metzmacher 2021-02-16 16:33:13 UTC

Created attachment 16456
Patches for v4-12-test

Comment 3 Stefan Metzmacher 2021-02-17 11:58:34 UTC

We'll need socket_wrapper 1.3.3

Comment 4 Stefan Metzmacher 2021-02-17 14:38:12 UTC

The problem with 1.3.2 is this:

   #7 abort + 0x12b [ip=0x7f14fb670859] [sp=0x7fffd08856f0]
   #8 _swrap_mutex_lock + 0x102 [ip=0x7f14fc207a7d] [sp=0x7fffd0885820]
   #9 swrap_sendmsg_before + 0xd0 [ip=0x7f14fc212f0e] [sp=0x7fffd0885880]
   #10 swrap_write + 0x129 [ip=0x7f14fc214ca6] [sp=0x7fffd0885920]
   #11 write + 0x3b [ip=0x7f14fc214d8c] [sp=0x7fffd0885a50]
   #12 swrap_pcap_dump_packet + 0xc5 [ip=0x7f14fc20ca19] [sp=0x7fffd0885a90]
   #13 swrap_accept + 0x821 [ip=0x7f14fc20d9e2] [sp=0x7fffd0885b00]
   #14 accept + 0x3d [ip=0x7f14fc20db26] [sp=0x7fffd0886050]
   #15 prefork_listen_accept_handler + 0x1c0 [ip=0x7f14fbc4e06f] [sp=0x7fffd0886090]
   #16 tevent_common_invoke_fd_handler + 0x118 [ip=0x7f14fbcc3219] [sp=0x7fffd0886180]
   #17 epoll_event_loop + 0x3a9 [ip=0x7f14fbccf785] [sp=0x7fffd08861d0]
   #18 epoll_event_loop_once + 0x13c [ip=0x7f14fbccfe9f] [sp=0x7fffd0886230]
   #19 std_event_loop_once + 0x6f [ip=0x7f14fbccc0da] [sp=0x7fffd0886280]
   #20 _tevent_loop_once + 0x126 [ip=0x7f14fbcc20cd] [sp=0x7fffd08862c0]

It happens with a stale fd closed via __close_nocancel() in nss_host.
socket() is a weak symbol in libc.so.6, so swrap_socket can be injected
into the resolver code in libc.so.6. But the socket is closed with
__close_nocancel, which is not a weak symbol in libc.so.6, so it's not
possible to catch the close of the fd, and it remains stale in the
socket_wrapper table.
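
The stale-entry mechanism described above can be sketched in a few lines of C. This is an illustration only, not socket_wrapper code: fd_is_tracked, tracked_socket, tracked_close and reuse_hits_stale_entry are hypothetical names, and a direct close() call stands in for the __close_nocancel() path that bypasses the wrapper. It relies on the POSIX rule that open() returns the lowest free fd number, and it assumes a single-threaded process with /dev/null available.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

#define TABLE_SIZE 1024

/* Hypothetical stand-in for the wrapper's fd table:
 * true means "this fd number is a socket we track". */
static bool fd_is_tracked[TABLE_SIZE];

/* Plays the role of the interposed socket(): create the fd, record it. */
static int tracked_socket(void)
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd >= 0 && fd < TABLE_SIZE) {
        fd_is_tracked[fd] = true;
    }
    return fd;
}

/* Plays the role of the interposed close(): forget the fd, then close it. */
static int tracked_close(int fd)
{
    if (fd >= 0 && fd < TABLE_SIZE) {
        fd_is_tracked[fd] = false;
    }
    return close(fd);
}

/*
 * Open a tracked socket, close it either behind the wrapper's back
 * (bypass_wrapper == true) or through it, then open an ordinary file.
 * Returns true if the file landed on an fd number the table still
 * believes is a socket.
 */
static bool reuse_hits_stale_entry(bool bypass_wrapper)
{
    int sock = tracked_socket();
    if (sock < 0 || sock >= TABLE_SIZE) {
        return false;
    }

    if (bypass_wrapper) {
        close(sock);          /* the table never hears about this close */
    } else {
        tracked_close(sock);  /* the table entry is cleared first */
    }

    /* The lowest free fd number is handed out, so this reuses sock. */
    int file = open("/dev/null", O_WRONLY);
    if (file < 0) {
        fd_is_tracked[sock] = false;
        return false;
    }

    bool stale = (file == sock) && fd_is_tracked[file];

    /* Clean up so repeated calls start from an empty table. */
    close(file);
    fd_is_tracked[sock] = false;
    return stale;
}
```

With bypass_wrapper set to true the function reports a stale entry; with false it does not. In the real library a stale entry like this would presumably make a write() to an unrelated fd (such as the pcap file in frames #9 to #12 of the trace) be treated as socket traffic, which matches the path into swrap_sendmsg_before and _swrap_mutex_lock shown above.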

Comment 5 Samba QA Contact 2021-03-17 23:54:08 UTC

This bug was referenced in samba master:

10c198827d977e07b411897556578d3aedce2184

Comment 6 Stefan Metzmacher 2021-03-23 11:19:57 UTC

Created attachment 16564 [details]
Patches for v4-14-test

Comment 7 Stefan Metzmacher 2021-03-23 11:32:30 UTC

Created attachment 16565 [details]
Patches for v4-13-test

Comment 8 Andreas Schneider 2021-03-30 08:45:03 UTC

Karolin, could you please apply the patches to the relevant branches? Thanks!

Comment 9 Karolin Seeger 2021-03-31 09:03:28 UTC

Pushed to autobuild-v4-{14,13}-test.

Comment 10 Samba QA Contact 2021-03-31 10:14:20 UTC

This bug was referenced in samba v4-13-test:

f2be1673edee566088df92e2b9ecbe1678293780

Comment 11 Samba QA Contact 2021-03-31 11:11:04 UTC

This bug was referenced in samba v4-14-test:

a0862d6d6dee5f21bebf8987e3e7a21a42198b3b

Comment 12 Karolin Seeger 2021-04-01 10:27:31 UTC

Pushed to both branches.
Closing out bug report.

Thanks!

Comment 13 Samba QA Contact 2021-04-20 10:10:40 UTC

This bug was referenced in samba v4-14-stable (Release samba-4.14.3):

a0862d6d6dee5f21bebf8987e3e7a21a42198b3b

Comment 14 Samba QA Contact 2021-05-11 09:49:13 UTC

This bug was referenced in samba v4-13-stable:

f2be1673edee566088df92e2b9ecbe1678293780