8409 – f7d97868 created a new race condition

Bug 8409 - f7d97868 created a new race condition

Summary: f7d97868 created a new race condition

Status:	RESOLVED FIXED

Alias:	None

Product:	Samba 3.6
Classification:	Unclassified
Component:	Winbind (show other bugs)
Version:	3.6.0
Hardware:	All All

Importance:	P5 normal
Target Milestone:	---
Assignee:	Karolin Seeger
QA Contact:	Samba QA Contact

URL:
Keywords:

Duplicates (1):	8667 (view as bug list)
Depends on:
Blocks:	8399
	Show dependency tree / graph

Reported:	2011-08-26 15:33 UTC by Volker Lendecke
Modified:	2011-12-16 17:57 UTC (History)
CC List:	1 user (show)

See Also:

Attachments
Patch (2.27 KB, patch) 2011-08-26 15:33 UTC, Volker Lendecke	jra: review+	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Volker Lendecke 2011-08-26 15:33:00 UTC

Created attachment 6813 [details]
Patch

Commit message on its way to autobuild:

Fix a winbind race leading to 100% CPU

This fixes a race condition that leads to the winbindd_children list becoming
corrupted. It happens when on a busy winbind SIGCHLD is a bit late.

Imagine a winbind with multiple requests in the queue for a single child. Child
dies, and before the SIGCHLD handler is called we find the socket to be dead.
wb_child_request_done is called, receiving an error from wb_simple_trans_recv.
It closes the socket. Then immediately the wb_child_request_trigger will do
another fork_domain_child before the signal handler is called. This means that
we do another fork_domain_child, we have child->sock==-1 at this point.
fork_domain_child will do a DLIST_ADD(winbindd_children, child) a second time
where the child is already part of that list. This corrupts the list. Then the
signal handler kicks in, spinning in

for (child = winbindd_children; child != NULL; child = child->next) {

forever. Not good. This patch makes sure that both conditions (sock==-1 and not
part of the list) for a winbindd_child struct match up.


This is not in 3.5

Comment 1 Jeremy Allison 2011-08-26 21:16:49 UTC

Comment on attachment 6813 [details]
Patch

Ok - went through this very carefully - this essential for 3.6.1. Thanks Volker !

Jeremy.

Comment 2 Christian Ambach 2011-08-27 09:22:56 UTC

should we add 
e0e3d21 vl@samba.org s3: Use sys_write in fork_domain_child
964e809 vl@samba.org s3: Use sys_read in fork_domain_child

as well?

Comment 3 Volker Lendecke 2011-08-28 09:08:48 UTC

Yes, please. Christian, ack?

Volker

Comment 4 Christian Ambach 2011-08-28 12:30:37 UTC

yes, to be on on a safer side in regards to signals

Comment 5 Karolin Seeger 2011-08-30 19:06:04 UTC

Pushed these three patches to v3-6-test.
Closing out bug report.

Thanks!

Comment 6 Herb Lewis 2011-12-16 17:57:41 UTC

*** Bug 8667 has been marked as a duplicate of this bug. ***