Bug 9268 - Make tdb robust against improper CLEAR_IF_FIRST restart
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: Clustering
Version: unspecified
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-10-08 18:38 UTC by Jeremy Allison
Modified: 2012-10-09 07:24 UTC

See Also:


Attachments
git-am fix for 4.0.0rc3. (10.20 KB, patch)
2012-10-08 18:39 UTC, Jeremy Allison
no flags
git-am for 3.6.next. (8.26 KB, patch)
2012-10-08 19:26 UTC, Jeremy Allison
vl: review+
Correct fix for 4.0.0rc3. (9.28 KB, patch)
2012-10-08 20:28 UTC, Jeremy Allison
vl: review+

Description Jeremy Allison 2012-10-08 18:38:47 UTC
From Volker's patch:

When winbind is restarted, there is a potential crash in tdb. Consider
the following situation: We are in a cluster with ctdb. A winbind child
hangs in a request to the DC. Cluster monitoring decides the node has a
problem and decides to kill ctdbd. The winbind child still hangs in an
RPC request. The winbind parent figures that ctdb is dead and
immediately exits. The winbind parent is restarted by cluster
management, overwriting gencache.tdb with CLEAR_IF_FIRST. The
CLEAR_IF_FIRST logic as implemented now will not see that a child still
has the tdb open, because for performance reasons only the parent holds
the ACTIVE_LOCK. While the CLEAR_IF_FIRST logic runs, there is a very
small window where we ftruncate(tfd, 0) the file and re-write a proper
header without a lock. If the winbind child comes back during this small
window, wanting to store something into gencache.tdb, that winbind child
will crash with a SIGBUS.
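
The window described above can be replayed safely in a single process. The following is only a minimal sketch, not the attached patch and not tdb code: it assumes the child reaches the file through a shared mmap (the usual way a truncated file leads to SIGBUS), and the names file_covers, replay_truncate_window and DEMO_LEN are hypothetical. Two descriptors on one temporary file stand in for the child and the restarted parent. The sketch never touches the mapping while the file is empty; it only uses fstat() to show that the mapping has no backing store during the window.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>   /* for callers that assert on the results */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_LEN 4096 /* hypothetical size of the demo "database" */

/* 1 if the file behind fd covers [0, len), 0 if it is shorter, -1 on error. */
int file_covers(int fd, size_t len)
{
    struct stat st;

    if (fstat(fd, &st) != 0) {
        return -1;
    }
    return ((size_t)st.st_size >= len) ? 1 : 0;
}

/*
 * Replays "child has the file mapped, parent truncates and rewrites it".
 * Returns 0 on success and reports whether the mapped range was backed by
 * file data during the window and after the rewrite.
 */
int replay_truncate_window(int *covered_during, int *covered_after)
{
    char path[] = "/tmp/tdb_window_XXXXXX";
    char header[DEMO_LEN];
    int child_fd, parent_fd;
    void *map;
    int ret = -1;

    memset(header, 0x42, sizeof(header));

    child_fd = mkstemp(path);            /* the "child" creates the file */
    if (child_fd == -1) {
        return -1;
    }
    parent_fd = open(path, O_RDWR);      /* the restarted "parent" */
    if (parent_fd == -1) {
        goto out_child;
    }
    if (write(child_fd, header, sizeof(header)) != (ssize_t)sizeof(header)) {
        goto out_close;
    }
    map = mmap(NULL, DEMO_LEN, PROT_READ | PROT_WRITE, MAP_SHARED,
               child_fd, 0);
    if (map == MAP_FAILED) {
        goto out_close;
    }

    /* Start of the window: the file is empty, the mapping still exists.
     * Dereferencing map here is what would raise SIGBUS. */
    if (ftruncate(parent_fd, 0) != 0) {
        goto out_unmap;
    }
    *covered_during = file_covers(child_fd, DEMO_LEN);

    /* End of the window: the parent has written a fresh header. */
    if (write(parent_fd, header, sizeof(header)) != (ssize_t)sizeof(header)) {
        goto out_unmap;
    }
    *covered_after = file_covers(child_fd, DEMO_LEN);
    ret = 0;

out_unmap:
    munmap(map, DEMO_LEN);
out_close:
    close(parent_fd);
out_child:
    close(child_fd);
    unlink(path);
    return ret;
}
```

Calling replay_truncate_window(&during, &after) should leave during at 0 and after at 1. Note that an fstat() check like this is itself racy and is not a fix; it only makes the unprotected window visible. Closing the window needs locking around the truncate and rewrite, or avoiding the zero-length state altogether.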
Comment 1 Jeremy Allison 2012-10-08 18:39:24 UTC
Created attachment 8009
git-am fix for 4.0.0rc3.

Same fix that went into master.
Jeremy.
Comment 2 Jeremy Allison 2012-10-08 19:26:07 UTC
Created attachment 8010
git-am for 3.6.next.

Back-port from master.
Jeremy.
Comment 3 Volker Lendecke 2012-10-08 20:11:57 UTC
Comment on attachment 8009
git-am fix for 4.0.0rc3.

Are you sure you want the last patch in the patchset under this bug report?
Comment 4 Jeremy Allison 2012-10-08 20:27:05 UTC
Comment on attachment 8009
git-am fix for 4.0.0rc3.

Oh I missed that - will resubmit.
Comment 5 Jeremy Allison 2012-10-08 20:28:24 UTC
Created attachment 8011
Correct fix for 4.0.0rc3.

Now without extraneous patch :-).
Comment 6 Karolin Seeger 2012-10-09 07:24:40 UTC
Pushed to v3-6-test and autobuild-v4-0-test.
Closing out bug report.

Thanks!