The current implementation of API-level tdb transactions in samba/ctdb, i.e. the ctdb backend to dbwrap, contains a number of race conditions. Under high load (concurrent transactions), transaction_commit calls fail and can, in the worst case, corrupt the local copies of ctdb databases. The problem has been debugged and fixed, with considerable effort, in ctdb and in the clustered samba branch v3-4-ctdb (git://git.samba.org/obnox/samba-ctdb.git). Transactions were finally fixed around January 2010, and the corresponding changes have already been pushed to master in a sequence of commits. I will prepare a patchset that backports the fixes to 3.5 (mainly by picking the right commits from master) and attach it to this bug for review for inclusion in 3.5.2. Michael
Created attachment 5581
Patchset to fix the transactions for 3.5

These are the transaction fixes that have been tested in the v3-4-ctdb branch and already pushed to master. They consist of the introduction of the global lock feature (g_lock) and a modification of the dbwrap_ctdb transaction code to use these global locks. The code uses the ctdb control CTDB_TRANS3_COMMIT, which is new in ctdb version 1.0.109.
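To illustrate why a global lock removes the race, here is a minimal single-process sketch in C. It is not the Samba code: `global_lock`, `txn_start`, `txn_store`, `txn_commit` and `demo_serialized` are hypothetical names invented for this example, and it assumes the lock is held from transaction start until commit. It also simplifies by letting a second transaction start fail immediately, where a real cluster-wide lock would more likely block with a timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cluster-wide lock; NOT Samba's g_lock API. */
struct global_lock {
    int holder;              /* -1 when free, otherwise the owning node id */
};

struct db {
    int value;               /* the single record kept in this toy database */
    struct global_lock lock;
};

struct txn {
    struct db *db;
    int node;
    int snapshot;            /* private copy modified inside the transaction */
    bool active;
};

/* Start a transaction: take the global lock, then snapshot the record. */
static bool txn_start(struct txn *t, struct db *db, int node)
{
    if (db->lock.holder != -1) {
        return false;        /* another node is inside a transaction */
    }
    db->lock.holder = node;
    t->db = db;
    t->node = node;
    t->snapshot = db->value;
    t->active = true;
    return true;
}

/* Modify only the private snapshot; nothing is visible until commit. */
static void txn_store(struct txn *t, int delta)
{
    t->snapshot += delta;
}

/* Commit: publish the snapshot and release the global lock. */
static bool txn_commit(struct txn *t)
{
    if (!t->active || t->db->lock.holder != t->node) {
        return false;
    }
    t->db->value = t->snapshot;
    t->db->lock.holder = -1;
    t->active = false;
    return true;
}

/* Two nodes each add 1; returns the final value, or -1 on an unexpected step. */
int demo_serialized(void)
{
    struct db db = { .value = 0, .lock = { .holder = -1 } };
    struct txn a = { 0 };
    struct txn b = { 0 };

    if (!txn_start(&a, &db, 0)) return -1;
    if (txn_start(&b, &db, 1)) return -1;   /* node 1 must be refused here */
    txn_store(&a, 1);
    if (!txn_commit(&a)) return -1;

    if (!txn_start(&b, &db, 1)) return -1;  /* retry sees node 0's commit */
    txn_store(&b, 1);
    if (!txn_commit(&b)) return -1;

    assert(db.lock.holder == -1);           /* lock is free again */
    return db.value;
}
```

Because node 1 cannot take its snapshot while node 0 holds the lock, its retry starts from the committed value and neither update is lost. Without the lock, both nodes would snapshot 0 and the second commit would overwrite the first.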
(In reply to comment #1) Note that this patchset depends on the patch for bug #7312. The dependency is not one of content: without a minor modification, the patchset simply does not apply unless the patch from that bug has been applied first.
Pushed to v3-5-test. Closing out bug report. Thanks!