CTDB recovers persistent databases using the copy from the node with the highest sequence number. During recovery, the recovery helper sends the GET_DB_SEQNUM control to read the sequence number. At that point the databases are frozen and the CTDB daemon has started a tdb transaction on the database. The CTDB daemon can read a record from tdb with tdb_fetch() only if the tdb transaction has started. Otherwise, the daemon blocks in tdb_fetch() waiting for a lock on the record, which causes a deadlock. The problem can also be reproduced by stopping a node and then requesting the sequence number for any database.
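The condition described above can be illustrated with a small, self-contained model. This is only a sketch and not the attached patches: `struct db_state` and `get_db_seqnum()` are hypothetical names invented for the illustration, no real tdb or CTDB calls are made, and returning an error in the unsafe case is an assumption about one reasonable way to avoid blocking.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the daemon's per-database state. */
struct db_state {
    bool frozen;               /* database is frozen (recovery or stopped node) */
    bool transaction_started;  /* daemon has started a tdb transaction */
    uint64_t seqnum;           /* value held in the sequence-number record */
};

/*
 * Hypothetical model of the GET_DB_SEQNUM read path.
 * Returns 0 and stores the sequence number in *seqnum when the read is safe.
 * Returns -1, leaving *seqnum untouched, when the database is frozen but no
 * transaction has started, because a real tdb_fetch() would then wait for a
 * record lock and could deadlock.
 */
static int get_db_seqnum(const struct db_state *db, uint64_t *seqnum)
{
    if (db->frozen && !db->transaction_started) {
        return -1;  /* assumption: refuse the read instead of blocking */
    }
    *seqnum = db->seqnum;
    return 0;
}
```

In this model, a database in recovery (frozen, transaction started) returns its sequence number, while a database on a stopped node (frozen, no transaction) gets an error instead of a blocked read.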
Created attachment 13587: Patches for v4-6
Created attachment 13588: Patches for v4-7
Hi Karolin, This is ready for 4.6 and 4.7. Thanks...
(In reply to Martin Schwenke from comment #3) Pushed to autobuild-v4-{7,6}-test.
Pushed to both branches. Closing out bug report. Thanks!