Bug 11707 - CTDB crashes during database recovery
Product: CTDB 2.5.x or older
Classification: Unclassified
Component: ctdb
Hardware: All All
Importance: P5 major
Target Milestone: ---
Assigned To: Karolin Seeger
QA Contact: Samba QA Contact
Reported: 2016-02-03 00:17 UTC by Amitay Isaacs
Modified: 2016-02-17 08:25 UTC (History)

Patch for v4-4 branch (7.21 KB, patch)
2016-02-09 13:16 UTC, Amitay Isaacs
martins: review+

Description Amitay Isaacs 2016-02-03 00:17:06 UTC
CTDB aborts if the following sequence of events occurs:
  - CTDB gets REQ_DMASTER packet (gen1)
    Processing of this packet is deferred until a record lock is obtained
  - CTDB goes into recovery, marks RECOVERY_ACTIVE
    CTDB recovery helper updates vnnmap (gen2)
  - CTDB processes REQ_DMASTER packet (gen1)
    The check against database generation (gen1) succeeds.
    The check for lmaster now fails because VNNMAP has changed.
    This causes CTDB to abort with a protocol error.
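
The sequence above can be modelled with a small, self-contained C sketch. It is only an illustration of the race, not CTDB code: every type and function name here (model_vnnmap, model_db, model_req_dmaster, model_process_req_dmaster, model_replay_sequence) is hypothetical, and the hash-modulo lmaster lookup is an assumed simplification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; these are NOT CTDB's real structures. */
#define MODEL_MAX_NODES 4

struct model_vnnmap {
    uint32_t generation;             /* generation of the node map */
    uint32_t size;                   /* number of valid slots in map[] */
    uint32_t map[MODEL_MAX_NODES];   /* slot -> node number */
};

struct model_db {
    uint32_t generation;             /* generation the database was last recovered at */
};

struct model_req_dmaster {
    uint32_t generation;             /* generation in effect when the packet was sent */
    uint32_t key_hash;               /* hash of the record key */
    uint32_t destnode;               /* node expected to be lmaster for the key */
};

enum model_outcome { MODEL_OK, MODEL_STALE_DROPPED, MODEL_PROTOCOL_ERROR };

/* Assumed simplification: lmaster is picked by hash modulo map size. */
static uint32_t model_lmaster(const struct model_vnnmap *v, uint32_t key_hash)
{
    return v->map[key_hash % v->size];
}

/* Processes a (possibly deferred) REQ_DMASTER packet in the model. */
static enum model_outcome
model_process_req_dmaster(const struct model_db *db,
                          const struct model_vnnmap *v,
                          const struct model_req_dmaster *pkt)
{
    /* Check 1: packet generation against database generation. */
    if (pkt->generation != db->generation) {
        return MODEL_STALE_DROPPED;
    }
    /* Check 2: the destination must still be lmaster under the current map. */
    if (model_lmaster(v, pkt->key_hash) != pkt->destnode) {
        return MODEL_PROTOCOL_ERROR;   /* the real daemon aborts here */
    }
    return MODEL_OK;
}

/* Replays the reported sequence; recovery can be switched off to compare. */
static enum model_outcome model_replay_sequence(bool recovery_intervenes)
{
    struct model_vnnmap v = { .generation = 1, .size = 2, .map = { 0, 1 } };
    struct model_db db = { .generation = 1 };

    /* Step 1: packet arrives under gen1 and is deferred for a record lock. */
    struct model_req_dmaster pkt = { .generation = 1, .key_hash = 5, .destnode = 1 };

    if (recovery_intervenes) {
        /* Step 2: recovery updates the vnnmap to gen2 (node 1 leaves the map).
         * The database generation is still gen1 at this point. */
        v.generation = 2;
        v.size = 1;
        v.map[0] = 0;
    }

    /* Step 3: the deferred packet is processed. */
    return model_process_req_dmaster(&db, &v, &pkt);
}
```

In this model, model_replay_sequence(false) passes both checks, while model_replay_sequence(true) passes the generation check and then fails the lmaster check, which corresponds to the abort described above.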
Comment 1 Amitay Isaacs 2016-02-09 13:16:36 UTC
Created attachment 11817
Patch for v4-4 branch
Comment 2 Amitay Isaacs 2016-02-10 05:45:13 UTC
Hi Karolin,

This one is for v4-4 branch.
Comment 3 Karolin Seeger 2016-02-15 09:36:27 UTC
Pushed to autobuild-v4-4-test.

Please note:

Applying: Revert "ctdb-daemon: Check packet generation against database generation"
/data/git/samba/v4-4-test/.git/rebase-apply/patch:121: trailing whitespace.
/data/git/samba/v4-4-test/.git/rebase-apply/patch:147: trailing whitespace.
				" generation id is:%u\n", 
/data/git/samba/v4-4-test/.git/rebase-apply/patch:149: trailing whitespace.
/data/git/samba/v4-4-test/.git/rebase-apply/patch:150: trailing whitespace.
				 hdr->srcnode, hdr->destnode, 
warning: 4 lines add whitespace errors.
Comment 4 Karolin Seeger 2016-02-17 08:25:03 UTC
Pushed to v4-4-test.
Closing out bug report.