Bug 11707 - CTDB crashes during database recovery
Summary: CTDB crashes during database recovery
Status: RESOLVED FIXED
Alias: None
Product: CTDB 2.5.x or older
Classification: Unclassified
Component: ctdb
Version: 4.4.0rc
Hardware: All All
Importance: P5 major
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
Reported: 2016-02-03 00:17 UTC by Amitay Isaacs
Modified: 2016-02-17 08:25 UTC

Attachments
Patch for v4-4 branch (7.21 KB, patch)
2016-02-09 13:16 UTC, Amitay Isaacs
martins: review+

Description Amitay Isaacs 2016-02-03 00:17:06 UTC
CTDB aborts if the following sequence of events occurs:

  - CTDB receives a REQ_DMASTER packet (gen1).
    Processing of this packet is deferred until the record lock is obtained.

  - CTDB goes into recovery and marks RECOVERY_ACTIVE.
    The CTDB recovery helper updates the vnnmap (gen2).

  - CTDB processes the deferred REQ_DMASTER packet (gen1).
    The check against the database generation (gen1) succeeds.
    The lmaster check is no longer valid because the vnnmap has changed.
    CTDB then aborts with a protocol error.
Comment 1 Amitay Isaacs 2016-02-09 13:16:36 UTC
Created attachment 11817
Patch for v4-4 branch
Comment 2 Amitay Isaacs 2016-02-10 05:45:13 UTC
Hi Karolin,

This one is for v4-4 branch.
Comment 3 Karolin Seeger 2016-02-15 09:36:27 UTC
Pushed to autobuild-v4-4-test.

Please note:

Applying: Revert "ctdb-daemon: Check packet generation against database generation"
/data/git/samba/v4-4-test/.git/rebase-apply/patch:121: trailing whitespace.
	
/data/git/samba/v4-4-test/.git/rebase-apply/patch:147: trailing whitespace.
				" generation id is:%u\n", 
/data/git/samba/v4-4-test/.git/rebase-apply/patch:149: trailing whitespace.
				 hdr->length, 
/data/git/samba/v4-4-test/.git/rebase-apply/patch:150: trailing whitespace.
				 hdr->srcnode, hdr->destnode, 
warning: 4 lines add whitespace errors.
Comment 4 Karolin Seeger 2016-02-17 08:25:03 UTC
Pushed to v4-4-test.
Closing out bug report.

Thanks!