CTDB aborts if the following sequence of events happens.

- CTDB gets a REQ_DMASTER packet (gen1).
  Processing of this packet is deferred until a record lock is obtained.
- CTDB goes into recovery and marks RECOVERY_ACTIVE.
  The CTDB recovery helper updates the vnnmap (gen2).
- CTDB processes the deferred REQ_DMASTER packet (gen1).
  The check against the database generation (gen1) succeeds.
  The lmaster check is no longer valid because the vnnmap has changed.

This causes CTDB to abort due to a protocol error.
Created attachment 11817
Patch for v4-4 branch
Hi Karolin, This one is for v4-4 branch.
Pushed to autobuild-v4-4-test.

Please note:

Applying: Revert "ctdb-daemon: Check packet generation against database generation"
/data/git/samba/v4-4-test/.git/rebase-apply/patch:121: trailing whitespace.
/data/git/samba/v4-4-test/.git/rebase-apply/patch:147: trailing whitespace.
" generation id is:%u\n",
/data/git/samba/v4-4-test/.git/rebase-apply/patch:149: trailing whitespace.
hdr->length,
/data/git/samba/v4-4-test/.git/rebase-apply/patch:150: trailing whitespace.
hdr->srcnode, hdr->destnode,
warning: 4 lines add whitespace errors.
Pushed to v4-4-test. Closing out bug report. Thanks!