Bug 13641 - Fix CTDB recovery record resurrection from inactive nodes and simplify vacuuming
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: CTDB
Version: 4.9.1
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
Reported: 2018-10-05 03:35 UTC by Martin Schwenke
Modified: 2018-10-11 10:15 UTC

Attachments
Patch for 4.9 and 4.8 (39.17 KB, patch)
2018-10-08 22:40 UTC, Martin Schwenke
amitay: review+

Description Martin Schwenke 2018-10-05 03:35:49 UTC
When a CTDB node becomes active after being stopped or banned, records that have been deleted and vacuumed on other nodes can be resurrected from newly active nodes during recovery.  This can be fixed by marking volatile databases as invalid when a node becomes inactive and avoiding pulling records from invalid databases.

Following this, vacuuming can be simplified by reverting to a 2-phase model.  The current 3-phase vacuuming was created to solve the record resurrection problem, which is now solved more generally.
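
As a rough illustration of the recovery fix described above, the following is a minimal Python model, not CTDB code. All names (`VolatileDb`, `Node`, `become_inactive`, `recover`, the `invalid` flag) and the `(rsn, value)` record layout are hypothetical, and clearing the flag after recovery is an assumption made for the sketch. It only shows the idea: a node that goes inactive marks its volatile database invalid, and recovery does not pull records from invalid databases.

```python
from dataclasses import dataclass, field


@dataclass
class VolatileDb:
    # key -> (rsn, value); simplified, hypothetical record layout
    records: dict = field(default_factory=dict)
    # Hypothetical flag: set when the owning node becomes inactive
    invalid: bool = False


@dataclass
class Node:
    name: str
    db: VolatileDb = field(default_factory=VolatileDb)

    def become_inactive(self):
        # Stopped or banned: records deleted here may be vacuumed on
        # other nodes in the meantime, so local copies are not trusted.
        self.db.invalid = True


def recover(nodes):
    """Merge records from all nodes, skipping invalid databases."""
    merged = {}
    for node in nodes:
        if node.db.invalid:
            continue  # do not pull possibly stale records
        for key, (rsn, value) in node.db.records.items():
            # Keep the copy with the highest sequence number
            if key not in merged or rsn > merged[key][0]:
                merged[key] = (rsn, value)
    # Assumption for this sketch: pushing the merged result to every
    # node makes its database valid again.
    for node in nodes:
        node.db.records = dict(merged)
        node.db.invalid = False
    return merged
```

In this model, if a record exists on two nodes, one node becomes inactive, and the record is then deleted on the remaining node, `recover` returns an empty database. Without the `invalid` check, the stale copy on the previously inactive node would be merged back in.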
Comment 1 Martin Schwenke 2018-10-08 22:40:22 UTC
Created attachment 14517
Patch for 4.9 and 4.8

Cherry-picked cleanly from master.  Patch applies cleanly to both 4.8 and 4.9.  Compiles, and the new test passes, in both branches.
Comment 2 Amitay Isaacs 2018-10-08 23:54:44 UTC
Hi Karolin,

This is ready for v4-8 and v4-9.

Thanks.
Comment 3 Karolin Seeger 2018-10-09 09:29:20 UTC
(In reply to Amitay Isaacs from comment #2)
Pushed to autobuild-v4-{8,9}-test.
Comment 4 Karolin Seeger 2018-10-11 10:15:03 UTC
(In reply to Karolin Seeger from comment #3)
Pushed to both branches.
Closing out bug report.

Thanks!