Bug 14466 fixed a ctdb disable/enable race but left another behind.
The recovery daemon (node A) pulls node B's flags from B itself and pushes them out to all nodes, but only when they differ from A's own copy. If A's copy already matches then no push occurs, so a third node (C) that holds different flags for B keeps its stale copy. The recovery daemon's main loop then sanity-checks C's potentially stale copy of B's flags against its own copy of B's flags. If they differ, it can push the stale flags back out, overwriting the correct ones and cancelling a disable/enable.
The recovery daemon is better off not doing such a sanity check at all, and should simply depend on update_flags() doing its job.
This bug was referenced in samba master:
Created attachment 16278 [details]
Patch for 4.13
Patch from master cherry-picks cleanly into v4-13-test. Successfully regression tested, no surprises.
Created attachment 16279 [details]
Patch for 4.12
Patch from master cherry-picks cleanly into v4-12-test - this is actually the same patch as for v4-13-test. Successfully regression tested, no surprises.
Created attachment 16280 [details]
Patch for 4.11
I'm not sure if there will be another bug-fix release for 4.11. If not, please ignore this.
The final commit from master that strengthens the relevant testcase does not apply cleanly to v4-11-test so I have dropped it from this "backport", since it mostly exists to protect us from potential breakage from future changes.
The other 2 commits for the code changes apply cleanly. I did a smoke test (ran the relevant testcase against local daemons) and it passes as expected.
This is ready for v4-11 (if we are taking bug fixes), v4-12 and v4-13.