There are 2 known remaining races:

* When a node starts, a remote node remains incorrectly marked unhealthy

  Node A is starting and node B is healthy. A connects to the recovery master (neither A nor B) and gets a flag update for B. A then connects to B and marks B unhealthy. This is never fixed.

  3 issues:

  - Remote node is marked unhealthy upon connect. This is unnecessary because this is already done at startup and disconnect.

  - Node accepts a flag update for a node it isn't connected to - unnecessary.

  - Although the recovery master correctly pushes flag changes, it doesn't always update nodes that have incorrect flags for a remote node. This is due to a regression in the fix for bug 14513.

* Node disable/enable races still exist

  Although the race window is vanishingly small, it still exists. This was confirmed when fixing the above bug.
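To make the ordering in the first race concrete, here is a minimal, self-contained model of it. This is only a sketch and not CTDB code: the struct, the function names, the flag value and the node number are all hypothetical, and the "fixed" path simply encodes the three issues listed above (no marking unhealthy on connect, ignore updates for unconnected nodes, recovery master pushes again when a node's view is wrong).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names and values, for illustration only. */
#define MAX_NODES      4
#define FLAG_UNHEALTHY 0x2u
#define NODE_B         1

/* Node A's local view of the other nodes. */
struct node_view {
	bool connected[MAX_NODES];
	uint32_t flags[MAX_NODES];
};

/* At startup every remote node is disconnected and marked unhealthy. */
static void view_init(struct node_view *v)
{
	for (int i = 0; i < MAX_NODES; i++) {
		v->connected[i] = false;
		v->flags[i] = FLAG_UNHEALTHY;
	}
}

/* Old behaviour: connecting marks the remote node unhealthy again. */
static void on_connect_buggy(struct node_view *v, int pnn)
{
	assert(pnn >= 0 && pnn < MAX_NODES);
	v->connected[pnn] = true;
	v->flags[pnn] |= FLAG_UNHEALTHY;
}

/* Assumed fix: connecting leaves the flags alone. */
static void on_connect_fixed(struct node_view *v, int pnn)
{
	assert(pnn >= 0 && pnn < MAX_NODES);
	v->connected[pnn] = true;
}

/* Old behaviour: accept a flag update for any node. */
static void on_flag_update_buggy(struct node_view *v, int pnn, uint32_t flags)
{
	assert(pnn >= 0 && pnn < MAX_NODES);
	v->flags[pnn] = flags;
}

/* Assumed fix: ignore updates for nodes we are not connected to. */
static bool on_flag_update_fixed(struct node_view *v, int pnn, uint32_t flags)
{
	assert(pnn >= 0 && pnn < MAX_NODES);
	if (!v->connected[pnn]) {
		return false;
	}
	v->flags[pnn] = flags;
	return true;
}

/* Replay the race with the old behaviour; returns A's final flags for B. */
uint32_t replay_buggy(void)
{
	struct node_view a;

	view_init(&a);
	on_flag_update_buggy(&a, NODE_B, 0); /* recovery master: B is healthy */
	on_connect_buggy(&a, NODE_B);        /* A connects to B, re-marks it */
	/* The recovery master never pushes B's flags to A again. */
	return a.flags[NODE_B];
}

/* Replay the same ordering with the assumed fixes applied. */
uint32_t replay_fixed(void)
{
	struct node_view a;

	view_init(&a);
	on_flag_update_fixed(&a, NODE_B, 0); /* ignored: B is not connected yet */
	on_connect_fixed(&a, NODE_B);        /* flags untouched on connect */
	on_flag_update_fixed(&a, NODE_B, 0); /* recovery master pushes again */
	return a.flags[NODE_B];
}
```

In this model `replay_buggy()` ends with B still flagged unhealthy in A's view, while `replay_fixed()` ends with B's flags cleared, but only because the recovery master pushes the flags a second time after A has connected to B. That second push corresponds to the third issue in the list.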
This bug was referenced in samba master: 82a075d4d734588a42fca7ebaf529892d1eba853 620d07871420cdbfa055c1ace75ec1ac4c32721d 8305f6a7f132f03b0bbdb26692b7491fd3f6c24f 49dc5d8cd2d3767044ac69cbd25c8210d11cadf7 6845dca87e6ffc5e449fb78d23eb9c7a22698b80 e0a7b5a9e866452b1faaed86a105492fe7b237e2 1ac7bc7532b2fad791d0e53effa7c64cdc73c4eb 60c1ef146538d90f97b7823459f7548ca5fa6dd3 15a6489c288b3adb635a728cb2049621ab1a07f7 6fe6a54e7f32e650be6ab36041159081dbde5165 5914054698dab934fd4db5efb9d211b2fdc40bb9 eec44e286250a6ee7b5c42d85d632bdc300a409f b6d25d079e30919457cacbfbbfd670bf88295a9c 0132bd5a2233193256af434a37506f86ed62c075 e75256767fffc6a7ac0b97e58737a39c63c8b187 916c5ee131dc5c7f1d9c3540147d1f915c8302ad ae10a8a4b70e53ea3be6257d1f86f2d9a56aa62a 7f697b1938efb3972f03f25546bf807d5af9a26c 9e7d2d9794af7251c42cb22f23ee9f86c6ea05c1
Created attachment 16799 [details] Patch for v4-15-test
Created attachment 16800 [details] Patch for v4-14-test
Created attachment 16801 [details] Patch for v4-13-test
Commits from master cherry-pick cleanly into all branches. The patches for v4-15-test and v4-14-test are identical: the original v4-15-test version applies cleanly to v4-14-test. The v4-15-test patch didn't apply to v4-13-test, but the commits cherry-picked there without issue (with some line-number fuzz, I think). So, 3 patches for clarity. I ran the CTDB local tests on all branches and all tests passed. I don't currently have a way of doing virtual cluster tests because of a KVM kernel bug on my test machine.
Hi Jule, This is ready for v4-15, v4-14 and v4-13. Thanks.
Pushed to autobuild-v4-{15,14,13}-test.
This bug was referenced in samba v4-15-test: f8fa33ac320a22dcac34f09bbea35af1aa804dfc 2cc4b917f78340f09e6d55efb0af97958c07fae3 c01d48d7a542bfb0b319fb18d0eea51b232ea62a 84a285851d7fea7843667e67ef317995e6c54bc5 675d68caabc59b5b47b744157173b4fc9476e32e 65d64194b6db3304a40585c8cb95f43e31c4222c c61b5e7b4890a96f3ea309017d9cbe8ce8e017fa b5f8913f359c24105e85c49fb0b1e476d0c2f353 8ed5910b8471c61149ddbc37c0aef8837d8a7029 772126bd68b1deb56c0b48e3c8b8530993cb866d 9f06ec8b108178ebd2c8d1e1fab9331383e30a52 e634ddde5e6518ecd9e5bcf36b210bb6f16e89a6 05d2f5e41c7a3e426c1be7bbe45913ef21c77728 17e0a052da07207ad063383fb1913794c12460a6 c8a9f9147c2215b14d9b666954948b592b646b12 f340dcbc675ec0efecaccf3a3258435dde85dd51 665b380d2490f312c7409a3c9d29572ad3664216 7c353e6e383b408de9d2823b32ff8e0527510d02 8d4c482410c4de451d26ce004247e9cc10aea832
This bug was referenced in samba v4-15-stable (Release samba-4.15.0rc7): f8fa33ac320a22dcac34f09bbea35af1aa804dfc 2cc4b917f78340f09e6d55efb0af97958c07fae3 c01d48d7a542bfb0b319fb18d0eea51b232ea62a 84a285851d7fea7843667e67ef317995e6c54bc5 675d68caabc59b5b47b744157173b4fc9476e32e 65d64194b6db3304a40585c8cb95f43e31c4222c c61b5e7b4890a96f3ea309017d9cbe8ce8e017fa b5f8913f359c24105e85c49fb0b1e476d0c2f353 8ed5910b8471c61149ddbc37c0aef8837d8a7029 772126bd68b1deb56c0b48e3c8b8530993cb866d 9f06ec8b108178ebd2c8d1e1fab9331383e30a52 e634ddde5e6518ecd9e5bcf36b210bb6f16e89a6 05d2f5e41c7a3e426c1be7bbe45913ef21c77728 17e0a052da07207ad063383fb1913794c12460a6 c8a9f9147c2215b14d9b666954948b592b646b12 f340dcbc675ec0efecaccf3a3258435dde85dd51 665b380d2490f312c7409a3c9d29572ad3664216 7c353e6e383b408de9d2823b32ff8e0527510d02 8d4c482410c4de451d26ce004247e9cc10aea832
This bug was referenced in samba v4-13-test: 76f8dffb527caa5e12a9a4922f4315bf8a5d2ac5 e93c885426dd1ad3e13750deda634c90e08bb2e5 74aa5b204e2e20b594b093342578151ab7cc3f9f ac8bbe2d0aeb5ed18816c0fabc125bef5ff609b0 3d797b570b024c5b490664fff3580bd54e39270d e3578ea22cb5dcd2bbba3d96fb9eeac52da55be9 65f9b5520d20ee404ffca87b282773fe171fe3d8 7aac8fd9e5e6ebd404f8eb7d568e5b3d7e11fa8b ce58aefb4ee23df9a1d8461e1ca3c55f43aa5889 75b8b5de3e835b5eeaeca7cc6100b1e538c88d9c c89f30810d3c036bbe8a0acc28b0d741ee2408be 85372296a7ed90f9873261a4e4ad5c6fb518c502 3d2313dc906b1794d1cc3235fcd10c7b6ea5d874 c4d7ed5eac4ddd971181af13f5ca32c443f0a79a 7c4daa7ffa05c2fb6ef710ba107cdb47a0e57811 3ab6be4f7bc672c719ea6891736ecc6448bab1be cc3ce341ee17d46bc8461b8628641d9f7c0c033c 479fc4fee0c78dd8e6fcab929480d08ec5ccfba2 cea68cbf537b6d44eb199126dc2ccf97fd3fff55
This bug was referenced in samba v4-14-test: 69f744e539f5be3123bef0ac9cf6dff84cb1779f c1e217c0e2ecff8c8005f2a225193884eb4c3fae c61fe558427bd532e9291a255528d45cd83c8393 88660d4e2f8efa137e9d5a99682b6060bcdb98eb 79961f5a33a43556d79fbafebbefb2baea8c1079 50596cf0029d2b027d537832bea8ca23cd4ccfc0 116db8d54f8a4b792c759e481571e384e32d7a82 e158aa6d9bd4eac72c5f529e51bff2a6ae3a1263 cb64c64ddb34512d3e347e99f197a299cd02a91a c8d130f139ad3da7880f2ca4b15aa485684d0f0b 00c1757d92e6e17d8c9e2ea6170e50a390e17c72 c906c9a0b393b98e2f914135bdf92cfe17e5b18a cfbac3b5ab942457f3d2aae8451bbe835a8d0648 e3eeffafff84b3b447d38ae03efa7dea9a91d199 eab3ee12fe01f9fc814e0fd92b28d13dd62c9bf1 a7ea1ab3e6a32cf1d6a6012f95ef5db7410ad78e 814844538aaf97aed54082b4d6b9e22b3fe9b220 2d6cf082db51cb5c2748d1cb893e2befc2ae56ef 551a39d890acb2405a1d1e011e56dc566e8a36f7
Closing out bug report. Thanks!
This bug was referenced in samba v4-13-stable (Release samba-4.13.12): 76f8dffb527caa5e12a9a4922f4315bf8a5d2ac5 e93c885426dd1ad3e13750deda634c90e08bb2e5 74aa5b204e2e20b594b093342578151ab7cc3f9f ac8bbe2d0aeb5ed18816c0fabc125bef5ff609b0 3d797b570b024c5b490664fff3580bd54e39270d e3578ea22cb5dcd2bbba3d96fb9eeac52da55be9 65f9b5520d20ee404ffca87b282773fe171fe3d8 7aac8fd9e5e6ebd404f8eb7d568e5b3d7e11fa8b ce58aefb4ee23df9a1d8461e1ca3c55f43aa5889 75b8b5de3e835b5eeaeca7cc6100b1e538c88d9c c89f30810d3c036bbe8a0acc28b0d741ee2408be 85372296a7ed90f9873261a4e4ad5c6fb518c502 3d2313dc906b1794d1cc3235fcd10c7b6ea5d874 c4d7ed5eac4ddd971181af13f5ca32c443f0a79a 7c4daa7ffa05c2fb6ef710ba107cdb47a0e57811 3ab6be4f7bc672c719ea6891736ecc6448bab1be cc3ce341ee17d46bc8461b8628641d9f7c0c033c 479fc4fee0c78dd8e6fcab929480d08ec5ccfba2 cea68cbf537b6d44eb199126dc2ccf97fd3fff55
This bug was referenced in samba v4-14-stable (Release samba-4.14.8): 69f744e539f5be3123bef0ac9cf6dff84cb1779f c1e217c0e2ecff8c8005f2a225193884eb4c3fae c61fe558427bd532e9291a255528d45cd83c8393 88660d4e2f8efa137e9d5a99682b6060bcdb98eb 79961f5a33a43556d79fbafebbefb2baea8c1079 50596cf0029d2b027d537832bea8ca23cd4ccfc0 116db8d54f8a4b792c759e481571e384e32d7a82 e158aa6d9bd4eac72c5f529e51bff2a6ae3a1263 cb64c64ddb34512d3e347e99f197a299cd02a91a c8d130f139ad3da7880f2ca4b15aa485684d0f0b 00c1757d92e6e17d8c9e2ea6170e50a390e17c72 c906c9a0b393b98e2f914135bdf92cfe17e5b18a cfbac3b5ab942457f3d2aae8451bbe835a8d0648 e3eeffafff84b3b447d38ae03efa7dea9a91d199 eab3ee12fe01f9fc814e0fd92b28d13dd62c9bf1 a7ea1ab3e6a32cf1d6a6012f95ef5db7410ad78e 814844538aaf97aed54082b4d6b9e22b3fe9b220 2d6cf082db51cb5c2748d1cb893e2befc2ae56ef 551a39d890acb2405a1d1e011e56dc566e8a36f7