Bug 14784 - More CTDB flag update races
Summary: More CTDB flag update races
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: CTDB
Version: 4.15.0rc1
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Jule Anger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-08-06 02:17 UTC by Martin Schwenke
Modified: 2021-09-15 07:12 UTC

See Also:


Attachments
Patch for v4-15-test (44.59 KB, patch)
2021-09-09 11:53 UTC, Martin Schwenke
amitay: review+
Patch for v4-14-test (44.59 KB, patch)
2021-09-09 11:54 UTC, Martin Schwenke
amitay: review+
Patch for v4-13-test (44.59 KB, patch)
2021-09-09 11:55 UTC, Martin Schwenke
amitay: review+

Description Martin Schwenke 2021-08-06 02:17:21 UTC
There are 2 known remaining races:

* When a node starts, a remote node remains incorrectly marked unhealthy

  Node A is starting and node B is healthy.  A connects to the recovery
  master (a third node, neither A nor B) and gets a flag update for B.  A
  then connects to B and marks B unhealthy.  Nothing corrects this
  afterwards, so A keeps B marked unhealthy.

  3 issues:

  - Remote node is marked unhealthy upon connect.  This is unnecessary
    because this is already done at startup and disconnect.

  - Node accepts a flag update for a node it isn't connected to - unnecessary.

  - Although the recovery master correctly pushes flag changes, it doesn't
    always update nodes that have incorrect flags for a remote node.  This
    is due to a regression in the fix for bug 14513.

* Node disable/enable races still exist

  Although the race window is vanishingly small, it still exists.  This was
  confirmed when fixing the above bug.
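
The first race can be replayed with a toy model.  This is only a sketch
of the sequence described above, not CTDB code: the Node class, the
"fixed" switch, and run_scenario() are hypothetical names invented for
illustration, and the recovery master's behaviour is simplified to
"only pushes when its own view changes".  How the actual commits address
the three issues is not shown here.

```python
from dataclasses import dataclass, field

HEALTHY = "HEALTHY"
UNHEALTHY = "UNHEALTHY"


@dataclass
class Node:
    """Toy model of one node's view of its peers (not CTDB code)."""
    name: str
    fixed: bool = False                      # apply the hypothetical fixes?
    connected: set = field(default_factory=set)
    flags: dict = field(default_factory=dict)

    def start(self, peers):
        # At startup every remote node is marked unhealthy.
        for peer in peers:
            self.flags[peer] = UNHEALTHY

    def connect(self, peer):
        self.connected.add(peer)
        if not self.fixed:
            # Issue 1: redundant marking on connect clobbers a good flag.
            self.flags[peer] = UNHEALTHY

    def receive_flag_update(self, about, flag):
        if self.fixed and about not in self.connected:
            # Issue 2: ignore updates about nodes we aren't connected to.
            return
        self.flags[about] = flag


def run_scenario(fixed):
    """Replay the startup sequence from the report; return A's flag for B."""
    master_view = {"B": HEALTHY}             # recovery master's view of B
    a = Node("A", fixed=fixed)
    a.start(peers=["R", "B"])

    a.connect("R")                           # A reaches the recovery master
    a.receive_flag_update("B", master_view["B"])
    a.connect("B")                           # A then reaches B

    # Issue 3 (simplified): the unfixed master only pushes when its own
    # view changes, which never happens here.  The fixed master also
    # compares A's view with its own and re-pushes on a mismatch.
    if fixed and a.flags["B"] != master_view["B"]:
        a.receive_flag_update("B", master_view["B"])
    return a.flags["B"]


if __name__ == "__main__":
    print("unfixed:", run_scenario(fixed=False))
    print("fixed:  ", run_scenario(fixed=True))
```

In this model the unfixed run leaves A with B marked UNHEALTHY, while the
run with all three hypothetical fixes ends with B marked HEALTHY.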
Comment 1 Samba QA Contact 2021-09-09 02:39:03 UTC
This bug was referenced in samba master:

82a075d4d734588a42fca7ebaf529892d1eba853
620d07871420cdbfa055c1ace75ec1ac4c32721d
8305f6a7f132f03b0bbdb26692b7491fd3f6c24f
49dc5d8cd2d3767044ac69cbd25c8210d11cadf7
6845dca87e6ffc5e449fb78d23eb9c7a22698b80
e0a7b5a9e866452b1faaed86a105492fe7b237e2
1ac7bc7532b2fad791d0e53effa7c64cdc73c4eb
60c1ef146538d90f97b7823459f7548ca5fa6dd3
15a6489c288b3adb635a728cb2049621ab1a07f7
6fe6a54e7f32e650be6ab36041159081dbde5165
5914054698dab934fd4db5efb9d211b2fdc40bb9
eec44e286250a6ee7b5c42d85d632bdc300a409f
b6d25d079e30919457cacbfbbfd670bf88295a9c
0132bd5a2233193256af434a37506f86ed62c075
e75256767fffc6a7ac0b97e58737a39c63c8b187
916c5ee131dc5c7f1d9c3540147d1f915c8302ad
ae10a8a4b70e53ea3be6257d1f86f2d9a56aa62a
7f697b1938efb3972f03f25546bf807d5af9a26c
9e7d2d9794af7251c42cb22f23ee9f86c6ea05c1
Comment 2 Martin Schwenke 2021-09-09 11:53:57 UTC
Created attachment 16799
Patch for v4-15-test
Comment 3 Martin Schwenke 2021-09-09 11:54:34 UTC
Created attachment 16800
Patch for v4-14-test
Comment 4 Martin Schwenke 2021-09-09 11:55:13 UTC
Created attachment 16801
Patch for v4-13-test
Comment 5 Martin Schwenke 2021-09-09 11:59:41 UTC
The commits from master cherry-pick cleanly into all branches.  The patches for v4-15-test and v4-14-test are identical: the original v4-15-test patch applies cleanly to v4-14-test.  The v4-15-test patch didn't apply to v4-13-test, but the commits cherry-picked there without issue (with some line-number fuzz, I think).  So there are 3 patches, for clarity.

I ran the CTDB local tests on all branches and all tests passed.  I don't currently have a way of doing virtual cluster tests because of a KVM kernel bug on my test machine.
Comment 6 Amitay Isaacs 2021-09-13 10:17:03 UTC
Hi Jule,

This is ready for v4-15, v4-14 and v4-13.

Thanks.
Comment 7 Jule Anger 2021-09-13 11:51:57 UTC
Pushed to autobuild-v4-{15,14,13}-test.
Comment 8 Samba QA Contact 2021-09-13 12:34:03 UTC
This bug was referenced in samba v4-15-test:

f8fa33ac320a22dcac34f09bbea35af1aa804dfc
2cc4b917f78340f09e6d55efb0af97958c07fae3
c01d48d7a542bfb0b319fb18d0eea51b232ea62a
84a285851d7fea7843667e67ef317995e6c54bc5
675d68caabc59b5b47b744157173b4fc9476e32e
65d64194b6db3304a40585c8cb95f43e31c4222c
c61b5e7b4890a96f3ea309017d9cbe8ce8e017fa
b5f8913f359c24105e85c49fb0b1e476d0c2f353
8ed5910b8471c61149ddbc37c0aef8837d8a7029
772126bd68b1deb56c0b48e3c8b8530993cb866d
9f06ec8b108178ebd2c8d1e1fab9331383e30a52
e634ddde5e6518ecd9e5bcf36b210bb6f16e89a6
05d2f5e41c7a3e426c1be7bbe45913ef21c77728
17e0a052da07207ad063383fb1913794c12460a6
c8a9f9147c2215b14d9b666954948b592b646b12
f340dcbc675ec0efecaccf3a3258435dde85dd51
665b380d2490f312c7409a3c9d29572ad3664216
7c353e6e383b408de9d2823b32ff8e0527510d02
8d4c482410c4de451d26ce004247e9cc10aea832
Comment 9 Samba QA Contact 2021-09-13 13:46:37 UTC
This bug was referenced in samba v4-15-stable (Release samba-4.15.0rc7):

f8fa33ac320a22dcac34f09bbea35af1aa804dfc
2cc4b917f78340f09e6d55efb0af97958c07fae3
c01d48d7a542bfb0b319fb18d0eea51b232ea62a
84a285851d7fea7843667e67ef317995e6c54bc5
675d68caabc59b5b47b744157173b4fc9476e32e
65d64194b6db3304a40585c8cb95f43e31c4222c
c61b5e7b4890a96f3ea309017d9cbe8ce8e017fa
b5f8913f359c24105e85c49fb0b1e476d0c2f353
8ed5910b8471c61149ddbc37c0aef8837d8a7029
772126bd68b1deb56c0b48e3c8b8530993cb866d
9f06ec8b108178ebd2c8d1e1fab9331383e30a52
e634ddde5e6518ecd9e5bcf36b210bb6f16e89a6
05d2f5e41c7a3e426c1be7bbe45913ef21c77728
17e0a052da07207ad063383fb1913794c12460a6
c8a9f9147c2215b14d9b666954948b592b646b12
f340dcbc675ec0efecaccf3a3258435dde85dd51
665b380d2490f312c7409a3c9d29572ad3664216
7c353e6e383b408de9d2823b32ff8e0527510d02
8d4c482410c4de451d26ce004247e9cc10aea832
Comment 10 Samba QA Contact 2021-09-13 14:13:03 UTC
This bug was referenced in samba v4-13-test:

76f8dffb527caa5e12a9a4922f4315bf8a5d2ac5
e93c885426dd1ad3e13750deda634c90e08bb2e5
74aa5b204e2e20b594b093342578151ab7cc3f9f
ac8bbe2d0aeb5ed18816c0fabc125bef5ff609b0
3d797b570b024c5b490664fff3580bd54e39270d
e3578ea22cb5dcd2bbba3d96fb9eeac52da55be9
65f9b5520d20ee404ffca87b282773fe171fe3d8
7aac8fd9e5e6ebd404f8eb7d568e5b3d7e11fa8b
ce58aefb4ee23df9a1d8461e1ca3c55f43aa5889
75b8b5de3e835b5eeaeca7cc6100b1e538c88d9c
c89f30810d3c036bbe8a0acc28b0d741ee2408be
85372296a7ed90f9873261a4e4ad5c6fb518c502
3d2313dc906b1794d1cc3235fcd10c7b6ea5d874
c4d7ed5eac4ddd971181af13f5ca32c443f0a79a
7c4daa7ffa05c2fb6ef710ba107cdb47a0e57811
3ab6be4f7bc672c719ea6891736ecc6448bab1be
cc3ce341ee17d46bc8461b8628641d9f7c0c033c
479fc4fee0c78dd8e6fcab929480d08ec5ccfba2
cea68cbf537b6d44eb199126dc2ccf97fd3fff55
Comment 11 Samba QA Contact 2021-09-14 07:38:12 UTC
This bug was referenced in samba v4-14-test:

69f744e539f5be3123bef0ac9cf6dff84cb1779f
c1e217c0e2ecff8c8005f2a225193884eb4c3fae
c61fe558427bd532e9291a255528d45cd83c8393
88660d4e2f8efa137e9d5a99682b6060bcdb98eb
79961f5a33a43556d79fbafebbefb2baea8c1079
50596cf0029d2b027d537832bea8ca23cd4ccfc0
116db8d54f8a4b792c759e481571e384e32d7a82
e158aa6d9bd4eac72c5f529e51bff2a6ae3a1263
cb64c64ddb34512d3e347e99f197a299cd02a91a
c8d130f139ad3da7880f2ca4b15aa485684d0f0b
00c1757d92e6e17d8c9e2ea6170e50a390e17c72
c906c9a0b393b98e2f914135bdf92cfe17e5b18a
cfbac3b5ab942457f3d2aae8451bbe835a8d0648
e3eeffafff84b3b447d38ae03efa7dea9a91d199
eab3ee12fe01f9fc814e0fd92b28d13dd62c9bf1
a7ea1ab3e6a32cf1d6a6012f95ef5db7410ad78e
814844538aaf97aed54082b4d6b9e22b3fe9b220
2d6cf082db51cb5c2748d1cb893e2befc2ae56ef
551a39d890acb2405a1d1e011e56dc566e8a36f7
Comment 12 Jule Anger 2021-09-15 07:12:40 UTC
Closing out bug report.

Thanks!