Bug 15981 - Invalid recovery lock due to the shared directory ever unavailable
Status: NEEDINFO
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: CTDB
Version: 4.12.7
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Martin Schwenke
QA Contact: Samba QA Contact
Reported: 2026-01-23 03:49 UTC by Peng Sun
Modified: 2026-01-25 04:18 UTC

Description Peng Sun 2026-01-23 03:49:53 UTC
If the shared directory holding the recovery lock file becomes unavailable while the CTDB cluster is recovering, and then becomes available again, the cluster enters a recovery loop.

Steps to reproduce:
0. Set up a CTDB cluster of three or more nodes.
1. Kill one ctdbd process.
2. Unmount (umount) the shared directory on the recovery master node.
3. Remount it immediately.
Comment 1 Martin Schwenke 2026-01-25 04:18:27 UTC
Samba 4.12.7 is no longer supported.  Please only report bugs against supported versions.

However...

The behaviour of the cluster lock has changed a lot since 4.12.

Please see the Cluster Lock section in https://ctdb.samba.org/manpages/ctdb.7.html.  I wonder if specifying the recheck interval (e.g. 5s) will help.  Here is an example:

[cluster]
  cluster lock = !/usr/libexec/ctdb/ctdb_mutex_fcntl_helper /clusterfs/.cluster_lock 5

Please try something like this with a supported version.

Thanks...
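
For readers unfamiliar with the idea of a recheck interval, here is a rough Python sketch of the general technique: a process takes an fcntl lock on a file, remembers which file it locked (device and inode), and periodically confirms that the path still names that same file. This is only an illustration under that assumption, not the implementation of ctdb_mutex_fcntl_helper; the function names and the max_checks parameter are hypothetical. Whether such a check catches the unmount/remount case in this report depends on the filesystem and on timing relative to the interval.

```python
import fcntl
import os
import time


def take_lock(path):
    """Open path and take an exclusive, non-blocking fcntl lock on it.

    Returns (fd, identity), where identity is the (device, inode) pair
    of the file that was actually locked. (Hypothetical helper.)
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    st = os.fstat(fd)
    return fd, (st.st_dev, st.st_ino)


def lock_still_valid(path, identity):
    """Return True if path still names the file the lock was taken on."""
    try:
        st = os.stat(path)
    except OSError:
        # Path is gone or unreachable, e.g. the filesystem was unmounted.
        return False
    return (st.st_dev, st.st_ino) == identity


def hold_lock(path, recheck_interval, max_checks=None):
    """Hold the lock until a recheck fails.

    Returns the number of rechecks that passed. max_checks is a
    hypothetical knob that lets a demonstration stop early.
    """
    fd, identity = take_lock(path)
    passed = 0
    try:
        while max_checks is None or passed < max_checks:
            time.sleep(recheck_interval)
            if not lock_still_valid(path, identity):
                break
            passed += 1
    finally:
        os.close(fd)  # closing the descriptor releases the fcntl lock
    return passed
```

In this sketch a shorter interval notices a vanished or replaced lock file sooner, at the cost of more frequent stat calls on the shared filesystem.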