Bug 7768 - Possibility to set node priority for IP takeover/failback
Summary: Possibility to set node priority for IP takeover/failback
Status: NEW
Alias: None
Product: CTDB 2.5.x or older
Classification: Unclassified
Component: ctdb
Version: unspecified
Hardware: All
OS: Linux
Importance: P3 enhancement
Target Milestone: ---
Assignee: Michael Adam
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-11-01 15:38 UTC by Michal Strnad
Modified: 2011-06-24 07:07 UTC
CC List: 3 users

See Also:


Attachments

Description Michal Strnad 2010-11-01 15:38:18 UTC
Hello all,

I would like to propose a nice-to-have option for CTDB that would allow configuring node priorities for IP takeover when a cluster node fails.

In our scenario we have 4 nodes in a GPFS cluster geographically spread across two locations.
Two of these nodes (one in each location) are more powerful and are therefore expected to be "primary" nodes serving clients with a pair of public IP addresses.
The other two nodes serve as "backup" nodes. They are members of the underlying GPFS cluster, perform some maintenance tasks (backups, ILM migrations, etc.), and should take over the CTDB public addresses only if both primary nodes fail. If only one primary node fails, its IP address should be moved to the other primary node rather than to either of the backup nodes. As far as I know, there is currently no clear way to achieve this behaviour.

The above could be implemented relatively easily using a weight/priority factor, specified in /etc/ctdb/nodes or elsewhere, that would be taken into account during recovery.
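
For illustration only, a weighted nodes file might look something like the following. The weight column is proposed syntax (CTDB does not currently parse anything beyond the node address per line), and the internal addresses and weight values are made up; a higher weight would mean the node is preferred when public addresses are reassigned:

192.168.1.1 weight=100   # primary, location A
192.168.1.2 weight=100   # primary, location B
192.168.1.3 weight=10    # backup, location A
192.168.1.4 weight=10    # backup, location B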
Comment 1 Michal Strnad 2010-11-02 00:54:47 UTC
Another option would be to specify, for each entry in /etc/ctdb/public_addresses, a list of cluster nodes the IP address may reside on, in order of preference.
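
As a sketch of what that might look like, each line of /etc/ctdb/public_addresses could carry an optional ordered node list. The nodes= field is invented syntax for this proposal, and the node numbers are assumed PNNs with the primaries listed first:

10.0.0.1 eth0 nodes=0,1,2,3
10.0.0.2 eth0 nodes=1,0,3,2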
Comment 2 Luk Claes (dead mail address) 2011-05-13 21:33:02 UTC
You can list more public addresses on the nodes that are more powerful; the obvious drawback is that not all public addresses would be taken over if all the powerful nodes fail.

So for instance:

the powerful nodes' /etc/ctdb/public_addresses:
10.0.0.1 eth0
10.0.0.2 eth0
10.0.0.3 eth0
10.0.0.4 eth0

the "backup" nodes' /etc/ctdb/public_addresses:
10.0.0.1 eth0
10.0.0.2 eth0
Comment 3 Michal Strnad 2011-05-14 06:20:51 UTC
(In reply to comment #2)

That would help spread the load according to the nodes' performance, but it does not solve the problem: we want to avoid clients using the "backup" nodes as long as any "primary" node is available.

As a workaround, we start the "backup" nodes in the disabled state and are working on a monitor script that checks the health of both primary nodes and enables the backup nodes if both primaries become unhealthy.
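
A minimal sketch of such a script, run periodically (e.g. from cron) on each backup node, might look like the following. The primary nodes' PNNs (0 and 1) are assumptions, and the parsing assumes "ctdb status" prints one "pnn:<N> <address> <flags>" line per node; only the standard ctdb status/enable/disable commands are used:

#!/bin/sh
# Enable this backup node only when both primary nodes are unhealthy,
# otherwise keep it administratively disabled.

PRIMARIES="0 1"        # assumed PNNs of the primary nodes

healthy=0
for pnn in $PRIMARIES; do
    # take the flags column of the matching "pnn:<N> ..." line
    flags=$(ctdb status | awk -v n="pnn:$pnn" '$1 == n { print $3 }')
    [ "$flags" = "OK" ] && healthy=$((healthy + 1))
done

if [ "$healthy" -eq 0 ]; then
    ctdb enable     # both primaries down: let this node take over public IPs
else
    ctdb disable    # at least one primary is healthy: stay out of the rotation
fi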