A strange bug is preventing one of my nodes (10.0.20.21) from accessing the public IP (10.0.20.30) of the CTDB cluster. The ARP request for the public address appears to be ignored. The other two nodes have the correct ARP entry.

The three nodes are identical and were built following the instructions found here: http://community.redhat.com/blog/2014/11/up-and-running-with-ovirt-3-5-part-two/ (with minor changes to test oVirt 3.5.2 on CentOS 7.1)

_ISSUE_

# ip -s neighbour list
10.0.20.22 dev ovirtmgmt lladdr c0:3f:d5:63:83:fa ref 1 used 4025/0/4025 probes 4 REACHABLE
10.0.20.23 dev ovirtmgmt lladdr c0:3f:d5:64:0a:0b ref 1 used 4025/0/30 probes 4 REACHABLE
10.0.20.30 dev ovirtmgmt  used 3206/3245/3203 probes 6 FAILED

_ENVIRONMENT_

# ctdb status
Number of nodes:3
pnn:0 10.0.20.21 OK (THIS NODE)
pnn:1 10.0.20.22 OK
pnn:2 10.0.20.23 OK
...

# cat /etc/centos-release
CentOS Linux release 7.1.1503 (Core)

# ctdb version
CTDB version: 2.5.1

_TROUBLESHOOTING_

- SELinux and firewalld have been disabled on all three nodes during testing.
- I have rotated the public address across the three nodes, with the same result (even when the address is hosted on this troublesome host).
- # tcpdump -i ovirtmgmt -vv -nn arp
  tcpdump: listening on ovirtmgmt, link-type EN10MB (Ethernet), capture size 65535 bytes
  19:51:43.225065 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.20.30 tell 10.0.20.21, length 28
  19:51:44.227060 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.20.30 tell 10.0.20.21, length 28
  19:51:52.223422 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.20.30 tell 10.0.20.21, length 28
  19:51:53.225062 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.20.30 tell 10.0.20.21, length 28
  19:51:54.227080 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.20.30 tell 10.0.20.21, length 28
- # ip -s neighbour flush dev ovirtmgmt
- # ip neighbour delete 10.0.20.30 dev ovirtmgmt
- # shutdown -r now

Any advice would be great.
It looks like you are using the same subnet for CTDB's management IP addresses (10.0.20.21/22/23) and its public IP address (10.0.20.30). Usually the management network should be separate from the public IP network. Are you using the same Ethernet interface for both, or different Ethernet interfaces?

Some more information about your CTDB configuration would be useful:
- /etc/ctdb/nodes
- public addresses file
- CTDB configuration
The required information was not received from the submitter, so closing as invalid.

The method for resolving this would be:

1. Use "ctdb ip" to confirm that CTDB thinks it is hosting the address.
2. Use "ip addr show" to confirm that the address is actually on the expected network interface.

If both of these are as expected then there's nothing CTDB can do. The network stack should respond to ARPs.

If one or both are not as expected then we would need to look at logs and do other debugging to determine why...
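
The two checks above could be scripted roughly as follows. This is only a sketch under stated assumptions: the address and interface are taken from the original report, the helper names hosting_node and iface_has_ip are made up for illustration, and the parsing assumes that "ctdb ip" prints one "<address> <node>" pair per line and that "ip -o -4 addr show" prints the address in the fourth column. Verify both formats against your installed versions before relying on it.

```shell
#!/bin/sh
# Sketch only: PUBLIC_IP and IFACE come from the original report;
# the helper names below are hypothetical.
PUBLIC_IP="10.0.20.30"
IFACE="ovirtmgmt"

# Reads "ctdb ip"-style output on stdin and prints the node number listed
# next to the given address. Assumes two columns: "<address> <node>".
hosting_node() {
    awk -v ip="$1" '$1 == ip { print $2 }'
}

# Reads "ip -o -4 addr show"-style output on stdin and succeeds only if the
# given address is configured (the prefix length is ignored).
iface_has_ip() {
    awk -v ip="$1" '
        $3 == "inet" { split($4, a, "/"); if (a[1] == ip) found = 1 }
        END { exit !found }
    '
}

# Only run the live checks where the ctdb tool is installed.
if command -v ctdb >/dev/null 2>&1; then
    echo "ctdb ip: $PUBLIC_IP is listed on node $(ctdb ip | hosting_node "$PUBLIC_IP")"
    if ip -o -4 addr show dev "$IFACE" | iface_has_ip "$PUBLIC_IP"; then
        echo "$PUBLIC_IP is configured on $IFACE"
    else
        echo "$PUBLIC_IP is NOT configured on $IFACE"
    fi
fi
```

The node number printed by the first check can be compared with the pnn values shown by "ctdb status" to see which host should be answering ARP requests for the address. The second check is then only meaningful when run on that host.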