2015/03/11 13:52:40.571327 [16667]: 172.16.223.60:4379: node 172.16.223.61:4379 is dead: 0 connected
2015/03/11 13:52:40.571367 [16667]: Tearing down connection to dead node :1
2015/03/11 13:52:41.104273 [recoverd:16839]: server/ctdb_recoverd.c:3960 The vnnmap count is different from the number of active lmaster nodes: 2 vs 1
2015/03/11 13:52:41.104314 [recoverd:16839]: server/ctdb_recoverd.c:1765 Starting do_recovery
2015/03/11 13:52:41.104320 [recoverd:16839]: Taking out recovery lock from recovery daemon
2015/03/11 13:52:41.104325 [recoverd:16839]: Take the recovery lock
2015/03/11 13:52:41.104510 [recoverd:16839]: Recovery lock taken successfully
2015/03/11 13:52:41.104521 [recoverd:16839]: ctdb_recovery_lock: Got recovery lock on '/users/local/ctdb.lock'
2015/03/11 13:52:41.104582 [recoverd:16839]: Recovery lock taken successfully by recovery daemon
2015/03/11 13:52:41.104592 [recoverd:16839]: server/ctdb_recoverd.c:1790 Recovery initiated due to problem with node 0
2015/03/11 13:52:41.104646 [recoverd:16839]: server/ctdb_recoverd.c:1815 Recovery - created remote databases
2015/03/11 13:52:41.105334 [recoverd:16839]: server/ctdb_recoverd.c:1822 Recovery - updated db priority for all databases
2015/03/11 13:52:41.105382 [16667]: Freeze priority 1
2015/03/11 13:52:51.106125 [16667]: Unable to get ALLDB locks for 10 seconds
2015/03/11 13:52:52.005566 [16667]: 172.16.223.60:4379: connected to 172.16.223.61:4379 - 1 connected
2015/03/11 13:52:52.611872 [16667]: Skip monitoring since databases are frozen
2015/03/11 13:52:53.311494 [recoverd:16839]: server/ctdb_recoverd.c:1139 Election timed out
2015/03/11 13:52:56.816067 [recoverd:16839]: server/ctdb_recoverd.c:1139 Election timed out
2015/03/11 13:53:01.106923 [16667]: Unable to get ALLDB locks for 20 seconds
2015/03/11 13:53:07.612334 [16667]: Skip monitoring since databases are frozen
2015/03/11 13:53:11.108204 [16667]: Unable to get ALLDB locks for 30 seconds
2015/03/11 13:53:21.108581 [16667]: Unable to get ALLDB locks for 40 seconds
2015/03/11 13:53:22.612606 [16667]: Skip monitoring since databases are frozen
2015/03/11 13:53:31.109680 [16667]: Unable to get ALLDB locks for 50 seconds
2015/03/11 13:53:37.613527 [16667]: Skip monitoring since databases are frozen
2015/03/11 13:53:41.104076 [16667]: Recovery daemon ping timeout. Count : 0
2015/03/11 13:53:41.106207 [recoverd:16839]: ctdb_control error: 'ctdb_control timed out'
2015/03/11 13:53:41.106223 [recoverd:16839]: ctdb_control error: 'ctdb_control timed out'
2015/03/11 13:53:41.106262 [recoverd:16839]: Async operation failed with ret=-1 res=-1 opcode=33
2015/03/11 13:53:41.106272 [recoverd:16839]: Failed to freeze node 0 during recovery. Set it as ban culprit for 2 credits
2015/03/11 13:53:41.106281 [recoverd:16839]: Async wait failed - fail_count=1
2015/03/11 13:53:41.106287 [recoverd:16839]: server/ctdb_recoverd.c:395 Unable to freeze nodes. Recovery failed.
2015/03/11 13:53:41.106292 [recoverd:16839]: server/ctdb_recoverd.c:1833 Unable to set recovery mode to active on cluster
2015/03/11 13:53:41.107767 [recoverd:16839]: server/ctdb_recoverd.c:1765 Starting do_recovery
2015/03/11 13:53:41.107780 [recoverd:16839]: Taking out recovery lock from recovery daemon
2015/03/11 13:53:41.107787 [recoverd:16839]: Take the recovery lock
2015/03/11 13:53:41.108131 [recoverd:16839]: Recovery lock taken successfully
2015/03/11 13:53:41.108146 [recoverd:16839]: ctdb_recovery_lock: Got recovery lock on '/users/local/ctdb.lock'
2015/03/11 13:53:41.108205 [recoverd:16839]: Recovery lock taken successfully by recovery daemon
2015/03/11 13:53:41.108215 [recoverd:16839]: server/ctdb_recoverd.c:1790 Recovery initiated due to problem with node 0
2015/03/11 13:53:41.110943 [16667]: Unable to get ALLDB locks for 60 seconds
2015/03/11 13:53:41.133244 [recoverd:16839]: server/ctdb_recoverd.c:1815 Recovery - created remote databases
2015/03/11 13:53:41.137952 [recoverd:16839]: server/ctdb_recoverd.c:1822 Recovery - updated db priority for all databases
2015/03/11 13:53:41.138251 [16667]: Freeze priority 1
2015/03/11 13:53:51.111205 [16667]: Unable to get ALLDB locks for 70 seconds
2015/03/11 13:53:52.614432 [16667]: Skip monitoring since databases are frozen
2015/03/11 13:54:01.111639 [16667]: Unable to get ALLDB locks for 80 seconds
2015/03/11 13:54:07.614943 [16667]: Skip monitoring since databases are frozen
2015/03/11 13:54:11.112299 [16667]: Unable to get ALLDB locks for 90 seconds
2015/03/11 13:54:21.113215 [16667]: Unable to get ALLDB locks for 100 seconds
2015/03/11 13:54:22.615444 [16667]: Skip monitoring since databases are frozen
2015/03/11 13:54:31.114016 [16667]: Unable to get ALLDB locks for 110 seconds
2015/03/11 13:54:37.616152 [16667]: Skip monitoring since databases are frozen
2015/03/11 13:54:41.106864 [16667]: Recovery daemon ping timeout. Count : 0
2015/03/11 13:54:41.114980 [16667]: Unable to get ALLDB locks for 120 seconds
2015/03/11 13:54:41.138963 [recoverd:16839]: ctdb_control error: 'ctdb_control timed out'
2015/03/11 13:54:41.138985 [recoverd:16839]: ctdb_control error: 'ctdb_control timed out'
2015/03/11 13:54:41.138993 [recoverd:16839]: Async operation failed with ret=-1 res=-1 opcode=33
2015/03/11 13:54:41.138998 [recoverd:16839]: Failed to freeze node 0 during recovery. Set it as ban culprit for 2 credits
2015/03/11 13:54:41.139005 [recoverd:16839]: Async wait failed - fail_count=1
2015/03/11 13:54:41.139010 [recoverd:16839]: server/ctdb_recoverd.c:395 Unable to freeze nodes. Recovery failed.
2015/03/11 13:54:41.139037 [recoverd:16839]: server/ctdb_recoverd.c:1833 Unable to set recovery mode to active on cluster
2015/03/11 13:54:41.139245 [recoverd:16839]: Node 0 reached 5 banning credits - banning it for 300 seconds
2015/03/11 13:54:41.139255 [recoverd:16839]: Banning node 0 for 300 seconds
2015/03/11 13:54:41.139282 [16667]: Banning this node for 300 seconds
2015/03/11 13:54:41.139292 [16667]: This node has been banned - forcing freeze and recovery
2015/03/11 13:54:41.139297 [16667]: Freeze priority 1
2015/03/11 13:54:41.139304 [16667]: Freeze priority 2
2015/03/11 13:54:41.139437 [16667]: Freeze priority 3
2015/03/11 13:54:41.139538 [16667]: server/ctdb_takeover.c:3291 Released 0 public IPs
2015/03/11 13:54:41.139569 [recoverd:16839]: This node was banned, restart main_loop
2015/03/11 13:54:41.493575 [16667]: This node (0) is no longer the recovery master
2015/03/11 13:54:44.493295 [recoverd:16839]: server/ctdb_recoverd.c:1139 Election timed out
2015/03/11 13:54:44.498007 [16667]: This node has been banned - forcing freeze and recovery
2015/03/11 13:54:44.498027 [16667]: Freeze priority 1
2015/03/11 13:54:44.498034 [16667]: server/ctdb_takeover.c:3291 Released 0 public IPs
2015/03/11 13:54:44.498122 [recoverd:16839]: Node 0 has changed flags - now 0x8 was 0x0
2015/03/11 13:54:44.723373 [recoverd:16839]: Disabling takeover runs for 60 seconds
2015/03/11 13:54:44.771501 [recoverd:16839]: Reenabling takeover runs