When a node is deleted while it is online (its IP is commented out with "#" in the nodes file, and "ctdb reloadnodes" is run on the other nodes), the other nodes may crash. A timeout on a control in node->pending_controls can free a request that is no longer valid. The backtrace:

    daemon_control_destructor
    talloc_free
    daemon_control_callback
    ctdb_control_timeout

In daemon_control_destructor:

    if (state->node) {
        DLIST_REMOVE(state->node->pending_controls, state);
    }

But the node has already been freed in ctdb_reload_nodes_event, so this accesses freed memory. I think ctdb_reload_nodes_event should call ctdb_daemon_cancel_controls, like this:

    if (ctdb->nodes[i]->flags & NODE_FLAGS_DELETED) {
        ctdb_daemon_cancel_controls(ctdb, nodes[i]); //add this line
        continue;
    }
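To make the ordering problem easier to see, here is a minimal toy model in C. It is only a sketch: every type and function name in it (toy_node, toy_control, toy_cancel_controls and so on) is hypothetical and stands in for the CTDB structures, and plain calloc/free is used instead of talloc. It assumes that detaching every pending control before the node is freed is enough to keep the later timeout path from touching the freed node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: hypothetical stand-ins, not the real CTDB
 * structures and not the talloc API. */
struct toy_node;

struct toy_control {
    struct toy_control *prev, *next;
    struct toy_node *node;   /* owning node, or NULL once detached */
};

struct toy_node {
    struct toy_control *pending_controls;   /* head of the pending list */
};

/* Queue a new pending control at the head of the node's list. */
static struct toy_control *toy_control_add(struct toy_node *node)
{
    struct toy_control *c = calloc(1, sizeof(*c));
    if (c == NULL) {
        return NULL;
    }
    c->node = node;
    c->next = node->pending_controls;
    if (c->next != NULL) {
        c->next->prev = c;
    }
    node->pending_controls = c;
    return c;
}

/* Plays the role of the destructor: unlink from the node's list,
 * then free. Dereferencing c->node is only safe while the node is
 * still alive, which is the crux of the report. */
static void toy_control_free(struct toy_control *c)
{
    if (c->node != NULL) {
        if (c->prev != NULL) {
            c->prev->next = c->next;
        } else {
            c->node->pending_controls = c->next;
        }
        if (c->next != NULL) {
            c->next->prev = c->prev;
        }
    }
    free(c);
}

/* Detach every pending control so none keeps a pointer to the node.
 * Returns the number of controls detached. */
static int toy_cancel_controls(struct toy_node *node)
{
    int n = 0;
    struct toy_control *c;

    while ((c = node->pending_controls) != NULL) {
        node->pending_controls = c->next;
        c->prev = NULL;
        c->next = NULL;
        c->node = NULL;
        n++;
    }
    return n;
}

/* Ordering assumed safe here: cancel pending controls, then free. */
static void toy_node_delete(struct toy_node *node)
{
    toy_cancel_controls(node);
    assert(node->pending_controls == NULL);
    free(node);
}
```

In this model, calling toy_control_free on a control after toy_node_delete is harmless because the control no longer points at the node. Freeing the node without the cancel step would leave c->node dangling, which corresponds to the crash in the backtrace.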
I think the key here is "delete a node on line". I take that to mean that the node is being deleted while it is online. The documentation has always said that CTDB should be shut down on a node that is about to be deleted. This has been sanity checked by the ctdb tool since Samba 4.3. Unless I'm misunderstanding this, should this be closed as "invalid"?
Invalid. Can't delete a node that is online/up. CTDB needs to be shut down first.