10903 – when delete a node in the cluster,other node may down

Status:	RESOLVED INVALID

Alias:	None

Product:	CTDB 2.5.x or older
Classification:	Unclassified
Component:	ctdb
Version:	2.5.3
Hardware:	x64 Linux

Importance:	P5 normal
Target Milestone:	---
Assignee:	Amitay Isaacs
QA Contact:	Samba QA Contact

Reported:	2014-10-28 11:11 UTC by fugx
Modified:	2016-09-12 09:11 UTC
CC List:	2 users

Description fugx 2014-10-28 11:11:54 UTC

When a node is deleted while it is online (change the nodes config file, #ip of this node,
and run ctdb reloadnodes on the other nodes),
the other nodes may go down, because a timed-out control on node->pending_controls can free an invalid request. The backtrace:


  
daemon_control_destructor
talloc_free
daemon_control_callback
ctdb_control_timeout

in daemon_control_destructor:

if (state->node) {
   DLIST_REMOVE(state->node->pending_controls, state);
}

But the node has already been freed in ctdb_reload_nodes_event, so the DLIST_REMOVE here touches freed memory.

I think ctdb_reload_nodes_event should call ctdb_daemon_cancel_controls,

like this:

if (ctdb->nodes[i]->flags & NODE_FLAGS_DELETED) {
   ctdb_daemon_cancel_controls(ctdb, nodes[i]); // add this line
   continue;
}
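
To make the ordering problem concrete, here is a minimal, self-contained sketch of the pattern described above. It is not CTDB code: struct node, struct control_state, and every function in it are hypothetical stand-ins, plain malloc/free replaces talloc, and the cancel step only detaches controls, which may differ from what the real ctdb_daemon_cancel_controls does. It only shows why pending controls must be detached before the node they point to is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the real CTDB definitions. */
struct node;

struct control_state {
    struct control_state *prev, *next;
    struct node *node;              /* back-pointer, like state->node */
};

struct node {
    struct control_state *pending_controls;  /* head of a doubly linked list */
};

/* Link a control onto the front of the node's pending list. */
static void control_attach(struct node *n, struct control_state *s)
{
    s->node = n;
    s->prev = NULL;
    s->next = n->pending_controls;
    if (s->next != NULL) {
        s->next->prev = s;
    }
    n->pending_controls = s;
}

/* Models the destructor: unlink only if the control still has a node. */
static void control_destroy(struct control_state *s)
{
    if (s->node != NULL) {
        if (s->prev != NULL) {
            s->prev->next = s->next;
        } else {
            s->node->pending_controls = s->next;
        }
        if (s->next != NULL) {
            s->next->prev = s->prev;
        }
    }
    free(s);
}

/* Models a cancel step (assumption: detach only, no callbacks):
 * after this, no pending control keeps a pointer to the node. */
static void node_cancel_controls(struct node *n)
{
    struct control_state *s = n->pending_controls;
    while (s != NULL) {
        struct control_state *next = s->next;
        s->node = NULL;
        s->prev = NULL;
        s->next = NULL;
        s = next;
    }
    n->pending_controls = NULL;
}

/* Delete a node safely: cancel its controls first, then free it. */
static void node_delete(struct node *n)
{
    node_cancel_controls(n);
    free(n);
}

/* Returns 1 if controls that outlive their node end up detached. */
static int demo_delete_then_timeout(void)
{
    struct node *n = calloc(1, sizeof(*n));
    struct control_state *a = calloc(1, sizeof(*a));
    struct control_state *b = calloc(1, sizeof(*b));
    if (n == NULL || a == NULL || b == NULL) {
        free(n);
        free(a);
        free(b);
        return 0;
    }
    control_attach(n, a);
    control_attach(n, b);
    node_delete(n);                  /* node goes away first */
    int detached = (a->node == NULL && b->node == NULL);
    control_destroy(a);              /* late "timeout": freed node is never touched */
    control_destroy(b);
    return detached;
}
```

Without the node_cancel_controls call in node_delete, control_destroy would follow s->node into freed memory, which is the failure the backtrace points at.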

Comment 1 Martin Schwenke 2016-09-02 01:47:37 UTC

I think the key here is "delete a node on line".  I take that to mean that the node is being deleted when it is online.  The documentation has always said that CTDB should be shut down on a node that is about to be deleted.  This has been sanity checked by the ctdb tool since Samba 4.3.

Unless I'm misunderstanding this, should this be closed as "invalid"?

Comment 2 Martin Schwenke 2016-09-12 09:11:05 UTC

Invalid.  Can't delete a node that is online/up.  CTDB needs to be shut down first.