Bug 14324 - CTDB recovery daemon can crash due to dereference of NULL pointer
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: CTDB
Version: 4.12.0
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
Reported: 2020-03-23 08:51 UTC by Martin Schwenke
Modified: 2020-03-31 13:57 UTC

Attachments
Patch for 4.12, 4.11 (2.14 KB, application/mbox)
2020-03-30 01:21 UTC, Martin Schwenke
amitay: review+
Description Martin Schwenke 2020-03-23 08:51:30 UTC
main_loop() contains this code:

	TALLOC_FREE(rec->nodemap);
	ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), pnn, rec, &rec->nodemap);

The second line runs a nested event loop while it waits for the
reply to the control.  That event loop can invoke message
handlers that do not expect rec->nodemap to be NULL.

One example is lost_reclock_handler(), which causes rec->nodemap to be
unconditionally dereferenced in list_of_nodes() via this call chain
(innermost call first):
    
      list_of_nodes()
      list_of_active_nodes()
      set_recovery_mode()
      force_election()
      lost_reclock_handler()

This sometimes causes the CTDB recovery daemon to crash when the
recovery lock is lost.  Other handlers also reference rec->nodemap
unconditionally.
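
The hazard can be reproduced in miniature. The sketch below is not CTDB code and is not taken from the attached patch; it only illustrates one possible way to avoid the window in which rec->nodemap is NULL, by fetching the new map into a local pointer and replacing the old one only after the fetch has succeeded. All names in it (struct node_map, struct recoverd, handler(), fetch_nodemap(), refresh_nodemap()) are hypothetical stand-ins, and the nested event loop is modelled simply by calling the handler from inside the fetch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the CTDB structures; not the real definitions. */
struct node_map {
	unsigned int num;
};

struct recoverd {
	struct node_map *nodemap;
	unsigned int handler_saw_num;	/* value the handler read from nodemap */
	int handler_ran;
};

/*
 * Stand-in for a message handler such as lost_reclock_handler():
 * it uses rec->nodemap and assumes it is never NULL.
 */
static void handler(struct recoverd *rec)
{
	assert(rec->nodemap != NULL);	/* the real daemon would crash here */
	rec->handler_saw_num = rec->nodemap->num;
	rec->handler_ran = 1;
}

/*
 * Stand-in for ctdb_ctrl_getnodemap().  The nested event loop is
 * modelled by running the handler before the "reply" is produced.
 */
static int fetch_nodemap(struct recoverd *rec, unsigned int num,
			 struct node_map **out)
{
	struct node_map *nm;

	handler(rec);	/* a message arrives while waiting for the reply */

	nm = malloc(sizeof(*nm));
	if (nm == NULL) {
		return -1;
	}
	nm->num = num;
	*out = nm;
	return 0;
}

/*
 * Refresh without a NULL window: the old map stays valid for the
 * whole duration of the fetch and is replaced only on success.
 */
static int refresh_nodemap(struct recoverd *rec, unsigned int num)
{
	struct node_map *fresh = NULL;
	int ret;

	ret = fetch_nodemap(rec, num, &fresh);
	if (ret != 0) {
		return ret;	/* keep the old map on failure */
	}

	free(rec->nodemap);
	rec->nodemap = fresh;
	return 0;
}
```

With this ordering a handler that fires during the fetch sees the previous, possibly stale, node map rather than NULL. The alternative of adding NULL checks to every handler that touches rec->nodemap would be more invasive.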
Comment 1 Martin Schwenke 2020-03-30 01:21:19 UTC
Created attachment 15872
Patch for 4.12, 4.11

Cherry picks cleanly from master into both branches.
Comment 2 Amitay Isaacs 2020-03-30 01:51:25 UTC
Hi Karolin,

This is ready for v4-11 and v4-12.

Thanks.
Comment 3 Karolin Seeger 2020-03-30 08:06:13 UTC
(In reply to Amitay Isaacs from comment #2)
Pushed to autobuild-v4-{12,11}-test.
Comment 4 Karolin Seeger 2020-03-31 13:57:58 UTC
Pushed to both branches.
Closing out bug report.

Thanks!