Bug 15937 - winbindd crashes with Bad talloc magic value - unknown value
Summary: winbindd crashes with Bad talloc magic value - unknown value
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: DCE-RPCs and pipes
Version: unspecified
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Samba release manager
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-10-24 02:11 UTC by Gary Lockyer
Modified: 2026-01-23 11:39 UTC
CC List: 4 users

See Also:


Attachments
Logs with a full stack trace. (60.31 KB, text/x-log)
2025-10-24 02:11 UTC, Gary Lockyer
no flags
Proposed fix (version 1) (1.60 KB, patch)
2026-01-06 21:47 UTC, Gary Lockyer
gary: ci-passed+
Proposed fix (version 2) (1.68 KB, patch)
2026-01-12 19:26 UTC, Gary Lockyer
no flags
patch for 4.23 (1.93 KB, patch)
2026-01-14 02:19 UTC, Douglas Bagnall
gary: review+

Description Gary Lockyer 2025-10-24 02:11:08 UTC
Created attachment 18763
Logs with a full stack trace.

2025-10-24T01:58:52.551756+00:00 addc.addom.samba.example.com winbindd[306061]:   wbd_ping_dc_done: dcerpc_wbint_PingDc_recv failed for domain: TORTURE305 - NT_STATUS_DOMAIN_CONTROLLER_NOT_FOUND
2025-10-24T01:58:52.551854+00:00 addc.addom.samba.example.com winbindd[306061]:   free_domain: Free updated domain[0x58ce4dc1a4d0] name[TORTURE305] S-1-5-21-97398-379795-305 replaced by domain[0x58ce4cdb7790] name[TORTURE305]
2025-10-24T01:58:52.558471+00:00 addc.addom.samba.example.com winbindd[306061]:   Bad talloc magic value - unknown value
2025-10-24T01:58:52.558544+00:00 addc.addom.samba.example.com winbindd[306061]:   ===============================================================
2025-10-24T01:58:52.558558+00:00 addc.addom.samba.example.com winbindd[306061]:   INTERNAL ERROR: Bad talloc magic value - unknown value in winbindd () () pid 306061 (4.24.0pre1-DEVELOPERBUILD)
2025-10-24T01:58:52.558573+00:00 addc.addom.samba.example.com winbindd[306061]:   If you are running a recent Samba version, and if you think this problem is not yet fixed in the latest versions, please consider reporting this bug, see https://wiki.samba.org/index.php/Bug_Reporting
2025-10-24T01:58:52.558588+00:00 addc.addom.samba.example.com winbindd[306061]:   ===============================================================
2025-10-24T01:58:52.558598+00:00 addc.addom.samba.example.com winbindd[306061]:   PANIC (pid 306061): Bad talloc magic value - unknown value in 4.24.0pre1-DEVELOPERBUILD
2025-10-24T01:58:52.558772+00:00 addc.addom.samba.example.com winbindd[306061]:   BACKTRACE: 16 stack frames:
   #0 bin/shared/private/libgenrand-private-samba.so(log_stack_trace+0x29) [0x7398e741ce59]
   #1 bin/shared/private/libgenrand-private-samba.so(smb_panic_log+0x256) [0x7398e741ce26]
   #2 bin/shared/private/libgenrand-private-samba.so(smb_panic+0x15) [0x7398e741cfe5]
   #3 bin/shared/private/libtalloc-private-samba.so(+0x9dca) [0x7398e7a60dca]
   #4 bin/shared/private/libtalloc-private-samba.so(+0x9d80) [0x7398e7a60d80]
   #5 bin/shared/private/libtalloc-private-samba.so(+0x497d) [0x7398e7a5b97d]
   #6 bin/shared/private/libtalloc-private-samba.so(+0x5ad5) [0x7398e7a5cad5]
   #7 bin/shared/private/libtalloc-private-samba.so(talloc_check_name+0x3c) [0x7398e7a5cb8c]
   #8 bin/shared/private/libtevent-private-samba.so(+0x1a7ac) [0x7398e83007ac]
   #9 bin/shared/private/libtevent-private-samba.so(+0x17e18) [0x7398e82fde18]
   #10 bin/shared/private/libtevent-private-samba.so(+0x16120) [0x7398e82fc120]
   #11 bin/shared/private/libtevent-private-samba.so(_tevent_loop_once+0x101) [0x7398e82f1861]
   #12 /data/samba/samba01/bin/winbindd(main+0x1b61) [0x58ce3a307ff1]
   #13 /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca) [0x7398e662a1ca]
   #14 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b) [0x7398e662a28b]
   #15 /data/samba/samba01/bin/winbindd(_start+0x25) [0x58ce3a27d945]
Comment 1 Gary Lockyer 2025-10-24 02:27:24 UTC
Running make TESTS="samba4.rpc.lsa" test in a loop will trigger the crash.

It appears to be a race condition between
source3/winbindd/winbindd_util.c terminate_child, which
   kills the child process and frees the child monitor_fde:

		kill(c->pid, SIGTERM);
		c->pid = 0;
		if (c->sock != -1) {
		close(c->sock);
		// }
		// c->sock = -1;
		// DBG_ERR("Freed c->monitor_fde (%p), pid (%d)\n",
		// 	c->monitor_fde, c->pid);
		// TALLOC_FREE(c->monitor_fde);



and

lib/tevent/tevent_epoll.c epoll_event_loop line 632
               struct tevent_fd *fde = talloc_get_type(events[i].data.ptr,
						       struct tevent_fd);

The kill makes the child socket readable, as the child process has gone away.

The TALLOC_FREE(c->monitor_fde);
Comment 2 Gary Lockyer 2025-10-24 02:34:55 UTC
Sigh, let's try that again :-)

Running make TESTS="samba4.rpc.lsa" test in a loop will trigger the crash.

It appears to be a race condition between
source3/winbindd/winbindd_util.c terminate_child, which
   kills the child process
   and frees the child monitor_fde:

		kill(c->pid, SIGTERM);
		c->pid = 0;
		if (c->sock != -1) {
		        close(c->sock);
		}
		c->sock = -1;
		TALLOC_FREE(c->monitor_fde);



and

lib/tevent/tevent_epoll.c epoll_event_loop line 632
               struct tevent_fd *fde = talloc_get_type(events[i].data.ptr,
						       struct tevent_fd);

The kill makes the child socket readable, as the child process has gone away. That socket has

source3/winbindd/winbindd_dual.c child_socket_readable registered as its handler, and

events[i].data.ptr points to c->monitor_fde
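
As a minimal sketch of why the event loop ends up handling c->monitor_fde at all: epoll stores the data.ptr supplied at registration and hands it back unchanged from epoll_wait(), so the pointer is only as valid as the object it was taken from. The code below is plain Linux C, not Samba or tevent code. struct handler and pointer_reported_after_peer_close() are hypothetical names, and closing the peer end of a socketpair stands in for the child process going away.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for a per-fd handler object such as a tevent_fd. */
struct handler {
	int fd;
};

/*
 * Register one end of a socketpair with epoll, close the peer end and
 * return the pointer that epoll_wait() reports. Returns NULL on failure.
 */
static struct handler *pointer_reported_after_peer_close(struct handler *h)
{
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event out = { 0 };
	struct handler *reported = NULL;
	int sv[2];
	int epfd;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
		return NULL;
	}
	epfd = epoll_create1(0);
	if (epfd == -1) {
		close(sv[0]);
		close(sv[1]);
		return NULL;
	}

	h->fd = sv[0];
	ev.data.ptr = h; /* epoll keeps this pointer verbatim */

	if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
		close(sv[1]); /* peer goes away: sv[0] becomes readable (EOF) */
		sv[1] = -1;
		if (epoll_wait(epfd, &out, 1, 1000) == 1) {
			reported = out.data.ptr;
		}
	}

	/* Deregister first, then release the descriptors. */
	epoll_ctl(epfd, EPOLL_CTL_DEL, sv[0], NULL);
	close(sv[0]);
	if (sv[1] != -1) {
		close(sv[1]);
	}
	close(epfd);
	return reported;
}
```

Called from main() with the address of a struct handler, the function returns that same address. If the object had been freed while the registration was still live, the loop would be handed a dangling pointer, which is consistent with talloc_check_name() panicking in the backtrace.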
Comment 3 Gary Lockyer 2025-11-10 19:21:38 UTC
Except this code is all synchronous, and the talloc destructor removes the FD from the epoll list.
Comment 4 Gary Lockyer 2025-11-10 19:23:57 UTC
But epoll_wait is returning an event that points to freed memory.
Need to find out where that's coming from.
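
One documented way for epoll_wait() to keep reporting a descriptor that the process has already closed is described in epoll(7): a registration is only dropped once every descriptor referring to the same open file description has been closed. Whether that is what happens in winbindd is an assumption; this thread does not establish it. The sketch below is plain Linux C with hypothetical names (struct stale_result, close_before_deregister()), and dup() stands in for a copy of the descriptor held elsewhere, for example one inherited across fork().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

struct stale_result {
	int del_errno; /* errno from EPOLL_CTL_DEL issued after close() */
	int nevents;   /* return value of the later epoll_wait() */
	void *ptr;     /* data.ptr carried by the reported event */
};

/* Close the registered fd first, then try to deregister it. */
static struct stale_result close_before_deregister(void *cookie)
{
	struct stale_result r = { 0, -1, NULL };
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event out = { 0 };
	int sv[2];
	int epfd;
	int dupfd;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
		return r;
	}
	epfd = epoll_create1(0);
	dupfd = dup(sv[0]); /* keeps the open file description alive */
	ev.data.ptr = cookie;

	if (epfd != -1 && dupfd != -1 &&
	    epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
		close(sv[0]);
		/* The fd number is no longer valid, so this cannot succeed. */
		if (epoll_ctl(epfd, EPOLL_CTL_DEL, sv[0], NULL) == -1) {
			r.del_errno = errno;
		}
		sv[0] = -1;

		close(sv[1]); /* peer goes away */
		sv[1] = -1;
		r.nevents = epoll_wait(epfd, &out, 1, 1000);
		if (r.nevents == 1) {
			r.ptr = out.data.ptr; /* still the original cookie */
		}
	}

	if (sv[0] != -1) {
		close(sv[0]);
	}
	if (sv[1] != -1) {
		close(sv[1]);
	}
	if (dupfd != -1) {
		close(dupfd);
	}
	if (epfd != -1) {
		close(epfd);
	}
	return r;
}
```

In this sketch the late EPOLL_CTL_DEL fails with EBADF and the event is still delivered with the original pointer. A destructor that deregisters after the close would therefore leave the registration in place while the memory behind data.ptr is released.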
Comment 5 Stefan Metzmacher 2025-11-10 19:40:04 UTC
https://gitlab.com/samba-team/samba/-/merge_requests/4283 also has some details...
Comment 6 Stefan Metzmacher 2025-11-10 19:45:51 UTC
Gary, what OS and kernel is this on?
Comment 7 Gary Lockyer 2025-11-10 20:19:55 UTC
6.14.0-35-generic #35~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Oct 14 13:55:17 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
Comment 8 Gary Lockyer 2025-11-10 22:37:54 UTC
But I did see a failure in CI, which is what started me down the rabbit hole.
Comment 9 Andreas Schneider 2025-12-09 10:49:42 UTC
I quite often see samba-ad-dc-4b failing with WBC_ERR_WINBIND_NOT_AVAILABLE:

https://gitlab.com/samba-team/devel/samba/-/jobs/12368740677
Comment 10 Douglas Bagnall 2025-12-09 21:31:08 UTC
(In reply to Andreas Schneider from comment #9)
In case this bug lasts longer than the CI log, here are some relevant lines

> Pulling docker image registry.gitlab.com/samba-team/devel/samba/samba-ci-ubuntu2204:336927a79f09b3eb729c64872bf4eca3e2f6761f
> Linux runner-xs6vzpvoq-project-6378020-concurrent-0 5.15.154+ #1 SMP Sat May 4 12:14:42 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

> ==> /builds/samba-team/devel/samba/samba-ad-dc-4b.stdout <==
> [149(913)/192 at 3m41s] samba4.blackbox.kinit_trust(fl2008r2dc:local)
> 
> ==> /builds/samba-team/devel/samba/samba-ad-dc-4b.stderr <==
> 2025-12-09T09:53:01.701228+00:00 dc7.samba2008r2.example.com samba[574]: winbindd daemon died with exit status 6
> 2025-12-09T09:53:01.701276+00:00 dc7.samba2008r2.example.com samba[574]: task_server_terminate: task_server_terminate: [winbindd child process exited]
> 2025-12-09T09:53:01.702712+00:00 dc7.samba2008r2.example.com samba[559]: samba_terminate: samba_terminate of samba 559: winbindd child process exited
> 
> ==> /builds/samba-team/devel/samba/samba-ad-dc-4b.stdout <==
> UNEXPECTED(failure): samba4.blackbox.kinit_trust.wbinfo check outgoing trust pw(fl2008r2dc:local)
> REASON: Exception: Exception: failed to call wbcCheckTrustCredentials: WBC_ERR_WINBIND_NOT_AVAILABLE
Comment 11 Gary Lockyer 2026-01-06 21:47:16 UTC
Created attachment 18792
Proposed fix (version 1)
Comment 12 Gary Lockyer 2026-01-06 21:48:35 UTC
It looks to be a race condition between the child socket being closed and it being de-registered from epoll.
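
The attached patches are not reproduced in this thread, so the following is not the actual fix. It is only a sketch, in plain Linux C with the hypothetical name events_after_ordered_teardown(), of the general ordering that avoids a stale registration: de-register the descriptor from epoll while it is still open, and close it afterwards. dup() again stands in for a second reference to the same socket held elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * De-register while the fd is still valid, then close it. Returns the
 * number of events epoll_wait() reports afterwards, or -1 on failure.
 */
static int events_after_ordered_teardown(void)
{
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event out = { 0 };
	int sv[2];
	int epfd;
	int dupfd;
	int n = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
		return -1;
	}
	epfd = epoll_create1(0);
	dupfd = dup(sv[0]); /* a second reference to the same socket */
	ev.data.ptr = &n;   /* any per-fd object would go here */

	if (epfd != -1 && dupfd != -1 &&
	    epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0 &&
	    epoll_ctl(epfd, EPOLL_CTL_DEL, sv[0], NULL) == 0) {
		/* The registration is gone; closing is now safe. */
		close(sv[0]);
		sv[0] = -1;
		close(sv[1]); /* peer goes away */
		sv[1] = -1;
		n = epoll_wait(epfd, &out, 1, 100); /* expect a timeout */
	}

	if (sv[0] != -1) {
		close(sv[0]);
	}
	if (sv[1] != -1) {
		close(sv[1]);
	}
	if (dupfd != -1) {
		close(dupfd);
	}
	if (epfd != -1) {
		close(epfd);
	}
	return n;
}
```

With this ordering epoll_wait() times out and reports no events, even though the duplicate descriptor is still open when the peer closes.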
Comment 13 Gary Lockyer 2026-01-12 19:26:36 UTC
Created attachment 18796
Proposed fix (version 2)

Updated the commit title
Comment 14 Stefan Metzmacher 2026-01-13 10:47:39 UTC
(In reply to Gary Lockyer from comment #13)

Looks good, but why did you close the merge request instead of
pushing the updated patch there?
Comment 15 Stefan Metzmacher 2026-01-13 10:49:32 UTC
(In reply to Stefan Metzmacher from comment #14)

OK, I looked at the wrong MR.
Comment 16 Samba QA Contact 2026-01-13 14:51:04 UTC
This bug was referenced in samba master:

a3684a2284cdf421090d6064b720b81b05b6eae6
Comment 17 Douglas Bagnall 2026-01-14 02:19:44 UTC
Created attachment 18798
patch for 4.23

Backport to 4.23 is trivial; beyond that looks tricky.
Comment 18 Douglas Bagnall 2026-01-14 20:34:26 UTC
For 4.23.
Comment 19 Samba QA Contact 2026-01-15 13:39:11 UTC
This bug was referenced in samba v4-23-test:

36f0300cda5989c948801fba0f8b0b64066f54a9
Comment 20 Samba QA Contact 2026-01-23 11:39:36 UTC
This bug was referenced in samba v4-23-stable (Release samba-4.23.5):

36f0300cda5989c948801fba0f8b0b64066f54a9