Bug 14809 - Shares with variable substitutions cause core dump upon connection from MacOS Big Sur 11.5.2
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.14.6
Hardware: All Linux
Importance: P5 normal
Target Milestone: ---
Assignee: Jule Anger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-08-23 20:18 UTC by Stewart A.
Modified: 2021-09-07 11:52 UTC
CC List: 1 user

See Also:


Attachments
ZSTD-compressed core dump from smbd 4.14.6 in Arch container (901.35 KB, application/zstd)
2021-08-23 20:18 UTC, Stewart A.
no flags
git-am fix for master. (1.90 KB, patch)
2021-08-24 00:47 UTC, Jeremy Allison
no flags
git-am fix for 4.15.rcnext, 4.14.next. (2.33 KB, patch)
2021-08-26 00:03 UTC, Jeremy Allison
slow: review+

Description Stewart A. 2021-08-23 20:18:19 UTC
Created attachment 16748
ZSTD-compressed core dump from smbd 4.14.6 in Arch container

I can reproduce a core dump on every new authenticated connection from a MacOS Big Sur 11.5.2 client to a Fedora 34 or Arch host running Samba 4.14.6. The problem also reproduces on F35 (rawhide) with 4.15.0rc2.

The issue only appears when variable substitutions are used in the share paths.

I was not able to repro on F33 or Ubuntu 20.04, both using older 4.13.x Samba versions.

In my case, the crash makes it impossible to configure a Time Machine destination or to back up to the Samba host. It also leaves dead share mounts on the MacOS client, because connections are reset when smbd restarts after the core dump.


Stack trace in server logs:

Aug 23 11:23:53 localhost smbd[1516824]: [2021/08/23 11:23:53.181474,  0] ../../source3/smbd/msdfs.c:360(create_conn_struct_as_root)
Aug 23 11:23:53 localhost smbd[1516824]:   create_conn_struct_as_root: Failed to canonicalize sharepath
Aug 23 11:23:53 localhost smbd[1516824]: [2021/08/23 11:23:53.181704,  0] ../../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
Aug 23 11:23:53 localhost smbd[1516824]:   Bad talloc magic value - unknown value
Aug 23 11:23:53 localhost smbd[1516824]: [2021/08/23 11:23:53.181745,  0] ../../lib/util/fault.c:172(smb_panic_log)
Aug 23 11:23:53 localhost smbd[1516824]:   ===============================================================
Aug 23 11:23:53 localhost smbd[1516824]: [2021/08/23 11:23:53.181779,  0] ../../lib/util/fault.c:173(smb_panic_log)
Aug 23 11:23:53 localhost smbd[1516824]:   INTERNAL ERROR: Bad talloc magic value - unknown value in pid 1516824 (4.14.6)
Aug 23 11:23:53 localhost smbd[1516824]: [2021/08/23 11:23:53.181813,  0] ../../lib/util/fault.c:177(smb_panic_log)
Aug 23 11:23:53 localhost smbd[1516824]:   If you are running a recent Samba version, and if you think this problem is not yet fixed in the latest versions, please consider reporting this bug, see https://wiki.samba.org/index.php/Bug_Reporting
Aug 23 11:23:53 localhost smbd[1516824]: [2021/08/23 11:23:53.181846,  0] ../../lib/util/fault.c:182(smb_panic_log)
Aug 23 11:23:53 localhost smbd[1516824]:   ===============================================================
Aug 23 11:23:53 localhost smbd[1516824]: [2021/08/23 11:23:53.181877,  0] ../../lib/util/fault.c:183(smb_panic_log)
Aug 23 11:23:53 localhost smbd[1516824]:   PANIC (pid 1516824): Bad talloc magic value - unknown value in 4.14.6
Aug 23 11:23:53 localhost smbd[1516824]: [2021/08/23 11:23:53.182621,  0] ../../lib/util/fault.c:287(log_stack_trace)
Aug 23 11:23:53 localhost smbd[1516824]:   BACKTRACE: 29 stack frames:
Aug 23 11:23:53 localhost smbd[1516824]:    #0 /lib64/libsamba-util.so.0(log_stack_trace+0x34) [0x7feac1123804]
Aug 23 11:23:53 localhost smbd[1516824]:    #1 /lib64/libsamba-util.so.0(smb_panic+0xd) [0x7feac1123a5d]
Aug 23 11:23:53 localhost smbd[1516824]:    #2 /lib64/libtalloc.so.2(+0x3712) [0x7feac0a37712]
Aug 23 11:23:53 localhost smbd[1516824]:    #3 /usr/lib64/samba/libsmbd-base-samba4.so(create_conn_struct_cwd+0x89) [0x7feac0eeb9f9]
Aug 23 11:23:53 localhost smbd[1516824]:    #4 /usr/lib64/samba/libsmbd-base-samba4.so(mds_init_ctx+0x1c8) [0x7feac0f6c0b8]
Aug 23 11:23:53 localhost smbd[1516824]:    #5 /usr/lib64/samba/libsmbd-base-samba4.so(_mdssvc_open+0x11d) [0x7feac0f6cebd]
Aug 23 11:23:53 localhost smbd[1516824]:    #6 /usr/lib64/samba/libsmbd-base-samba4.so(+0x1d8b3f) [0x7feac0f6db3f]
Aug 23 11:23:53 localhost smbd[1516824]:    #7 /lib64/libdcerpc-server-core.so.0(+0xafa8) [0x7feac0cd6fa8]
Aug 23 11:23:53 localhost smbd[1516824]:    #8 /lib64/libdcerpc-binding.so.0(+0x1167f) [0x7feac0a7067f]
Aug 23 11:23:53 localhost smbd[1516824]:    #9 /usr/lib64/samba/libsamba-sockets-samba4.so(+0xe04b) [0x7feabfeb004b]
Aug 23 11:23:53 localhost smbd[1516824]:    #10 /usr/lib64/samba/libsamba-sockets-samba4.so(+0x652e) [0x7feabfea852e]
Aug 23 11:23:53 localhost smbd[1516824]:    #11 /lib64/libtevent.so.0(tevent_common_invoke_immediate_handler+0x192) [0x7feac0a19742]
Aug 23 11:23:53 localhost smbd[1516824]:    #12 /lib64/libtevent.so.0(tevent_common_loop_immediate+0x1e) [0x7feac0a1976e]
Aug 23 11:23:53 localhost smbd[1516824]:    #13 /lib64/libtevent.so.0(+0xe000) [0x7feac0a1d000]
Aug 23 11:23:53 localhost smbd[1516824]:    #14 /lib64/libtevent.so.0(+0x669b) [0x7feac0a1569b]
Aug 23 11:23:53 localhost smbd[1516824]:    #15 /lib64/libtevent.so.0(_tevent_loop_once+0x98) [0x7feac0a17da8]
Aug 23 11:23:53 localhost smbd[1516824]:    #16 /lib64/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7feac0a17e9b]
Aug 23 11:23:53 localhost smbd[1516824]:    #17 /lib64/libtevent.so.0(+0x670b) [0x7feac0a1570b]
Aug 23 11:23:53 localhost smbd[1516824]:    #18 /usr/lib64/samba/libsmbd-base-samba4.so(smbd_process+0x840) [0x7feac0ee6920]
Aug 23 11:23:53 localhost smbd[1516824]:    #19 /usr/sbin/smbd(+0xcbcd) [0x55796c65dbcd]
Aug 23 11:23:53 localhost smbd[1516824]:    #20 /lib64/libtevent.so.0(tevent_common_invoke_fd_handler+0x95) [0x7feac0a194f5]
Aug 23 11:23:53 localhost smbd[1516824]:    #21 /lib64/libtevent.so.0(+0xe21f) [0x7feac0a1d21f]
Aug 23 11:23:53 localhost smbd[1516824]:    #22 /lib64/libtevent.so.0(+0x669b) [0x7feac0a1569b]
Aug 23 11:23:53 localhost smbd[1516824]:    #23 /lib64/libtevent.so.0(_tevent_loop_once+0x98) [0x7feac0a17da8]
Aug 23 11:23:53 localhost smbd[1516824]:    #24 /lib64/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7feac0a17e9b]
Aug 23 11:23:53 localhost smbd[1516824]:    #25 /lib64/libtevent.so.0(+0x670b) [0x7feac0a1570b]
Aug 23 11:23:53 localhost smbd[1516824]:    #26 /usr/sbin/smbd(main+0x1e1d) [0x55796c65a79d]
Aug 23 11:23:53 localhost smbd[1516824]:    #27 /lib64/libc.so.6(__libc_start_main+0xd5) [0x7feac070ab75]
Aug 23 11:23:53 localhost smbd[1516824]:    #28 /usr/sbin/smbd(_start+0x2e) [0x55796c65a8fe]
Aug 23 11:23:53 localhost smbd[1516824]: [2021/08/23 11:23:53.183093,  0] ../../source3/lib/dumpcore.c:317(dump_core)
Aug 23 11:23:53 localhost smbd[1516824]:   coredump is handled by helper binary specified at /proc/sys/kernel/core_pattern




Example share configurations that trigger the issue:

# core dump upon connection from MacOS Big Sur
[homes]
  comment = Networked home for %u
  path = %H
  writable = yes
  browsable = no
  read only = no
  map archive = yes

# core dump upon connection from MacOS Big Sur
# works fine if I replace '%u' with a literal username
[backups-timemachine]
  comment = Time Machine Backups
  path = /data/backups/%u/timemachine
  browsable = yes
  writable = yes
  valid users = @shared
  create mask = 0600
  directory mask = 0700
  spotlight = no

  vfs objects = acl_xattr catia fruit streams_xattr
  fruit:time machine = yes
  fruit:time machine max size = 1T
  fruit:nfs_aces = no


Example share configuration without substitutions that works as expected:

[shared]
  comment = Shared files
  path = /data/shared
  browsable = yes
  writable = yes
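
To make the difference between the two kinds of share path concrete: `%u` is a placeholder that has to be replaced with the session's username before the path names a real directory. The following is a minimal C sketch of that idea only. `expand_user` is a hypothetical helper, not Samba's substitution code, and it handles nothing but `%u`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a Samba function): return a newly allocated
 * copy of 'path' with every "%u" replaced by 'user'. The caller frees
 * the result. Returns NULL if memory cannot be allocated.
 */
static char *expand_user(const char *path, const char *user)
{
	size_t ulen;
	size_t count = 0;
	const char *p;
	char *out;
	char *w;

	assert(path != NULL && user != NULL);
	ulen = strlen(user);

	/* Count the placeholders so the output buffer can be sized. */
	for (p = path; (p = strstr(p, "%u")) != NULL; p += 2) {
		count++;
	}

	/* Upper bound: the original length plus one username per placeholder. */
	out = malloc(strlen(path) + count * ulen + 1);
	if (out == NULL) {
		return NULL;
	}

	w = out;
	while (*path != '\0') {
		if (path[0] == '%' && path[1] == 'u') {
			memcpy(w, user, ulen);
			w += ulen;
			path += 2;
		} else {
			*w++ = *path++;
		}
	}
	*w = '\0';
	return out;
}
```

With user `foo`, the Time Machine path above becomes `/data/backups/foo/timemachine`, while `/data/shared` comes back unchanged. A path that still contains the literal `%u` does not name an existing directory in the setups described here.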


Repro steps:

# On Linux host
docker run --name f34 -it -p 4450:445 --entrypoint=/bin/bash fedora:34
dnf install -y samba
useradd foo
smbpasswd -a foo
cat << EOF >> /etc/samba/smb.conf
[backups]
	comment = User Data Directories
	path = /data/backups/%u
	browseable = Yes
	read only = No
	inherit acls = Yes
EOF
mkdir -p /data/backups/foo
/usr/sbin/smbd --foreground --no-process-group

# On MacOS
ssh linux-host -L 4450:localhost:4450
<connect to localhost:4450 as foo and mount the 'backups' share>


I've attached a core dump from a test similar to the above, run in an Arch Docker container.
Comment 1 Jeremy Allison 2021-08-23 21:18:36 UTC
Can you add a line:

panic action = /bin/sleep 999999

to the [global] section of your smb.conf, then reproduce the problem, attach gdb to the parent of the sleep process and run "bt" to get a proper stack backtrace, please? That should help us track this down.
Comment 2 Stewart A. 2021-08-23 22:58:15 UTC
Done! Here's the backtrace after installing the relevant debuginfo packages on f34:

#0  0x00007ff16d22aaca in wait4 () from /lib64/libc.so.6
#1  0x00007ff16d1a809b in do_system () from /lib64/libc.so.6
#2  0x00007ff16d7b7faf in smb_panic_s3 (why=<optimized out>) at ../../source3/lib/util.c:840
#3  0x00007ff16db9ea6e in smb_panic (why=0x7ff16d4b9070 "Bad talloc magic value - unknown value") at ../../lib/util/fault.c:197
#4  0x00007ff16d4b2712 in _talloc_free.cold () from /lib64/libtalloc.so.2
#5  0x00007ff16d9669f9 in create_conn_struct_cwd (mem_ctx=0x55d9be556d00, ev=0x55d9be50bc60, msg=0x55d9be5103d0, session_info=0x55d9be54b860, snum=<optimized out>, path=0x55d9be560f80 "/data/backups/%u", 
    c=0x55d9be556d90) at ../../source3/smbd/msdfs.c:529
#6  0x00007ff16d9e70b8 in mds_init_ctx (mem_ctx=mem_ctx@entry=0x55d9be547f90, ev=0x55d9be50bc60, msg_ctx=msg_ctx@entry=0x55d9be5103d0, session_info=session_info@entry=0x55d9be54b860, snum=snum@entry=1, 
    sharename=sharename@entry=0x55d9be546350 "data", path=0x55d9be560f00 "/data/backups/%u") at ../../source3/rpc_server/mdssvc/mdssvc.c:1680
#7  0x00007ff16d9e7ebd in create_mdssvc_policy_handle (handle=0x55d9be509180, path=0x55d9be560f00 "/data/backups/%u", sharename=<optimized out>, snum=1, p=0x55d9be556300, mem_ctx=0x55d9be547f90)
    at ../../source3/rpc_server/mdssvc/srv_mdssvc_nt.c:97
#8  _mdssvc_open (p=0x55d9be556300, r=0x55d9be546220) at ../../source3/rpc_server/mdssvc/srv_mdssvc_nt.c:147
#9  0x00007ff16d9e8b3f in mdssvc__op_dispatch_internal (dce_call=0x55d9be547f90, mem_ctx=<optimized out>, r=0x55d9be546220, dispatch=S3COMPAT_RPC_DISPATCH_EXTERNAL) at ./librpc/gen_ndr/ndr_mdssvc_scompat.c:120
#10 0x00007ff16d751fa8 in dcesrv_request (call=0x55d9be547f90) at ../../librpc/rpc/dcesrv_core.c:1895
#11 dcesrv_process_ncacn_packet (blob=..., pkt=<optimized out>, dce_conn=0x55d9be53d6a0) at ../../librpc/rpc/dcesrv_core.c:2291
#12 dcesrv_read_fragment_done (subreq=<optimized out>) at ../../librpc/rpc/dcesrv_core.c:2832
#13 0x00007ff16d4eb67f in dcerpc_read_ncacn_packet_done (subreq=<optimized out>) at ../../librpc/rpc/dcerpc_util.c:967
#14 0x00007ff16c92b04b in tstream_readv_pdu_readv_done (subreq=0x55d9be55c9a0) at ../../lib/tsocket/tsocket_helpers.c:319
#15 0x00007ff16c92352e in tstream_readv_done (subreq=<optimized out>) at ../../lib/tsocket/tsocket.c:604
#16 0x00007ff16d494742 in tevent_common_invoke_immediate_handler () from /lib64/libtevent.so.0
#17 0x00007ff16d49476e in tevent_common_loop_immediate () from /lib64/libtevent.so.0
#18 0x00007ff16d498000 in epoll_event_loop_once () from /lib64/libtevent.so.0
#19 0x00007ff16d49069b in std_event_loop_once () from /lib64/libtevent.so.0
#20 0x00007ff16d492da8 in _tevent_loop_once () from /lib64/libtevent.so.0
#21 0x00007ff16d492e9b in tevent_common_loop_wait () from /lib64/libtevent.so.0
#22 0x00007ff16d49070b in std_event_loop_wait () from /lib64/libtevent.so.0
#23 0x00007ff16d961920 in smbd_process (ev_ctx=0x55d9be50bc60, msg_ctx=<optimized out>, dce_ctx=<optimized out>, sock_fd=52, interactive=<optimized out>) at ../../source3/smbd/process.c:4232
#24 0x000055d9bd523bcd in smbd_accept_connection (ev=0x55d9be50bc60, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../../source3/smbd/server.c:1020
#25 0x00007ff16d4944f5 in tevent_common_invoke_fd_handler () from /lib64/libtevent.so.0
#26 0x00007ff16d49821f in epoll_event_loop_once () from /lib64/libtevent.so.0
#27 0x00007ff16d49069b in std_event_loop_once () from /lib64/libtevent.so.0
#28 0x00007ff16d492da8 in _tevent_loop_once () from /lib64/libtevent.so.0
#29 0x00007ff16d492e9b in tevent_common_loop_wait () from /lib64/libtevent.so.0
#30 0x00007ff16d49070b in std_event_loop_wait () from /lib64/libtevent.so.0
#31 0x000055d9bd52079d in smbd_parent_loop (parent=0x55d9be523500, ev_ctx=0x55d9be50bc60) at ../../source3/smbd/server.c:1367
#32 main (argc=<optimized out>, argv=<optimized out>) at ../../source3/smbd/server.c:2220
Comment 3 Jeremy Allison 2021-08-24 00:30:26 UTC
Oh, I see the problem.

 509 NTSTATUS create_conn_struct_cwd(TALLOC_CTX *mem_ctx,
 510                                 struct tevent_context *ev,
 511                                 struct messaging_context *msg,
 512                                 const struct auth_session_info *session_info,
 513                                 int snum,
 514                                 const char *path,
 515                                 struct connection_struct **c)
 516 {
 517         NTSTATUS status;
 518 
 519         become_root();
 520         status = create_conn_struct_as_root(mem_ctx,
 521                                             ev,
 522                                             msg,
 523                                             c,
 524                                             snum,
 525                                             path,
 526                                             session_info);
 527         unbecome_root();
 528         if (!NT_STATUS_IS_OK(status)) {
 529                 TALLOC_FREE(c);
 530                 return status;
 531         }
 532 
 533         return NT_STATUS_OK;
 534 }

It's the TALLOC_FREE(c) on line 529 in the error path that is failing. Just remove that line.

In the error case, '*c' has not been assigned to, and 'c' itself is the address of a pointer *within* a TALLOC'ed struct, not the start of a talloc allocation. That TALLOC_FREE(c) just shouldn't be there; it's a bug in the error path.
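
A minimal, self-contained sketch of why that free goes wrong, assuming a toy allocator that, like talloc, keeps a magic value in a header just in front of each allocation. All `toy_*` names are hypothetical stand-ins and not Samba or talloc APIs. The actual fix is simply to remove the TALLOC_FREE(c) line, as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAGIC 0x7a11c0deu	/* arbitrary value, for illustration only */

/* Header placed in front of every toy allocation. */
union toy_hdr {
	unsigned magic;
	max_align_t align;	/* keeps the payload suitably aligned */
};

static void *toy_alloc(size_t size)
{
	union toy_hdr *hdr = calloc(1, sizeof(*hdr) + size);

	if (hdr == NULL) {
		return NULL;
	}
	hdr->magic = TOY_MAGIC;
	return hdr + 1;
}

/* Returns 0 on success, -1 where talloc would panic on a bad magic value. */
static int toy_free(void *ptr)
{
	unsigned magic;
	char *base;

	if (ptr == NULL) {
		return 0;
	}
	base = (char *)ptr - sizeof(union toy_hdr);
	memcpy(&magic, base, sizeof(magic));
	if (magic != TOY_MAGIC) {
		return -1;	/* ptr is not the start of a toy allocation */
	}
	free(base);
	return 0;
}

struct toy_conn {
	int snum;
};

/* Stand-in for the caller's context: 'conn' is a field inside a struct. */
struct toy_ctx {
	long id;
	struct toy_conn *conn;
};

/* Stand-in for create_conn_struct_as_root(): leaves *c untouched on failure. */
static int toy_create_conn(int fail, struct toy_conn **c)
{
	struct toy_conn *conn;

	if (fail) {
		return -1;
	}
	conn = toy_alloc(sizeof(*conn));
	if (conn == NULL) {
		return -1;
	}
	*c = conn;
	return 0;
}

/* Fixed wrapper: on error, return the status and do not free 'c'. */
static int toy_create_conn_cwd(int fail, struct toy_conn **c)
{
	int rc = toy_create_conn(fail, c);

	if (rc != 0) {
		return rc;
	}
	return 0;
}
```

With a `struct toy_ctx` obtained from `toy_alloc`, a failed `toy_create_conn_cwd(1, &ctx->conn)` leaves `ctx->conn` as NULL. Passing `&ctx->conn` to `toy_free` afterwards, which is what the removed line amounted to, is rejected: the bytes in front of that field are not an allocation header.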
Comment 4 Jeremy Allison 2021-08-24 00:44:46 UTC
Ralph, this one is in your code I think. I have a fix, but I'd like you to look it over :-).
Comment 5 Jeremy Allison 2021-08-24 00:47:30 UTC
Created attachment 16749
git-am fix for master.

Passes "make test TESTS=samba.tests.blackbox.mdsearch".

I'm going to put it into CI now.
Comment 6 Jeremy Allison 2021-08-24 03:42:08 UTC
CI passes. MR is:

https://gitlab.com/samba-team/samba/-/merge_requests/2125
Comment 7 Jeremy Allison 2021-08-24 19:54:54 UTC
Stewart - can you confirm this patch fixes the issue for you, please?
Comment 8 Samba QA Contact 2021-08-25 17:10:04 UTC
This bug was referenced in samba master:

b4d8c62c4e8191e05fd03dd096a0bc989e224ed3
857045f3a236dea125200dd09279d677e513682b
Comment 9 Stewart A. 2021-08-25 17:55:57 UTC
Rebuilt my OS package (samba-4.14.6-0.fc34.x86_64) yesterday with the patch and can confirm it fixes the core dumps, and Big Sur can now configure the networked backup destination :)

Thanks!
Comment 10 Jeremy Allison 2021-08-26 00:03:45 UTC
Created attachment 16753
git-am fix for 4.15.rcnext, 4.14.next.

Cherry-picked from master. Applies cleanly to 4.15.rcNext, 4.14.next.
Comment 11 Ralph Böhme 2021-09-05 13:48:26 UTC
Reassigning to Jule for inclusion in 4.14 and 4.15.
Comment 12 Jule Anger 2021-09-06 12:00:06 UTC
Pushed to autobuild-v4-{15,14}-test.
Comment 13 Samba QA Contact 2021-09-06 20:44:02 UTC
This bug was referenced in samba v4-15-test:

2ed234deee381cd15d7b7867136c5bbd78f5448c
57b266e23c459c8d0675ec17c8a5275f9c797781
Comment 14 Samba QA Contact 2021-09-07 08:43:45 UTC
This bug was referenced in samba v4-15-stable (Release samba-4.15.0rc5):

2ed234deee381cd15d7b7867136c5bbd78f5448c
57b266e23c459c8d0675ec17c8a5275f9c797781
Comment 15 Samba QA Contact 2021-09-07 11:13:29 UTC
This bug was referenced in samba v4-14-test:

b00fed3b698cc78a377d71e0574c878e262c4808
97dc8c0dcccbcecd3a8f8f3872b47d3a3c6e8036
Comment 16 Jule Anger 2021-09-07 11:52:19 UTC
Closing out bug report.

Thanks!