Bug 14283 - panic when trying to list directory with msdfs link
Summary: panic when trying to list directory with msdfs link
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.10.13
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
Depends on:
Reported: 2020-02-19 17:24 UTC by cwseys
Modified: 2020-02-24 08:17 UTC (History)
1 user (show)

See Also:

git-am fix for 4.10.next, 4.11.next. (1.85 KB, patch)
2020-02-19 21:23 UTC, Jeremy Allison
slow: review+

Note You need to log in before you can comment on or make changes to this bug.
Description cwseys 2020-02-19 17:24:52 UTC
When trying to list the contents of a directory containing a msdfs link in 4.10.13, samba starts panicking and does not list the directory.

Using the Van Belle package for Debian Buster.

# create an msdfs link in a directory served by Samba.
# initial setup, /srv/smb is share [smb]
mkdir -p /srv/smb/test/goal
touch /srv/smb/test/goal/you_made_it
# fix up permissions
cd /srv/smb/test
# link to the directory 'goal' on the same server
ln -s msdfs://smb09.physics.wisc.edu/smb/test/goal goallink

# On a different computer, browsing the server (smb09) works
# until one tries to list the 'test' directory.
# then panics occur.  Tested from Mac Catalina and cifs.ko 4.19
# with same result.

[2020/02/19 08:19:23.788284,  0] ../../lib/util/fault.c:79(fault_report)
[2020/02/19 08:19:23.788472,  0] ../../lib/util/fault.c:80(fault_report)
  INTERNAL ERROR: Signal 11 in pid 32606 (4.10.13-Debian)
  If you are running a recent Samba version, and if you think this problem is not yet fixed in the latest versions, please consider reporting this bug, see https://wiki.samba.org/index.php/Bug_Reporting
[2020/02/19 08:19:23.788546,  0] ../../lib/util/fault.c:86(fault_report)
[2020/02/19 08:19:23.788594,  0] ../../source3/lib/util.c:824(smb_panic_s3)
  PANIC (pid 32606): internal error
[2020/02/19 08:19:23.789923,  0] ../../lib/util/fault.c:265(log_stack_trace)
  BACKTRACE: 34 stack frames:
   #0 /usr/lib/x86_64-linux-gnu/libsamba-util.so.0(log_stack_trace+0x30) [0x7fdd44919030]
   #1 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(smb_panic_s3+0x18) [0x7fdd448be9c8]
   #2 /usr/lib/x86_64-linux-gnu/libsamba-util.so.0(smb_panic+0x2d) [0x7fdd4491912d]
   #3 /usr/lib/x86_64-linux-gnu/libsamba-util.so.0(+0x1438d) [0x7fdd4491938d]
   #4 /lib/x86_64-linux-gnu/libpthread.so.0(+0x12730) [0x7fdd42d72730]
   #5 /usr/lib/x86_64-linux-gnu/samba/vfs/full_audit.so(+0x6d18) [0x7fdd3db85d18]
   #6 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0x1cd59e) [0x7fdd44b5659e]
   #7 /usr/lib/x86_64-linux-gnu/libtalloc.so.2(talloc_free_children+0x67f) [0x7fdd42ff891f]
   #8 /usr/lib/x86_64-linux-gnu/libsamba-util.so.0(+0x20750) [0x7fdd44925750]
   #9 /usr/lib/x86_64-linux-gnu/libtalloc.so.2(_talloc_free+0x384) [0x7fdd42ff0f84]
   #10 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(get_referred_path+0x1dd) [0x7fdd44b57f8d]
   #11 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0xa653e) [0x7fdd44a2f53e]
   #12 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(setup_dfs_referral+0x99) [0x7fdd44b58559]
   #13 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(smb2_ioctl_dfs+0x223) [0x7fdd44b6f6d3]
   #14 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(smbd_smb2_request_process_ioctl+0x343) [0x7fdd44b6f1c3]
   #15 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(smbd_smb2_request_dispatch+0xcb1) [0x7fdd44b60b31]
   #16 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0x1d88a0) [0x7fdd44b618a0]
   #17 /usr/lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_invoke_fd_handler+0x7f) [0x7fdd42d966af]
   #18 /usr/lib/x86_64-linux-gnu/libtevent.so.0(+0xda4f) [0x7fdd42d9ca4f]
   #19 /usr/lib/x86_64-linux-gnu/libtevent.so.0(+0xbab7) [0x7fdd42d9aab7]
   #20 /usr/lib/x86_64-linux-gnu/libtevent.so.0(_tevent_loop_once+0x91) [0x7fdd42d95dd1]
   #21 /usr/lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7fdd42d9605b]
   #22 /usr/lib/x86_64-linux-gnu/libtevent.so.0(+0xba57) [0x7fdd42d9aa57]
   #23 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(smbd_process+0x76f) [0x7fdd44b50b0f]
   #24 /usr/sbin/smbd(+0xbf00) [0x561ecfa98f00]
   #25 /usr/lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_invoke_fd_handler+0x7f) [0x7fdd42d966af]
   #26 /usr/lib/x86_64-linux-gnu/libtevent.so.0(+0xda4f) [0x7fdd42d9ca4f]
   #27 /usr/lib/x86_64-linux-gnu/libtevent.so.0(+0xbab7) [0x7fdd42d9aab7]
   #28 /usr/lib/x86_64-linux-gnu/libtevent.so.0(_tevent_loop_once+0x91) [0x7fdd42d95dd1]
   #29 /usr/lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7fdd42d9605b]
   #30 /usr/lib/x86_64-linux-gnu/libtevent.so.0(+0xba57) [0x7fdd42d9aa57]
   #31 /usr/sbin/smbd(main+0x1d66) [0x561ecfa95dd6]
   #32 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7fdd42bc309b]
   #33 /usr/sbin/smbd(_start+0x2a) [0x561ecfa95f8a]
[2020/02/19 08:19:23.790660,  0] ../../source3/lib/util.c:836(smb_panic_s3)
  smb_panic(): calling panic action [/usr/share/samba/panic-action 32606]
mail: cannot stat `/smb/home/cwseys/.mail': Permission denied
mail: Cannot open `/smb/home/cwseys/.mailrc': Permission denied
[2020/02/19 08:19:24.221700,  0] ../../source3/lib/util.c:844(smb_panic_s3)
  smb_panic(): action returned status 0
[2020/02/19 08:19:24.223265,  0] ../../source3/lib/dumpcore.c:315(dump_core)
  dumping core in /var/log/samba/cores/smbd
[2020/02/19 08:19:24.296148,  1] ../../source3/smbd/server.c:917(remove_child_pid)
  Scheduled cleanup of brl and lock database after unclean shutdown
[2020/02/19 08:19:44.316053,  1] ../../source3/smbd/smbd_cleanupd.c:99(smbd_cleanupd_unlock)
  smbd_cleanupd_unlock: Cleaning up brl and lock database after unclean shutdown

# Global parameters
        dns proxy = No
        hostname lookups = Yes   
        kerberos method = secrets and keytab
        logging = file
        max log size = 100000
        ntlm auth = disabled
        panic action = /usr/share/samba/panic-action %d
        realm = PHYSICS.WISC.EDU
        security = USER
        server signing = required
        server string = %h server
        workgroup = PHYSICS
        fruit:nfs_aces = no
        idmap config * : backend = tdb
        postexec = /usr/bin/logger -t smbd 'disconnect of %S on %L by %u at %m server process %d'
        preexec = /usr/bin/logger -t smbd 'connecting to %S on %L by %u at %m os %a using %R ip %I server process %d'
        vfs objects = dfs_samba4

        delete veto files = Yes
        inherit acls = Yes
        inherit permissions = Yes
        msdfs root = Yes
        path = /srv/smb
        read only = No
        smb encrypt = desired
        veto files = /.DS_Store/._.DS_Store/
        vfs objects = full_audit catia fruit streams_xattr
        fruit:veto_appledouble = no
        fruit:encoding = native
        full_audit:failure = connect
        full_audit:success = disconnect open rename rmdir unlink
        full_audit:prefix = %u|%D|%I|%a|%R|%d

Thanks for you work!
Comment 1 Jeremy Allison 2020-02-19 17:33:32 UTC
Can you get me a backtrace place.

Install the symbols package, then add this line to the [global] section of your smb.conf:

panic action = /bin/sleep 9999999

Then reproduce the crash, the crashed smbd will be waiting for the sleep to finish.

Attach to the crashed process using:

gdb /usr/sbin/smbd
(gdb) attach <pid>

where <pid> is the crashed process.

Then type

(gdb) bt

to get a full stack backtrace. Thanks !
Comment 2 cwseys 2020-02-19 19:41:55 UTC
Looks like it was caused by full_audit vfs.  Removing it from the config eliminated the backtrace.
(gdb) bt
#0  0x00007fdd42c65437 in __GI___waitpid (pid=391, stat_loc=stat_loc@entry=0x7ffd50633008, options=options@entry=0)
    at ../sysdeps/unix/sysv/linux/waitpid.c:30
#1  0x00007fdd42be35df in do_system (line=line@entry=0x561ed0c7e880 "/bin/sleep 9999999")
    at ../sysdeps/posix/system.c:149
#2  0x00007fdd42be39ca in __libc_system (line=line@entry=0x561ed0c7e880 "/bin/sleep 9999999")
    at ../sysdeps/posix/system.c:185
#3  0x00007fdd448bea27 in smb_panic_s3 (why=<optimized out>) at ../../source3/lib/util.c:837
#4  0x00007fdd4491912d in smb_panic (why=why@entry=0x7fdd4492dfe6 "internal error") at ../../lib/util/fault.c:174
#5  0x00007fdd4491938d in fault_report (sig=11) at ../../lib/util/fault.c:88
#6  sig_fault (sig=11) at ../../lib/util/fault.c:99
#7  <signal handler called>
#8  0x00007fdd3db78d18 in audit_prefix (conn=0x561ed0c24100, ctx=0x561ed0c66c70)
    at ../../source3/modules/vfs_full_audit.c:488
#9  do_log (op=SMB_VFS_OP_DISCONNECT, success=<optimized out>, handle=<optimized out>, format=<optimized out>)
    at ../../source3/modules/vfs_full_audit.c:633
#10 0x00007fdd44b5659e in conn_struct_tos_destructor (c=0x561ed0c66f50) at ../../source3/smbd/msdfs.c:386
#11 0x00007fdd42ff891f in talloc_free_children () from /usr/lib/x86_64-linux-gnu/libtalloc.so.2
#12 0x00007fdd44925750 in talloc_pop (frame=0x561ed0c66c70) at ../../lib/util/talloc_stack.c:125
#13 0x00007fdd42ff0f84 in _talloc_free () from /usr/lib/x86_64-linux-gnu/libtalloc.so.2
#14 0x00007fdd44b57f8d in get_referred_path (ctx=ctx@entry=0x561ed0c66980,
    dfs_path=dfs_path@entry=0x561ed0c66b41 "/smb09.physics.wisc.edu/smb/cwseys_test/goallink",
    remote_address=<optimized out>, local_address=0x561ed0c60d70, allow_broken_path=<optimized out>,
    jucn=jucn@entry=0x561ed0c66be0, consumedcntp=0x7ffd50633b3c, self_referralp=0x7ffd50633b3b)
    at ../../source3/smbd/msdfs.c:1181
#15 0x00007fdd44a2f53e in vfswrap_get_dfs_referrals (handle=0x561ed0c64610, r=0x561ed0c66980)
    at ../../source3/modules/vfs_default.c:226
#16 0x00007fdd44b58559 in setup_dfs_referral (orig_conn=orig_conn@entry=0x561ed0c57c30,
    dfs_path=0x561ed0c667f0 "//smb09.physics.wisc.edu/smb/cwseys_test/goallink",
    max_referral_level=max_referral_level@entry=3, ppdata=ppdata@entry=0x561ed0c88b40,
    pstatus=pstatus@entry=0x7ffd50633d68) at ../../source3/smbd/msdfs.c:1218
#17 0x00007fdd44b206f3 in call_trans2getdfsreferral (pparams=0x561ed0c88b30, total_data=<optimized out>,
    max_data_bytes=4000, ppdata=0x561ed0c88b40, total_params=102, req=0x561ed0c66650, conn=0x561ed0c57c30)
    at ../../source3/smbd/trans2.c:9266
#18 handle_trans2 (conn=conn@entry=0x561ed0c57c30, req=req@entry=0x561ed0c66650, state=state@entry=0x561ed0c88ad0)
    at ../../source3/smbd/trans2.c:9543
#19 0x00007fdd44b23210 in reply_trans2 (req=0x561ed0c66650) at ../../source3/smbd/trans2.c:9728
#20 0x00007fdd44b4c3cb in switch_message (type=<optimized out>, req=req@entry=0x561ed0c66650)
    at ../../source3/smbd/process.c:1716
#21 0x00007fdd44b4e6f1 in construct_reply (deferred_pcd=0x0, encrypted=<optimized out>, seqnum=24, unread_bytes=0,
    size=174, inbuf=<optimized out>, xconn=0x561ed0c58510) at ../../source3/smbd/process.c:1752
#22 process_smb (xconn=xconn@entry=0x561ed0c58510, inbuf=<optimized out>, nread=174, unread_bytes=0, seqnum=24,
    encrypted=<optimized out>, deferred_pcd=0x0) at ../../source3/smbd/process.c:2007
#23 0x00007fdd44b4f76d in smbd_server_connection_read_handler (xconn=0x561ed0c58510, fd=<optimized out>)
    at ../../source3/smbd/process.c:2607
#24 0x00007fdd42d966af in tevent_common_invoke_fd_handler (fde=fde@entry=0x561ed0c41110, flags=<optimized out>,
    removed=removed@entry=0x0) at ../../tevent_fd.c:138
#25 0x00007fdd42d9ca4f in epoll_event_loop (tvalp=0x7ffd50634050, epoll_ev=0x561ed0c53db0)
    at ../../tevent_epoll.c:736  
#26 epoll_event_loop_once (ev=<optimized out>, location=<optimized out>) at ../../tevent_epoll.c:937
#27 0x00007fdd42d9aab7 in std_event_loop_once (ev=0x561ed0c23020,
    location=0x7fdd44c6edd8 "../../source3/smbd/process.c:4127") at ../../tevent_standard.c:110
#28 0x00007fdd42d95dd1 in _tevent_loop_once (ev=ev@entry=0x561ed0c23020,
    location=location@entry=0x7fdd44c6edd8 "../../source3/smbd/process.c:4127") at ../../tevent.c:772
#29 0x00007fdd42d9605b in tevent_common_loop_wait (ev=0x561ed0c23020,
    location=0x7fdd44c6edd8 "../../source3/smbd/process.c:4127") at ../../tevent.c:895
#30 0x00007fdd42d9aa57 in std_event_loop_wait (ev=0x561ed0c23020,
    location=0x7fdd44c6edd8 "../../source3/smbd/process.c:4127") at ../../tevent_standard.c:141
#31 0x00007fdd44b50b0f in smbd_process (ev_ctx=0x561ed0c23020, msg_ctx=<optimized out>, sock_fd=41,
    interactive=<optimized out>) at ../../source3/smbd/process.c:4127
#32 0x0000561ecfa98f00 in smbd_accept_connection (ev=0x561ed0c23020, fde=<optimized out>, flags=<optimized out>,
    private_data=<optimized out>) at ../../source3/smbd/server.c:1067
#33 0x00007fdd42d966af in tevent_common_invoke_fd_handler (fde=fde@entry=0x561ed0c55b70, flags=<optimized out>,
    removed=removed@entry=0x0) at ../../tevent_fd.c:138
#34 0x00007fdd42d9ca4f in epoll_event_loop (tvalp=0x7ffd506342c0, epoll_ev=0x561ed0c367a0)
    at ../../tevent_epoll.c:736  
#35 epoll_event_loop_once (ev=<optimized out>, location=<optimized out>) at ../../tevent_epoll.c:937
#36 0x00007fdd42d9aab7 in std_event_loop_once (ev=0x561ed0c23020,
    location=0x561ecfa9f788 "../../source3/smbd/server.c:1423") at ../../tevent_standard.c:110
#37 0x00007fdd42d95dd1 in _tevent_loop_once (ev=ev@entry=0x561ed0c23020,
    location=location@entry=0x561ecfa9f788 "../../source3/smbd/server.c:1423") at ../../tevent.c:772
#38 0x00007fdd42d9605b in tevent_common_loop_wait (ev=0x561ed0c23020,
    location=0x561ecfa9f788 "../../source3/smbd/server.c:1423") at ../../tevent.c:895
#39 0x00007fdd42d9aa57 in std_event_loop_wait (ev=0x561ed0c23020,
    location=0x561ecfa9f788 "../../source3/smbd/server.c:1423") at ../../tevent_standard.c:141
#40 0x0000561ecfa95dd6 in smbd_parent_loop (parent=0x561ed0c37530, ev_ctx=0x561ed0c23020)
    at ../../source3/smbd/server.c:1423
Comment 3 Jeremy Allison 2020-02-19 20:03:46 UTC
Interesting, looks like a talloc hierarchy problem. Some of this is allocated off talloc_tos(), but I can't immediately see the problem.
Comment 4 Jeremy Allison 2020-02-19 20:07:12 UTC
I think it's hitting a null pointer here:

 488         result = talloc_sub_full(ctx,
 489                         lp_servicename(talloc_tos(), SNUM(conn)),
 490                         conn->session_info->unix_info->unix_name,
 491                         conn->connectpath,
 492                         conn->session_info->unix_token->gid,
 493                         conn->session_info->unix_info->sanitized_username,
 494                         conn->session_info->info->domain_name,
 495                         prefix);

inside audit_prefix(). It's a temporary conn pointer, so I wonder if it's still valid at the teardown point.
Comment 5 Jeremy Allison 2020-02-19 20:13:45 UTC
Yep, inside: msdfs.c:create_conn_struct_as_root() we have:

 302         if (session_info != NULL) {
 303                 conn->session_info = copy_session_info(conn, session_info);
 304                 if (conn->session_info == NULL) {
 305                         DEBUG(0, ("copy_serverinfo failed\n"));
 306                         TALLOC_FREE(conn);
 307                         return NT_STATUS_NO_MEMORY;
 308                 }
 309                 /* unix_info could be NULL in session_info */
 310                 if (conn->session_info->unix_info != NULL) {
 311                         vfs_user = conn->session_info->unix_info->unix_name;
 312                 } else {
 313                         vfs_user = get_current_username();
 314                 }

so the copied conn->session_info->unix_info can be NULL, and it's not protected inside the audit_prefix() code in vfs_full_audit.c.
Comment 6 Jeremy Allison 2020-02-19 20:14:15 UTC
So this is actually a vfs_full_audit.c bug, not an MSDFS one.
Comment 7 Jeremy Allison 2020-02-19 20:16:09 UTC
Yep, get_referred_path() calls:

1110         status = create_conn_struct_tos_cwd(global_messaging_context(),
1111                                             snum,
1112                                             lp_path(frame, snum),
1113                                             NULL,
1114                                             &c);

- note it's passing NULL for the 'const struct auth_session_info *session_info' argument.
Comment 8 Ralph Böhme 2020-02-19 20:17:58 UTC
(In reply to Jeremy Allison from comment #7)
conn->session_info can be NULL for conns created via create_conn_struct_tos_cwd() and friends in Samba versions before 4.12. This should be fixed in 4.12 by the patchset leading up to 1fa05e5c76e. Changes are too complex for a backport.
Comment 9 Ralph Böhme 2020-02-19 20:23:59 UTC
(In reply to Ralph Böhme from comment #8)
Thinking about it, we may want to add a check for conn->session_info != NULL to full_audit as a fix for older verions... :)
Comment 10 Jeremy Allison 2020-02-19 20:42:14 UTC
(In reply to Ralph Böhme from comment #9)

Yeah, that was the fix I was going for for older versions :-).
Comment 11 Jeremy Allison 2020-02-19 21:23:44 UTC
Created attachment 15801 [details]
git-am fix for 4.10.next, 4.11.next.

cwseys@physics.wisc.edu, can you test this patch. Should fix your problem.

Thanks !

Comment 12 Ralph Böhme 2020-02-20 10:27:46 UTC
Reassiging to Karolin for inclusion in 4.10 and 4.11 (NOT in 4.12).
Comment 13 Karolin Seeger 2020-02-20 10:29:57 UTC
(In reply to Ralph Böhme from comment #12)
Pushed to autobuild-v4-{10,11}-test.
Comment 14 cwseys 2020-02-20 21:44:25 UTC
(In reply to Jeremy Allison from comment #11)
Thanks for the super fast bug fix!  I don't have time to recompile at the moment.  You all seem to be fairly confident. :)  Definitely will test when 4.10.14 comes out though. :)
Comment 15 Jeremy Allison 2020-02-20 22:05:08 UTC
Well I'd be happier if you could test it, but I'm pretty confident this is it, yeah :-).

It also doesn't affect 4.12 for which Ralph already fixed things properly.
Comment 16 Karolin Seeger 2020-02-24 08:17:25 UTC
(In reply to Karolin Seeger from comment #13)
Pushed to both branches.
Closing out bug report.