Bug 11375 - Panic in 4.2.2 around file_close_user > smbXsrv_session_logoff > smbXsrv_session_destructor
Status: RESOLVED DUPLICATE of bug 11394
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.2.2
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assigned To: Jeremy Allison
QA Contact: Samba QA Contact
Depends on:
Blocks: 11394
Reported: 2015-07-02 02:15 UTC by Nick Semenkovich
Modified: 2015-12-07 17:11 UTC

See Also:


Attachments
git-am test patch for master. (1.38 KB, patch)
2015-11-21 01:08 UTC, Jeremy Allison

Description Nick Semenkovich 2015-07-02 02:15:38 UTC
Just saw this. It's rare: 4.2.2 had been running with no crashes since its release.

Samba 4.2.2 (from git) on Ubuntu 15.10
All clients are Windows 8.1

==============================================================

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fe5acc6184a in __GI___waitpid (pid=1448, stat_loc=stat_loc@entry=0x7ffd7e121390, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:31
#0  0x00007fe5acc6184a in __GI___waitpid (pid=1448, stat_loc=stat_loc@entry=0x7ffd7e121390, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:31
        resultvar = 18446744073709551104
        oldtype = <optimized out>
        result = <optimized out>
#1  0x00007fe5acbdaffb in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
        __result = <optimized out>
        _buffer = {__routine = 0x7fe5acbdb2f0 <cancel_handler>, __arg = 0x7ffd7e12136c, __canceltype = 0, __prev = 0x0}
        _avail = 1
        status = 0
        save = <optimized out>
        pid = 1448
        sa = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = {__val = {65536, 0 <repeats 15 times>}}, sa_flags = 0, sa_restorer = 0x7fe5b3aa0870}
        omask = {__val = {7296, 140624475828668, 140624525450416, 140624525450416, 140726718567648, 140624490929840, 140726718572608, 140624488632464, 140624490929840, 140726718572608, 0, 0, 0, 140624488716701, 1, 0}}
#2  0x00007fe5ae2d2662 in smb_panic_s3 (why=0x7fe5b06efaed "internal error") at ../source3/lib/util.c:801
        cmd = 0x7fe5b3aa0870 "/home/semenko/panic-action 1765"
        result = 0
        __FUNCTION__ = "smb_panic_s3"
#3  0x00007fe5b06e8f21 in smb_panic (why=0x7fe5b06efaed "internal error") at ../lib/util/fault.c:166
No locals.
#4  0x00007fe5b06e8bf9 in fault_report (sig=11) at ../lib/util/fault.c:83
        counter = 1
        __FUNCTION__ = "fault_report"
#5  0x00007fe5b06e8c0e in sig_fault (sig=11) at ../lib/util/fault.c:94
No locals.
#6  <signal handler called>
No locals.
#7  0x00007fe5b01ad596 in file_close_user (sconn=0x0, vuid=2494715235) at ../source3/smbd/files.c:250
        fsp = 0x7fe500000000
        next = 0x7fe5b439e3e0
#8  0x00007fe5b029e9bb in smbXsrv_session_logoff (session=0x7fe5b31f5660) at ../source3/smbd/smbXsrv_session.c:1583
        table = 0x7fe5b3f70dc0
        local_rec = 0x0
        global_rec = 0x0
        sconn = 0x0
        status = {v = 0}
        error = {v = 0}
        __FUNCTION__ = "smbXsrv_session_logoff"
#9  0x00007fe5b029d537 in smbXsrv_session_clear_and_logoff (session=0x7fe5b31f5660) at ../source3/smbd/smbXsrv_session.c:1118
        status = {v = 0}
        xconn = 0x0
#10 0x00007fe5b029d557 in smbXsrv_session_destructor (session=0x7fe5b31f5660) at ../source3/smbd/smbXsrv_session.c:1126
        status = {v = 3005175392}
        __FUNCTION__ = "smbXsrv_session_destructor"
#11 0x00007fe5afcd145c in _talloc_free_internal (ptr=0x7fe5b31f5660, location=0x7fe5b040ea40 "../source3/smbd/server_exit.c:230") at ../lib/talloc/talloc.c:993
        d = 0x7fe5b029d53f <smbXsrv_session_destructor>
        tc = 0x7fe5b31f5600
        ptr_to_free = 0x7fe5b3f36b80
#12 0x00007fe5afcd2593 in _talloc_free_children_internal (tc=0x7fe5b3f70d60, ptr=0x7fe5b3f70dc0, location=0x7fe5b040ea40 "../source3/smbd/server_exit.c:230") at ../lib/talloc/talloc.c:1472
        child = 0x7fe5b31f5660
        new_parent = 0x7fe5b2e25500
#13 0x00007fe5afcd160d in _talloc_free_internal (ptr=0x7fe5b3f70dc0, location=0x7fe5b040ea40 "../source3/smbd/server_exit.c:230") at ../lib/talloc/talloc.c:1019
        tc = 0x7fe5b3f70d60
        ptr_to_free = 0x7fe5b378d0a0
#14 0x00007fe5afcd2593 in _talloc_free_children_internal (tc=0x7fe5b3a82800, ptr=0x7fe5b3a82860, location=0x7fe5b040ea40 "../source3/smbd/server_exit.c:230") at ../lib/talloc/talloc.c:1472
        child = 0x7fe5b3f70dc0
        new_parent = 0x7fe5b2e25500
#15 0x00007fe5afcd160d in _talloc_free_internal (ptr=0x7fe5b3a82860, location=0x7fe5b040ea40 "../source3/smbd/server_exit.c:230") at ../lib/talloc/talloc.c:1019
        tc = 0x7fe5b3a82800
        ptr_to_free = 0x7fe5b3d206d0
#16 0x00007fe5afcd29a0 in _talloc_free (ptr=0x7fe5b3a82860, location=0x7fe5b040ea40 "../source3/smbd/server_exit.c:230") at ../lib/talloc/talloc.c:1594
        tc = 0x7fe5b3a82800
#17 0x00007fe5b02a68b0 in exit_server_common (how=SERVER_EXIT_NORMAL, reason=0x7fe5af0a1be3 "NT_STATUS_IO_TIMEOUT") at ../source3/smbd/server_exit.c:230
        client = 0x0
        xconn = 0x0
        sconn = 0x0
        msg_ctx = 0x7fe5b2e369b0
        __FUNCTION__ = "exit_server_common"
#18 0x00007fe5b02a69ee in smbd_exit_server_cleanly (explanation=0x7fe5af0a1be3 "NT_STATUS_IO_TIMEOUT") at ../source3/smbd/server_exit.c:263
No locals.
#19 0x00007fe5adc8de70 in exit_server_cleanly (reason=0x7fe5af0a1be3 "NT_STATUS_IO_TIMEOUT") at ../source3/lib/smbd_shim.c:131
No locals.
#20 0x00007fe5b0271f51 in smbd_server_connection_terminate_ex (xconn=0x7fe5b39e2070, reason=0x7fe5af0a1be3 "NT_STATUS_IO_TIMEOUT", location=0x7fe5b03fea38 "../source3/smbd/smb2_server.c:3498") at ../source3/smbd/smb2_server.c:1050
        __FUNCTION__ = "smbd_server_connection_terminate_ex"
#21 0x00007fe5b0279d30 in smbd_smb2_connection_handler (ev=0x7fe5b2e368c0, fde=0x7fe5b35d0f60, flags=1, private_data=0x7fe5b39e2070) at ../source3/smbd/smb2_server.c:3498
        xconn = 0x7fe5b39e2070
        status = {v = 3221225653}
#22 0x00007fe5ae2f2d26 in run_events_poll (ev=0x7fe5b2e368c0, pollrtn=1, pfds=0x7fe5b37ad7e0, num_pfds=5) at ../source3/lib/events.c:257
        pfd = 0x7fe5b37ad800
        flags = 1
        state = 0x7fe5b2e378d0
        pollfd_idx = 0x7fe5b3f597f0
        fde = 0x7fe5b35d0f60
        __FUNCTION__ = "run_events_poll"
#23 0x00007fe5ae2f2fb5 in s3_event_loop_once (ev=0x7fe5b2e368c0, location=0x7fe5b03f5ff0 "../source3/smbd/process.c:3992") at ../source3/lib/events.c:326
        state = 0x7fe5b2e378d0
        timeout = 60000
        num_pfds = 5
        ret = 1
        poll_errno = 0
#24 0x00007fe5af8be539 in _tevent_loop_once (ev=0x7fe5b2e368c0, location=0x7fe5b03f5ff0 "../source3/smbd/process.c:3992") at ../lib/tevent/tevent.c:533
        ret = 0
        nesting_stack_ptr = 0x0
#25 0x00007fe5af8be783 in tevent_common_loop_wait (ev=0x7fe5b2e368c0, location=0x7fe5b03f5ff0 "../source3/smbd/process.c:3992") at ../lib/tevent/tevent.c:637
        ret = 0
#26 0x00007fe5af8be84e in _tevent_loop_wait (ev=0x7fe5b2e368c0, location=0x7fe5b03f5ff0 "../source3/smbd/process.c:3992") at ../lib/tevent/tevent.c:656
No locals.
#27 0x00007fe5b025afe4 in smbd_process (ev_ctx=0x7fe5b2e368c0, msg_ctx=0x7fe5b2e369b0, sock_fd=47, interactive=false) at ../source3/smbd/process.c:3992
        trace_state = {frame = 0x7fe5b47e60e0, smbd_idle_profstamp = 0}
        client = 0x7fe5b3a82860
        sconn = 0x7fe5b3d20730
        xconn = 0x7fe5b39e2070
        locaddr = 0x7fe5b3bbc110 "\200\240\034\263\345\177"
        remaddr = 0x7fe5b40ee8e0 "/usr/local/samba/private/smbd.tmp/msg/msg.1765.1"
        ret = 32741
        status = {v = 0}
        __FUNCTION__ = "smbd_process"
#28 0x00007fe5b0d4716b in smbd_accept_connection (ev=0x7fe5b2e368c0, fde=0x7fe5b363cca0, flags=1, private_data=0x7fe5b35d0f60) at ../source3/smbd/server.c:627
        status = {v = 0}
        s = 0x0
        msg_ctx = 0x7fe5b2e369b0
        addr = {ss_family = 2, __ss_align = 0, __ss_padding = '\000' <repeats 16 times>, "(\225\066\263\345\177\000\000\220\"\022~\375\177\000\000\020\"\022~\375\177\000\000\221Vn\260\345\177\000\000(\225\066\263\345\177\000\000\220\"\022~\375\177\000\000;\000\000\000\000\000\000\000?<\017\000\000\000\000\000\260\"\022~\375\177\000\000\263'/\256\345\177\000\000\254|\204U\000\000\000\000\330\"\022~\375\177\000"}
        in_addrlen = 16
        fd = 47
        pid = 0
        unique_id = 14598152785591174327
        __FUNCTION__ = "smbd_accept_connection"
#29 0x00007fe5ae2f2d26 in run_events_poll (ev=0x7fe5b2e368c0, pollrtn=1, pfds=0x7fe5b37ad7e0, num_pfds=8) at ../source3/lib/events.c:257
        pfd = 0x7fe5b37ad810
        flags = 1
        state = 0x7fe5b2e378d0
        pollfd_idx = 0x7fe5b30e3420
        fde = 0x7fe5b363cca0
        __FUNCTION__ = "run_events_poll"
#30 0x00007fe5ae2f2fb5 in s3_event_loop_once (ev=0x7fe5b2e368c0, location=0x7fe5b0d4beea "../source3/smbd/server.c:985") at ../source3/lib/events.c:326
        state = 0x7fe5b2e378d0
        timeout = 59999
        num_pfds = 8
        ret = 1
        poll_errno = 0
#31 0x00007fe5af8be539 in _tevent_loop_once (ev=0x7fe5b2e368c0, location=0x7fe5b0d4beea "../source3/smbd/server.c:985") at ../lib/tevent/tevent.c:533
        ret = 0
        nesting_stack_ptr = 0x0
#32 0x00007fe5af8be783 in tevent_common_loop_wait (ev=0x7fe5b2e368c0, location=0x7fe5b0d4beea "../source3/smbd/server.c:985") at ../lib/tevent/tevent.c:637
        ret = 0
#33 0x00007fe5af8be84e in _tevent_loop_wait (ev=0x7fe5b2e368c0, location=0x7fe5b0d4beea "../source3/smbd/server.c:985") at ../lib/tevent/tevent.c:656
No locals.
#34 0x00007fe5b0d47f81 in smbd_parent_loop (ev_ctx=0x7fe5b2e368c0, parent=0x7fe5b2e36b30) at ../source3/smbd/server.c:985
        trace_state = {frame = 0x7fe5b2e375b0}
        ret = 0
        __FUNCTION__ = "smbd_parent_loop"
#35 0x00007fe5b0d498df in main (argc=4, argv=0x7ffd7e122848) at ../source3/smbd/server.c:1626
        is_daemon = true
        interactive = false
        Fork = false
        no_process_group = false
        log_stdout = false
        ports = 0x0
        profile_level = 0x0
        opt = -1
        pc = 0x7fe5b2e27100
        print_build_options = false
        long_options = {{longName = 0x0, shortName = 0 '\000', argInfo = 4, arg = 0x7fe5ad16c3c0 <poptHelpOptions>, val = 0, descrip = 0x7fe5b0d4bfe9 "Help options:", argDescrip = 0x0}, {longName = 0x7fe5b0d4bff7 "daemon", shortName = 68 'D', argInfo = 0, arg = 0x0, val = 1000, descrip = 0x7fe5b0d4bffe "Become a daemon (default)", argDescrip = 0x0}, {longName = 0x7fe5b0d4c018 "interactive", shortName = 105 'i', argInfo = 0, arg = 0x0, val = 1001, descrip = 0x7fe5b0d4c028 "Run interactive (not a daemon)", argDescrip = 0x0}, {longName = 0x7fe5b0d4c047 "foreground", shortName = 70 'F', argInfo = 0, arg = 0x0, val = 1002, descrip = 0x7fe5b0d4c058 "Run daemon in foreground (for daemontools, etc.)", argDescrip = 0x0}, {longName = 0x7fe5b0d4c089 "no-process-group", shortName = 0 '\000', argInfo = 0, arg = 0x0, val = 1003, descrip = 0x7fe5b0d4c0a0 "Don't create a new process group", argDescrip = 0x0}, {longName = 0x7fe5b0d4c0c1 "log-stdout", shortName = 83 'S', argInfo = 0, arg = 0x0, val = 1004, descrip = 0x7fe5b0d4c0cc "Log to stdout", argDescrip = 0x0}, {longName = 0x7fe5b0d4c0da "build-options", shortName = 98 'b', argInfo = 0, arg = 0x0, val = 98, descrip = 0x7fe5b0d4c0e8 "Print build options", argDescrip = 0x0}, {longName = 0x7fe5b0d4c0fc "port", shortName = 112 'p', argInfo = 1, arg = 0x7ffd7e122430, val = 0, descrip = 0x7fe5b0d4c101 "Listen on the specified ports", argDescrip = 0x0}, {longName = 0x7fe5b0d4c11f "profiling-level", shortName = 80 'P', argInfo = 1, arg = 0x7ffd7e122438, val = 0, descrip = 0x7fe5b0d4c12f "Set profiling level", argDescrip = 0x7fe5b0d4c143 "PROFILE_LEVEL"}, {longName = 0x0, shortName = 0 '\000', argInfo = 4, arg = 0x7fe5ae96d380 <popt_common_samba>, val = 0, descrip = 0x7fe5b0d4c151 "Common samba options:", argDescrip = 0x0}, {longName = 0x0, shortName = 0 '\000', argInfo = 0, arg = 0x0, val = 0, descrip = 0x0, argDescrip = 0x0}}
        parent = 0x7fe5b2e36b30
        frame = 0x7fe5b2e255e0
        status = {v = 0}
        ev_ctx = 0x7fe5b2e368c0
        msg_ctx = 0x7fe5b2e369b0
        server_id = {pid = 7812, task_id = 0, vnn = 4294967295, unique_id = 7911962480482536927}
        se = 0x7fe5b2e41ca0
        np_dir = 0x7fe5b48e8410 "dNSTombstoned"
        smbd_shim_fns = {cancel_pending_lock_requests_by_fid = 0x7fe5b023682e <smbd_cancel_pending_lock_requests_by_fid>, send_stat_cache_delete_message = 0x7fe5b0240f24 <smbd_send_stat_cache_delete_message>, change_to_root_user = 0x7fe5b021df68 <smbd_change_to_root_user>, become_authenticated_pipe_user = 0x7fe5b021e01e <smbd_become_authenticated_pipe_user>, unbecome_authenticated_pipe_user = 0x7fe5b021e110 <smbd_unbecome_authenticated_pipe_user>, contend_level2_oplocks_begin = 0x7fe5b02b3341 <smbd_contend_level2_oplocks_begin>, contend_level2_oplocks_end = 0x7fe5b02b33b4 <smbd_contend_level2_oplocks_end>, become_root = 0x7fe5b021e330 <smbd_become_root>, unbecome_root = 0x7fe5b021e358 <smbd_unbecome_root>, exit_server = 0x7fe5b02a69b4 <smbd_exit_server>, exit_server_cleanly = 0x7fe5b02a69d1 <smbd_exit_server_cleanly>}
        __FUNCTION__ = "main"
A debugging session is active.

        Inferior 1 [process 1765] will be detached.

Quit anyway? (y or n) [answered Y; input not from terminal]
Comment 1 Nick Semenkovich 2015-08-07 01:35:41 UTC
FYI, I've now seen this a bunch -- about 3-5 segfaults/day with 20 clients.


Seems to have increased sharply since updating a few clients to Windows 10 & enabling roaming profiles (previously, all clients were Win 8.1, only mapped drives, no roaming profiles).

- Ubuntu 15.10
- All clients are ~Win 10 / Win 8.1 w/ all patches
- Domain enforces client signing

Here's one of the ~5 segfaults from today, now running v4-2-test (s3-passdb: Respect LOOKUP_NAME_GROUP flag in sid lookup | 98ac8fc3d968e39).



From the logs, it looks like a client:
- Logs off
- The profile remains open/locked somehow?
- Eventually the profile is closed (Unclear how -- some timeout? Usually, the computer is asleep shortly after logoff.)


Logs:

[2015/08/06 17:27:53.242610,  2] ../source3/smbd/open.c:1005(open_file)
  CORP\xxx opened file xxx.V5/ntuser.ini read=Yes write=Yes (numopen=2)
[2015/08/06 17:27:53.245953,  2] ../source3/smbd/close.c:780(close_normal_file)
  CORP\xxx closed file xxx.V5/ntuser.ini (numopen=1) NT_STATUS_OK
[2015/08/06 17:27:53.246893,  2] ../source3/smbd/open.c:1005(open_file)
  CORP\xxx opened file xxx.V5/ntuser.ini read=Yes write=No (numopen=2)
[2015/08/06 17:27:53.249788,  2] ../source3/smbd/open.c:1005(open_file)
  CORP\xxx opened file xxx.V5/ntuser.ini read=No write=No (numopen=3)
[2015/08/06 17:27:53.250540,  2] ../source3/smbd/close.c:780(close_normal_file)
  CORP\xxx closed file xxx.V5/ntuser.ini (numopen=2) NT_STATUS_OK
[2015/08/06 17:27:53.252614,  2] ../source3/smbd/open.c:1005(open_file)
  CORP\xxx opened file xxx.V5/NTUSER.DAT read=Yes write=Yes (numopen=3)
[2015/08/06 18:34:13.782702,  2] ../source3/smbd/close.c:780(close_normal_file)
  CORP\xxx closed file xxx.V5/NTUSER.DAT (numopen=2) NT_STATUS_OK
[2015/08/06 18:34:13.782895,  2] ../source3/smbd/close.c:780(close_normal_file)
  CORP\xxx closed file xxx.V5/ntuser.ini (numopen=1) NT_STATUS_OK
[2015/08/06 18:34:13.783061,  2] ../source3/smbd/service.c:1138(close_cnum)
  192.168.0.108 (ipv4:192.168.0.108:64956) closed connection to service profiles
[2015/08/06 18:34:13.783388,  0] ../lib/util/fault.c:78(fault_report)
  ===============================================================
[2015/08/06 18:34:13.783476,  0] ../lib/util/fault.c:79(fault_report)
  INTERNAL ERROR: Signal 11 in pid 22145 (4.2.3)
  Please read the Trouble-Shooting section of the Samba HOWTO
[2015/08/06 18:34:13.783506,  0] ../lib/util/fault.c:81(fault_report)
  ===============================================================
[2015/08/06 18:34:13.783532,  0] ../source3/lib/util.c:788(smb_panic_s3)
  PANIC (pid 22145): internal error
[2015/08/06 18:34:13.784565,  0] ../source3/lib/util.c:899(log_stack_trace)
  BACKTRACE: 37 stack frames:
   #0 /usr/local/samba/lib/libsmbconf.so.0(log_stack_trace+0x1f) [0x7fc0c61c28bc]
   #1 /usr/local/samba/lib/libsmbconf.so.0(smb_panic_s3+0x6f) [0x7fc0c61c2707]
   #2 /usr/local/samba/lib/libsamba-util.so.0(smb_panic+0x28) [0x7fc0c85d8f2a]
   #3 /usr/local/samba/lib/libsamba-util.so.0(+0x2ac02) [0x7fc0c85d8c02]
   #4 /usr/local/samba/lib/libsamba-util.so.0(+0x2ac17) [0x7fc0c85d8c17]
   #5 /lib/x86_64-linux-gnu/libpthread.so.0(+0x10d10) [0x7fc0c87f8d10]
   #6 /usr/local/samba/lib/private/libsmbd-base-samba4.so(file_close_user+0x14) [0x7fc0c809e596]
   #7 /usr/local/samba/lib/private/libsmbd-base-samba4.so(smbXsrv_session_logoff+0x493) [0x7fc0c818fb17]
   #8 /usr/local/samba/lib/private/libsmbd-base-samba4.so(+0x1bf693) [0x7fc0c818e693]
   #9 /usr/local/samba/lib/private/libsmbd-base-samba4.so(+0x1bf6b3) [0x7fc0c818e6b3]
   #10 /usr/local/samba/lib/private/libtalloc.so.2(+0x345c) [0x7fc0c7bc245c]
   #11 /usr/local/samba/lib/private/libtalloc.so.2(+0x4593) [0x7fc0c7bc3593]
   #12 /usr/local/samba/lib/private/libtalloc.so.2(+0x360d) [0x7fc0c7bc260d]
   #13 /usr/local/samba/lib/private/libtalloc.so.2(+0x4593) [0x7fc0c7bc3593]
   #14 /usr/local/samba/lib/private/libtalloc.so.2(+0x360d) [0x7fc0c7bc260d]
   #15 /usr/local/samba/lib/private/libtalloc.so.2(_talloc_free+0x105) [0x7fc0c7bc39a0]
   #16 /usr/local/samba/lib/private/libsmbd-base-samba4.so(+0x1c8a14) [0x7fc0c8197a14]
   #17 /usr/local/samba/lib/private/libsmbd-base-samba4.so(+0x1c8b52) [0x7fc0c8197b52]
   #18 /usr/local/samba/lib/private/libsmbd-shim-samba4.so(exit_server_cleanly+0x28) [0x7fc0c5b7ee70]
   #19 /usr/local/samba/lib/private/libsmbd-base-samba4.so(+0x193fef) [0x7fc0c8162fef]
   #20 /usr/local/samba/lib/private/libsmbd-base-samba4.so(+0x19be26) [0x7fc0c816ae26]
   #21 /usr/local/samba/lib/libsmbconf.so.0(run_events_poll+0x54f) [0x7fc0c61e3764]
   #22 /usr/local/samba/lib/libsmbconf.so.0(+0x459f3) [0x7fc0c61e39f3]
   #23 /usr/local/samba/lib/private/libtevent.so.0(_tevent_loop_once+0xf4) [0x7fc0c77af589]
   #24 /usr/local/samba/lib/private/libtevent.so.0(tevent_common_loop_wait+0x25) [0x7fc0c77af7d3]
   #25 /usr/local/samba/lib/private/libtevent.so.0(_tevent_loop_wait+0x2b) [0x7fc0c77af89e]
   #26 /usr/local/samba/lib/private/libsmbd-base-samba4.so(smbd_process+0xb28) [0x7fc0c814c082]
   #27 /usr/local/samba/sbin/smbd(+0xb16b) [0x7fc0c8c3716b]
   #28 /usr/local/samba/lib/libsmbconf.so.0(run_events_poll+0x54f) [0x7fc0c61e3764]
   #29 /usr/local/samba/lib/libsmbconf.so.0(+0x459f3) [0x7fc0c61e39f3]
   #30 /usr/local/samba/lib/private/libtevent.so.0(_tevent_loop_once+0xf4) [0x7fc0c77af589]
   #31 /usr/local/samba/lib/private/libtevent.so.0(tevent_common_loop_wait+0x25) [0x7fc0c77af7d3]
   #32 /usr/local/samba/lib/private/libtevent.so.0(_tevent_loop_wait+0x2b) [0x7fc0c77af89e]
   #33 /usr/local/samba/sbin/smbd(+0xbf81) [0x7fc0c8c37f81]
   #34 /usr/local/samba/sbin/smbd(main+0x17a7) [0x7fc0c8c398df]
   #35 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fc0c4aa8a40]
   #36 /usr/local/samba/sbin/smbd(_start+0x29) [0x7fc0c8c31ed9]
[2015/08/06 18:34:13.792751,  0] ../source3/lib/util.c:800(smb_panic_s3)
  smb_panic(): calling panic action [/home/semenko/panic-action 22145]
31	../sysdeps/unix/sysv/linux/waitpid.c: No such file or directory.
[2015/08/06 18:34:16.579625,  0] ../source3/lib/util.c:808(smb_panic_s3)
  smb_panic(): action returned status 0
[2015/08/06 18:34:16.579758,  0] ../source3/lib/dumpcore.c:318(dump_core)
  dumping core in /usr/local/samba/var/cores/smbd



[full trace below]



[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fc0c4b5284a in __GI___waitpid (pid=29880, stat_loc=stat_loc@entry=0x7ffc711a2250, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:31
#0  0x00007fc0c4b5284a in __GI___waitpid (pid=29880, stat_loc=stat_loc@entry=0x7ffc711a2250, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:31
        resultvar = 18446744073709551104
        oldtype = <optimized out>
        result = <optimized out>
#1  0x00007fc0c4acbffb in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
        __result = <optimized out>
        _buffer = {__routine = 0x7fc0c4acc2f0 <cancel_handler>, __arg = 0x7ffc711a222c, __canceltype = 0, __prev = 0x0}
        _avail = 1
        status = 0
        save = <optimized out>
        pid = 29880
        sa = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = {__val = {65536, 0 <repeats 15 times>}}, sa_flags = 0, sa_restorer = 0x7fc0ca601b10}
        omask = {__val = {7296, 140465963581884, 140465982254256, 140465982254256, 140722206024608, 140465978678960, 140722206029600, 140465976381584, 140465978678960, 140722206029600, 0, 0, 0, 140465976465821, 1, 0}}
#2  0x00007fc0c61c27c2 in smb_panic_s3 (why=0x7fc0c85dfaed "internal error") at ../source3/lib/util.c:801
        cmd = 0x7fc0ca601b10 "/home/semenko/panic-action 22145"
        result = 0
        __FUNCTION__ = "smb_panic_s3"
#3  0x00007fc0c85d8f2a in smb_panic (why=0x7fc0c85dfaed "internal error") at ../lib/util/fault.c:166
No locals.
#4  0x00007fc0c85d8c02 in fault_report (sig=11) at ../lib/util/fault.c:83
        counter = 1
        __FUNCTION__ = "fault_report"
#5  0x00007fc0c85d8c17 in sig_fault (sig=11) at ../lib/util/fault.c:94
No locals.
#6  <signal handler called>
No locals.
#7  0x00007fc0c809e596 in file_close_user (sconn=0x0, vuid=1519202524) at ../source3/smbd/files.c:250
        fsp = 0x7fc000000000
        next = 0x7fc0c9ad6970
#8  0x00007fc0c818fb17 in smbXsrv_session_logoff (session=0x7fc0c928ae60) at ../source3/smbd/smbXsrv_session.c:1583
        table = 0x7fc0c9a5c910
        local_rec = 0x0
        global_rec = 0x0
        sconn = 0x0
        status = {v = 0}
        error = {v = 0}
        __FUNCTION__ = "smbXsrv_session_logoff"
#9  0x00007fc0c818e693 in smbXsrv_session_clear_and_logoff (session=0x7fc0c928ae60) at ../source3/smbd/smbXsrv_session.c:1118
        status = {v = 0}
        xconn = 0x0
#10 0x00007fc0c818e6b3 in smbXsrv_session_destructor (session=0x7fc0c928ae60) at ../source3/smbd/smbXsrv_session.c:1126
        status = {v = 3374886496}
        __FUNCTION__ = "smbXsrv_session_destructor"
#11 0x00007fc0c7bc245c in _talloc_free_internal (ptr=0x7fc0c928ae60, location=0x7fc0c82ffaa0 "../source3/smbd/server_exit.c:233") at ../lib/talloc/talloc.c:993
        d = 0x7fc0c818e69b <smbXsrv_session_destructor>
        tc = 0x7fc0c928ae00
        ptr_to_free = 0x7fc0ca08ea60
#12 0x00007fc0c7bc3593 in _talloc_free_children_internal (tc=0x7fc0c9a5c8b0, ptr=0x7fc0c9a5c910, location=0x7fc0c82ffaa0 "../source3/smbd/server_exit.c:233") at ../lib/talloc/talloc.c:1472
        child = 0x7fc0c928ae60
        new_parent = 0x7fc0c8f92500
#13 0x00007fc0c7bc260d in _talloc_free_internal (ptr=0x7fc0c9a5c910, location=0x7fc0c82ffaa0 "../source3/smbd/server_exit.c:233") at ../lib/talloc/talloc.c:1019
        tc = 0x7fc0c9a5c8b0
        ptr_to_free = 0x7fc0caa24370
#14 0x00007fc0c7bc3593 in _talloc_free_children_internal (tc=0x7fc0c99bb260, ptr=0x7fc0c99bb2c0, location=0x7fc0c82ffaa0 "../source3/smbd/server_exit.c:233") at ../lib/talloc/talloc.c:1472
        child = 0x7fc0c9a5c910
        new_parent = 0x7fc0c8f92500
#15 0x00007fc0c7bc260d in _talloc_free_internal (ptr=0x7fc0c99bb2c0, location=0x7fc0c82ffaa0 "../source3/smbd/server_exit.c:233") at ../lib/talloc/talloc.c:1019
        tc = 0x7fc0c99bb260
        ptr_to_free = 0x7fc0c9e8d6d0
#16 0x00007fc0c7bc39a0 in _talloc_free (ptr=0x7fc0c99bb2c0, location=0x7fc0c82ffaa0 "../source3/smbd/server_exit.c:233") at ../lib/talloc/talloc.c:1594
        tc = 0x7fc0c99bb260
#17 0x00007fc0c8197a14 in exit_server_common (how=SERVER_EXIT_NORMAL, reason=0x7fc0c6f92be3 "NT_STATUS_IO_TIMEOUT") at ../source3/smbd/server_exit.c:233
        client = 0x0
        xconn = 0x0
        sconn = 0x0
        msg_ctx = 0x7fc0c8fa39b0
        __FUNCTION__ = "exit_server_common"
#18 0x00007fc0c8197b52 in smbd_exit_server_cleanly (explanation=0x7fc0c6f92be3 "NT_STATUS_IO_TIMEOUT") at ../source3/smbd/server_exit.c:266
No locals.
#19 0x00007fc0c5b7ee70 in exit_server_cleanly (reason=0x7fc0c6f92be3 "NT_STATUS_IO_TIMEOUT") at ../source3/lib/smbd_shim.c:131
No locals.
#20 0x00007fc0c8162fef in smbd_server_connection_terminate_ex (xconn=0x7fc0c9b4f070, reason=0x7fc0c6f92be3 "NT_STATUS_IO_TIMEOUT", location=0x7fc0c82efa98 "../source3/smbd/smb2_server.c:3506") at ../source3/smbd/smb2_server.c:1050
        __FUNCTION__ = "smbd_server_connection_terminate_ex"
#21 0x00007fc0c816ae26 in smbd_smb2_connection_handler (ev=0x7fc0c8fa38c0, fde=0x7fc0c9530270, flags=1, private_data=0x7fc0c9b4f070) at ../source3/smbd/smb2_server.c:3506
        xconn = 0x7fc0c9b4f070
        status = {v = 3221225653}
#22 0x00007fc0c61e3764 in run_events_poll (ev=0x7fc0c8fa38c0, pollrtn=1, pfds=0x7fc0c97f35c0, num_pfds=5) at ../source3/lib/events.c:257
        pfd = 0x7fc0c97f35e0
        flags = 1
        state = 0x7fc0c8fa48d0
        pollfd_idx = 0x7fc0c92d38e0
        fde = 0x7fc0c9530270
        __FUNCTION__ = "run_events_poll"
#23 0x00007fc0c61e39f3 in s3_event_loop_once (ev=0x7fc0c8fa38c0, location=0x7fc0c82e7050 "../source3/smbd/process.c:3992") at ../source3/lib/events.c:326
        state = 0x7fc0c8fa48d0
        timeout = 60000
        num_pfds = 5
        ret = 1
        poll_errno = 0
#24 0x00007fc0c77af589 in _tevent_loop_once (ev=0x7fc0c8fa38c0, location=0x7fc0c82e7050 "../source3/smbd/process.c:3992") at ../lib/tevent/tevent.c:533
        ret = 0
        nesting_stack_ptr = 0x0
#25 0x00007fc0c77af7d3 in tevent_common_loop_wait (ev=0x7fc0c8fa38c0, location=0x7fc0c82e7050 "../source3/smbd/process.c:3992") at ../lib/tevent/tevent.c:637
        ret = 0
#26 0x00007fc0c77af89e in _tevent_loop_wait (ev=0x7fc0c8fa38c0, location=0x7fc0c82e7050 "../source3/smbd/process.c:3992") at ../lib/tevent/tevent.c:656
No locals.
#27 0x00007fc0c814c082 in smbd_process (ev_ctx=0x7fc0c8fa38c0, msg_ctx=0x7fc0c8fa39b0, sock_fd=46, interactive=false) at ../source3/smbd/process.c:3992
        trace_state = {frame = 0x7fc0ca4d8c80, smbd_idle_profstamp = 0}
        client = 0x7fc0c99bb2c0
        sconn = 0x7fc0c9e8d730
        xconn = 0x7fc0c9b4f070
        locaddr = 0x7fc0c8fa3d40 "\020[D\311\300\177"
        remaddr = 0x7fc0c94fc020 ""
        ret = 32704
        status = {v = 0}
        __FUNCTION__ = "smbd_process"
#28 0x00007fc0c8c3716b in smbd_accept_connection (ev=0x7fc0c8fa38c0, fde=0x7fc0c95e8400, flags=1, private_data=0x7fc0c949e0d0) at ../source3/smbd/server.c:627
        status = {v = 0}
        s = 0x0
        msg_ctx = 0x7fc0c8fa39b0
        addr = {ss_family = 2, __ss_align = 0, __ss_padding = '\000' <repeats 16 times>, "\070\346|\311\300\177\000\000p1\032q\374\177\000\000\360\060\032q\374\177\000\000\232V]\310\300\177\000\000\070\346|\311\300\177\000\000p1\032q\374\177\000\000\036\000\000\000\000\000\000\000ij\n\000\000\000\000\000\220\061\032q\374\177\000\000\361\061\036\306\300\177\000\000\200s\303U\000\000\000\000\270\061\032q\374\177\000"}
        in_addrlen = 16
        fd = 46
        pid = 0
        unique_id = 17362296649519182078
        __FUNCTION__ = "smbd_accept_connection"
#29 0x00007fc0c61e3764 in run_events_poll (ev=0x7fc0c8fa38c0, pollrtn=1, pfds=0x7fc0c97f35c0, num_pfds=8) at ../source3/lib/events.c:257
        pfd = 0x7fc0c97f35f0
        flags = 1
        state = 0x7fc0c8fa48d0
        pollfd_idx = 0x7fc0c9f23960
        fde = 0x7fc0c95e8400
        __FUNCTION__ = "run_events_poll"
#30 0x00007fc0c61e39f3 in s3_event_loop_once (ev=0x7fc0c8fa38c0, location=0x7fc0c8c3beea "../source3/smbd/server.c:985") at ../source3/lib/events.c:326
        state = 0x7fc0c8fa48d0
        timeout = 30683
        num_pfds = 8
        ret = 1
        poll_errno = 0
#31 0x00007fc0c77af589 in _tevent_loop_once (ev=0x7fc0c8fa38c0, location=0x7fc0c8c3beea "../source3/smbd/server.c:985") at ../lib/tevent/tevent.c:533
        ret = 0
        nesting_stack_ptr = 0x0
#32 0x00007fc0c77af7d3 in tevent_common_loop_wait (ev=0x7fc0c8fa38c0, location=0x7fc0c8c3beea "../source3/smbd/server.c:985") at ../lib/tevent/tevent.c:637
        ret = 0
#33 0x00007fc0c77af89e in _tevent_loop_wait (ev=0x7fc0c8fa38c0, location=0x7fc0c8c3beea "../source3/smbd/server.c:985") at ../lib/tevent/tevent.c:656
No locals.
#34 0x00007fc0c8c37f81 in smbd_parent_loop (ev_ctx=0x7fc0c8fa38c0, parent=0x7fc0c8fa3b30) at ../source3/smbd/server.c:985
        trace_state = {frame = 0x7fc0c8fa45b0}
        ret = 0
        __FUNCTION__ = "smbd_parent_loop"
#35 0x00007fc0c8c398df in main (argc=4, argv=0x7ffc711a3728) at ../source3/smbd/server.c:1626
        is_daemon = true
        interactive = false
        Fork = false
        no_process_group = false
        log_stdout = false
        ports = 0x0
        profile_level = 0x0
        opt = -1
        pc = 0x7fc0c8f94100
        print_build_options = false
        long_options = {{longName = 0x0, shortName = 0 '\000', argInfo = 4, arg = 0x7fc0c505d3c0 <poptHelpOptions>, val = 0, descrip = 0x7fc0c8c3bfe9 "Help options:", argDescrip = 0x0}, {longName = 0x7fc0c8c3bff7 "daemon", shortName = 68 'D', argInfo = 0, arg = 0x0, val = 1000, descrip = 0x7fc0c8c3bffe "Become a daemon (default)", argDescrip = 0x0}, {longName = 0x7fc0c8c3c018 "interactive", shortName = 105 'i', argInfo = 0, arg = 0x0, val = 1001, descrip = 0x7fc0c8c3c028 "Run interactive (not a daemon)", argDescrip = 0x0}, {longName = 0x7fc0c8c3c047 "foreground", shortName = 70 'F', argInfo = 0, arg = 0x0, val = 1002, descrip = 0x7fc0c8c3c058 "Run daemon in foreground (for daemontools, etc.)", argDescrip = 0x0}, {longName = 0x7fc0c8c3c089 "no-process-group", shortName = 0 '\000', argInfo = 0, arg = 0x0, val = 1003, descrip = 0x7fc0c8c3c0a0 "Don't create a new process group", argDescrip = 0x0}, {longName = 0x7fc0c8c3c0c1 "log-stdout", shortName = 83 'S', argInfo = 0, arg = 0x0, val = 1004, descrip = 0x7fc0c8c3c0cc "Log to stdout", argDescrip = 0x0}, {longName = 0x7fc0c8c3c0da "build-options", shortName = 98 'b', argInfo = 0, arg = 0x0, val = 98, descrip = 0x7fc0c8c3c0e8 "Print build options", argDescrip = 0x0}, {longName = 0x7fc0c8c3c0fc "port", shortName = 112 'p', argInfo = 1, arg = 0x7ffc711a3310, val = 0, descrip = 0x7fc0c8c3c101 "Listen on the specified ports", argDescrip = 0x0}, {longName = 0x7fc0c8c3c11f "profiling-level", shortName = 80 'P', argInfo = 1, arg = 0x7ffc711a3318, val = 0, descrip = 0x7fc0c8c3c12f "Set profiling level", argDescrip = 0x7fc0c8c3c143 "PROFILE_LEVEL"}, {longName = 0x0, shortName = 0 '\000', argInfo = 4, arg = 0x7fc0c685e380 <popt_common_samba>, val = 0, descrip = 0x7fc0c8c3c151 "Common samba options:", argDescrip = 0x0}, {longName = 0x0, shortName = 0 '\000', argInfo = 0, arg = 0x0, val = 0, descrip = 0x0, argDescrip = 0x0}}
        parent = 0x7fc0c8fa3b30
        frame = 0x7fc0c8f925e0
        status = {v = 0}
        ev_ctx = 0x7fc0c8fa38c0
        msg_ctx = 0x7fc0c8fa39b0
        server_id = {pid = 2891, task_id = 0, vnn = 4294967295, unique_id = 10128480815932138165}
        se = 0x7fc0c8faeca0
        np_dir = 0x7fc0ca0b3300 "\002"
        smbd_shim_fns = {cancel_pending_lock_requests_by_fid = 0x7fc0c81278cc <smbd_cancel_pending_lock_requests_by_fid>, send_stat_cache_delete_message = 0x7fc0c8131fc2 <smbd_send_stat_cache_delete_message>, change_to_root_user = 0x7fc0c810f006 <smbd_change_to_root_user>, become_authenticated_pipe_user = 0x7fc0c810f0bc <smbd_become_authenticated_pipe_user>, unbecome_authenticated_pipe_user = 0x7fc0c810f1ae <smbd_unbecome_authenticated_pipe_user>, contend_level2_oplocks_begin = 0x7fc0c81a44a5 <smbd_contend_level2_oplocks_begin>, contend_level2_oplocks_end = 0x7fc0c81a4518 <smbd_contend_level2_oplocks_end>, become_root = 0x7fc0c810f3ce <smbd_become_root>, unbecome_root = 0x7fc0c810f3f6 <smbd_unbecome_root>, exit_server = 0x7fc0c8197b18 <smbd_exit_server>, exit_server_cleanly = 0x7fc0c8197b35 <smbd_exit_server_cleanly>}
        __FUNCTION__ = "main"
A debugging session is active.

        Inferior 1 [process 22145] will be detached.

Quit anyway? (y or n) [answered Y; input not from terminal]
Comment 2 Richard Sharpe 2015-11-06 20:45:21 UTC
We believe we have hit this three times in the last day or so as well at Nutanix.
Comment 3 Richard Sharpe 2015-11-06 21:16:37 UTC
In our case it has a slightly different stack, in that it's coming from a shutdown message to the smbd.
Comment 4 Richard Sharpe 2015-11-19 00:14:15 UTC
The reason for this bug is a mis-ordering of delete actions.

Here is the relevant part of the stack trace from one of our stack traces:
-----------------------------------
#6 <signal handler called>
#7 0x00007f47ba82da1a in file_close_user (sconn=0x0, vuid=1584077283) at ../source3/smbd/files.c:250
#8 0x00007f47ba922a74 in smbXsrv_session_logoff (session=0x7f47be8bbf80) at ../source3/smbd/smbXsrv_session.c:1404
#9 0x00007f47ba921912 in smbXsrv_session_destructor (session=0x7f47be8bbf80) at ../source3/smbd/smbXsrv_session.c:1068
#10 0x00007f47b784e2fc in _talloc_free_internal () from /usr/lib/libtalloc.so.2
#11 0x00007f47b784f495 in _talloc_free_children_internal () from /usr/lib/libtalloc.so.2
#12 0x00007f47b784e49f in _talloc_free_internal () from /usr/lib/libtalloc.so.2
#13 0x00007f47b784f495 in _talloc_free_children_internal () from /usr/lib/libtalloc.so.2
#14 0x00007f47b784e49f in _talloc_free_internal () from /usr/lib/libtalloc.so.2
#15 0x00007f47b784f88e in _talloc_free () from /usr/lib/libtalloc.so.2
#16 0x00007f47ba92b2f1 in exit_server_common (how=SERVER_EXIT_NORMAL, reason=0x0) at ../source3/smbd/server_exit.c:234
------------------------------------------------

The line numbers will differ slightly between builds.

Here is the relevant code:

-8<-----------------------------8<-----------------------

        /*
         * we need to force the order of freeing the following,
         * because smbd_msg_ctx is not a talloc child of smbd_server_conn.
         */
        if (client != NULL) {
                struct smbXsrv_connection *next;

                for (; xconn != NULL; xconn = next) {
                        next = xconn->next;
                        DLIST_REMOVE(client->connections, xconn);
                        talloc_free(xconn);
                        DO_PROFILE_INC(disconnect);
                }
                TALLOC_FREE(client->sconn); /* Here we NULL out sconn! */
        }
        sconn = NULL;
        xconn = NULL;
        client = NULL;
        TALLOC_FREE(global_smbXsrv_client); /* Here we call a destructor that needs sconn! */
        smbprofile_dump();
        server_messaging_context_free();
        server_event_context_free();
        TALLOC_FREE(smbd_memcache_ctx);

        locking_end();
        printing_end();
----------------------------------------------

I think perhaps the correct approach is to call file_close_user(client->sconn) before TALLOC_FREE(client->sconn) and pull that statement from the destructor.
Comment 5 Richard Sharpe 2015-11-19 00:19:25 UTC
Unfortunately, there is another use of client->sconn in smbXsrv_session_logoff:

----------------------------------------
        if (session->compat) {
                file_close_user(sconn, session->compat->vuid); /* HERE */
        }

        if (session->tcon_table != NULL) {
                /*
                 * Note: We only have a tcon_table for SMB2.
                 */
                status = smb2srv_tcon_disconnect_all(session);
                if (!NT_STATUS_IS_OK(status)) {
                        DEBUG(0, ("smbXsrv_session_logoff(0x%08x): "
                                  "smb2srv_tcon_disconnect_all() failed: %s\n",
                                  session->global->session_global_id,
                                  nt_errstr(status)));
                        error = status;
                }
        }

        if (session->compat) {
                invalidate_vuid(sconn, session->compat->vuid); /* HERE */
                session->compat = NULL;
        }
--------------------------------------------

Perhaps we can protect these against NULL and also clean these up in the exit handler as suggested above.
Comment 6 Jeremy Allison 2015-11-19 01:12:29 UTC
Does the following fix the immediate crash ? (Not tested, I'll look closer tomorrow).

diff --git a/source3/smbd/smbXsrv_session.c b/source3/smbd/smbXsrv_session.c
index 9f8520a..54d6338 100644
--- a/source3/smbd/smbXsrv_session.c
+++ b/source3/smbd/smbXsrv_session.c
@@ -1696,7 +1696,7 @@ NTSTATUS smbXsrv_session_logoff(struct smbXsrv_session *session)
        }
        session->db_rec = NULL;
 
-       if (session->compat) {
+       if (session->compat != NULL && sconn != NULL) {
                file_close_user(sconn, session->compat->vuid);
        }
 
@@ -1714,7 +1714,7 @@ NTSTATUS smbXsrv_session_logoff(struct smbXsrv_session *session)
                }
        }
 
-       if (session->compat) {
+       if (session->compat != NULL && sconn != NULL) {
                invalidate_vuid(sconn, session->compat->vuid);
                session->compat = NULL;
        }

I think we need to explicitly call smbXsrv_session_logoff() before deleting sconn in the shutdown path.
Comment 7 Jeremy Allison 2015-11-20 23:22:50 UTC
(In reply to Jeremy Allison from comment #6)

Ah. I'm wrong. We *already* call smbXsrv_session_clear_and_logoff() from inside exit_server_common() here:

        if (xconn != NULL) {
                NTSTATUS status;

                /*
                 * Note: this is a no-op for smb2 as
                 * conn->tcon_table is empty
                 */
                status = smb1srv_tcon_disconnect_all(xconn);
                if (!NT_STATUS_IS_OK(status)) {
                        DEBUG(0,("Server exit (%s)\n",
                                (reason ? reason : "normal exit")));
                        DEBUG(0, ("exit_server_common: "
                                  "smb1srv_tcon_disconnect_all() failed (%s) - "
                                  "triggering cleanup\n", nt_errstr(status)));
                        how = SERVER_EXIT_ABNORMAL;
                        reason = "smb1srv_tcon_disconnect_all failed";
                }

HERE BELOW !!!!!!!!!!!!!!!!!!!

                status = smbXsrv_session_logoff_all(xconn);
                if (!NT_STATUS_IS_OK(status)) {
                        DEBUG(0,("Server exit (%s)\n",
                                (reason ? reason : "normal exit")));
                        DEBUG(0, ("exit_server_common: "
                                  "smbXsrv_session_logoff_all() failed (%s) - "
                                  "triggering cleanup\n", nt_errstr(status)));
                        how = SERVER_EXIT_ABNORMAL;
                        reason = "smbXsrv_session_logoff_all failed";
                }
        }

smbXsrv_session_logoff_all() calls smbXsrv_session_logoff_all_callback(), which calls smbXsrv_session_clear_and_logoff(), which calls smbXsrv_session_logoff().

Which means that when we TALLOC_FREE(global_smbXsrv_client) later on inside exit_server_common(), this indirectly calls smbXsrv_session_destructor(), which calls smbXsrv_session_clear_and_logoff(), which calls smbXsrv_session_logoff().

In other words smbXsrv_session_logoff() is being called twice in the teardown and is not idempotent (as the second time session->client->sconn is now NULL).
Comment 8 Jeremy Allison 2015-11-21 01:08:24 UTC
Created attachment 11610 [details]
git-am test patch for master.

Richard, can you test this possible fix ? I'm still looking for the correct one but this might be it.
Comment 9 Jeremy Allison 2015-11-21 01:22:23 UTC
(In reply to Jeremy Allison from comment #7)

(NB. The above comment about smbXsrv_session_logoff() being called twice is wrong. That can't happen. It's the explicit TALLOC_FREE(global_smbXsrv_client->sconn) before the TALLOC_FREE(global_smbXsrv_client) that causes the trouble.)
Comment 10 Richard Sharpe 2015-11-23 20:37:07 UTC
(In reply to Jeremy Allison from comment #8)

That looks almost the same as what we came up with.

I will get the guy who did the change to reply.
Comment 11 Shyam Rathi 2015-11-23 23:46:34 UTC
@Jeremy:

We have come up with almost the same solution here at Nutanix. The only change is that I'm calling invalidate_vuid even if sconn is NULL, as sconn is not used inside this function.

diff --git a/source3/smbd/smbXsrv_session.c b/source3/smbd/smbXsrv_session.c
index c5b7b79..27c5395 100644
--- a/source3/smbd/smbXsrv_session.c
+++ b/source3/smbd/smbXsrv_session.c
@@ -1400,7 +1400,7 @@ NTSTATUS smbXsrv_session_logoff(struct smbXsrv_session *session)
        }
        session->db_rec = NULL;

-       if (session->compat) {
+       if (sconn && session->compat) {
                file_close_user(sconn, session->compat->vuid);
        }

@@ -1419,7 +1419,9 @@ NTSTATUS smbXsrv_session_logoff(struct smbXsrv_session *session)
        }

        if (session->compat) {
-               invalidate_vuid(sconn, session->compat->vuid);
+               if (sconn) {
+                       invalidate_vuid(sconn, session->compat->vuid);
+               }
                session->compat = NULL;
        }
Comment 12 Shyam Rathi 2015-11-23 23:58:03 UTC
(In reply to Shyam Rathi from comment #11)
Can't edit my comment, so clarifying an error in it.

I'm keeping the 'session->compat = NULL' statement even when sconn is NULL.

Apologies for the mistake in my last comment.
Comment 13 Jeremy Allison 2015-11-24 00:35:39 UTC
(In reply to Shyam Rathi from comment #12)

Ah, no - the fix I wanted you to look at was this one:

diff --git a/source3/smbd/server_exit.c b/source3/smbd/server_exit.c
index bf50394..80f118a 100644
--- a/source3/smbd/server_exit.c
+++ b/source3/smbd/server_exit.c
@@ -221,7 +221,6 @@ static void exit_server_common(enum server_exit_reason how,
 			talloc_free(xconn);
 			DO_PROFILE_INC(disconnect);
 		}
-		TALLOC_FREE(client->sconn);
 	}
 	sconn = NULL;
 	xconn = NULL;
-- 
2.6.0.rc2.230.g3dd15c0

I think this is the correct one, rather than changing source3/smbd/smbXsrv_session.c.
Comment 14 Richard Sharpe 2015-11-24 01:47:54 UTC
(In reply to Jeremy Allison from comment #13)

That was my first thought on how to fix the problem, but I didn't do a thorough analysis of the problem.
Comment 15 Shyam Rathi 2015-11-24 19:29:33 UTC
(In reply to Jeremy Allison from comment #13)
Thanks Jeremy. I'll apply this one in my branch.
Comment 16 Stefan Metzmacher 2015-12-07 13:24:14 UTC
Comment on attachment 11610 [details]
git-am test patch for master.

Jeremy, can we revert this (8024e19b70047865249305bceddd4473d6e60051) for master again?
Comment 17 Jeremy Allison 2015-12-07 16:58:33 UTC
Yep - done. I'll upload the correct patches for this for 4.3.next, 4.2.next.
Comment 18 Stefan Metzmacher 2015-12-07 17:04:54 UTC
(In reply to Jeremy Allison from comment #17)

They're already ready on bug #11394
Comment 19 Jeremy Allison 2015-12-07 17:11:17 UTC
Closing this out. Let's track on #11394.

*** This bug has been marked as a duplicate of bug 11394 ***