Created attachment 16701 [details]
smbd logs

When the connection is force-closed, all relevant file descriptors are freed:

smbXsrv_tcon_disconnect() -> close_cnum() -> file_close_conn() -> close_file() -> file_free() -> fsp_free()

If the offload write finishes after the connection has already been closed, it uses the already-freed fsp.

The panic is easily reproducible if talloc_fill is enabled and a sleep is added in smb_vfs_call_offload_write_send(). On the Windows client I called CopyFile() and, during the sleep, issued `smbcontrol all close-share vol-1`.

smbd panics with SIGSEGV:

Thread 1 "smbd" received signal SIGSEGV, Segmentation fault.
0x00007f3ce6c5c31c in change_to_user_and_service (conn=0x4141414141410000, vuid=4702111237675155455) at ../../source3/smbd/uid.c:440
440             int snum = SNUM(conn);
(gdb) bt
#0  0x00007f3ce6c5c31c in change_to_user_and_service (conn=0x4141414141410000, vuid=4702111237675155455) at ../../source3/smbd/uid.c:440
#1  0x00007f3ce6c5c510 in change_to_user_and_service_by_fsp (fsp=0x7f3ccd8828e0) at ../../source3/smbd/uid.c:485
#2  0x00007f3ce6dd8e8d in vfswrap_offload_write_read_done (subreq=0x0) at ../../source3/modules/vfs_default.c:2194
#3  0x00007f3ce59ef91e in _tevent_req_notify_callback (req=0x7f3ccd8e9180, location=0x7f3ce6e5e340 "../../source3/smbd/vfs.c:1869") at ../../lib/tevent/tevent_req.c:141
#4  0x00007f3ce59efa7f in tevent_req_finish (req=0x7f3ccd8e9180, state=TEVENT_REQ_DONE, location=0x7f3ce6e5e340 "../../source3/smbd/vfs.c:1869") at ../../lib/tevent/tevent_req.c:193
#5  0x00007f3ce59efaac in _tevent_req_done (req=0x7f3ccd8e9180, location=0x7f3ce6e5e340 "../../source3/smbd/vfs.c:1869") at ../../lib/tevent/tevent_req.c:199
#6  0x00007f3ce6c7f3bb in smb_vfs_call_pread_done (subreq=0x0) at ../../source3/smbd/vfs.c:1869
#7  0x00007f3ce59ef91e in _tevent_req_notify_callback (req=0x7f3ccd8e9500, location=0x7f3ce6ee5940 "../../source3/modules/vfs_default.c:904") at ../../lib/tevent/tevent_req.c:141
#8  0x00007f3ce59efa7f in tevent_req_finish (req=0x7f3ccd8e9500, state=TEVENT_REQ_DONE, location=0x7f3ce6ee5940 "../../source3/modules/vfs_default.c:904") at ../../lib/tevent/tevent_req.c:193
#9  0x00007f3ce59efaac in _tevent_req_done (req=0x7f3ccd8e9500, location=0x7f3ce6ee5940 "../../source3/modules/vfs_default.c:904") at ../../lib/tevent/tevent_req.c:199
#10 0x00007f3ce6dd4db1 in vfs_pread_done (subreq=0x0) at ../../source3/modules/vfs_default.c:904
#11 0x00007f3ce59ef91e in _tevent_req_notify_callback (req=0x7f3ccd8e4f00, location=0x7f3cde52ab18 "../../lib/pthreadpool/pthreadpool_tevent.c:422") at ../../lib/tevent/tevent_req.c:141
#12 0x00007f3ce59efa7f in tevent_req_finish (req=0x7f3ccd8e4f00, state=TEVENT_REQ_DONE, location=0x7f3cde52ab18 "../../lib/pthreadpool/pthreadpool_tevent.c:422") at ../../lib/tevent/tevent_req.c:193
#13 0x00007f3ce59efaac in _tevent_req_done (req=0x7f3ccd8e4f00, location=0x7f3cde52ab18 "../../lib/pthreadpool/pthreadpool_tevent.c:422") at ../../lib/tevent/tevent_req.c:199
#14 0x00007f3cde525dfa in pthreadpool_tevent_job_done (ctx=0x7f3ccd8a1560, im=0x7f3ccd8e5150, private_data=0x7f3ccd8e50b0) at ../../lib/pthreadpool/pthreadpool_tevent.c:422
#15 0x00007f3ce59ee812 in tevent_common_invoke_immediate_handler (im=0x7f3ccd8e5150, removed=0x0) at ../../lib/tevent/tevent_immediate.c:166
#16 0x00007f3ce59ee918 in tevent_common_loop_immediate (ev=0x7f3ccd8a1560) at ../../lib/tevent/tevent_immediate.c:203
#17 0x00007f3ce59f9135 in epoll_event_loop_once (ev=0x7f3ccd8a1560, location=0x7f3ce6e67a98 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent_epoll.c:918
#18 0x00007f3ce59f5931 in std_event_loop_once (ev=0x7f3ccd8a1560, location=0x7f3ce6e67a98 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent_standard.c:110
#19 0x00007f3ce59ed2a4 in _tevent_loop_once (ev=0x7f3ccd8a1560, location=0x7f3ce6e67a98 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent.c:772
#20 0x00007f3ce59ed5f6 in tevent_common_loop_wait (ev=0x7f3ccd8a1560, location=0x7f3ce6e67a98 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent.c:895
#21 0x00007f3ce59f59d3 in std_event_loop_wait (ev=0x7f3ccd8a1560, location=0x7f3ce6e67a98 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent_standard.c:141
#22 0x00007f3ce59ed699 in _tevent_loop_wait (ev=0x7f3ccd8a1560, location=0x7f3ce6e67a98 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent.c:914
#23 0x00007f3ce6c9ed46 in smbd_process (ev_ctx=0x7f3ccd8a1560, msg_ctx=0x7f3ccd8d9320, sock_fd=38, interactive=false) at ../../source3/smbd/process.c:4170
#24 0x0000556c18fcd39b in smbd_accept_connection (ev=0x7f3ccd8a1560, fde=0x7f3ccd8da9e0, flags=1, private_data=0x7f3ccd93d7a0) at ../../source3/smbd/server.c:1019
#25 0x00007f3ce59ee085 in tevent_common_invoke_fd_handler (fde=0x7f3ccd8da9e0, flags=1, removed=0x0) at ../../lib/tevent/tevent_fd.c:138
#26 0x00007f3ce59f8aec in epoll_event_loop (epoll_ev=0x7f3ccd82d660, tvalp=0x7ffc3d3a3d50) at ../../lib/tevent/tevent_epoll.c:736
#27 0x00007f3ce59f91cf in epoll_event_loop_once (ev=0x7f3ccd8a1560, location=0x556c18fd5a08 "../../source3/smbd/server.c:1366") at ../../lib/tevent/tevent_epoll.c:937
#28 0x00007f3ce59f5931 in std_event_loop_once (ev=0x7f3ccd8a1560, location=0x556c18fd5a08 "../../source3/smbd/server.c:1366") at ../../lib/tevent/tevent_standard.c:110
#29 0x00007f3ce59ed2a4 in _tevent_loop_once (ev=0x7f3ccd8a1560, location=0x556c18fd5a08 "../../source3/smbd/server.c:1366") at ../../lib/tevent/tevent.c:772
#30 0x00007f3ce59ed5f6 in tevent_common_loop_wait (ev=0x7f3ccd8a1560, location=0x556c18fd5a08 "../../source3/smbd/server.c:1366") at ../../lib/tevent/tevent.c:895
#31 0x00007f3ce59f59d3 in std_event_loop_wait (ev=0x7f3ccd8a1560, location=0x556c18fd5a08 "../../source3/smbd/server.c:1366") at ../../lib/tevent/tevent_standard.c:141
#32 0x00007f3ce59ed699 in _tevent_loop_wait (ev=0x7f3ccd8a1560, location=0x556c18fd5a08 "../../source3/smbd/server.c:1366") at ../../lib/tevent/tevent.c:914
#33 0x0000556c18fce152 in smbd_parent_loop (ev_ctx=0x7f3ccd8a1560, parent=0x7f3ccd8d9940) at ../../source3/smbd/server.c:1366
#34 0x0000556c18fd0647 in main (argc=3, argv=0x7ffc3d3a4378) at ../../source3/smbd/server.c:2226

(gdb) f 1
#1  0x00007f3ce6c5c510 in change_to_user_and_service_by_fsp (fsp=0x7f3ccd8828e0) at ../../source3/smbd/uid.c:485
485             return change_to_user_and_service(fsp->conn, fsp->vuid);
(gdb) p/x *fsp
$1 = {next = 0x0, prev = 0x0, fnum = 0x8041414141, op = 0x4141414141414141, conn = 0x4141414141410000, fh = 0x7f3ccd882980,
  num_smb_operations = 0x41414141, file_id = {devid = 0x4141414141414141, inode = 0xffffffffffffffff, extid = 0x0},
  initial_allocation_size = 0x4141414141414141, file_pid = 0x4141, vuid = 0x41414141ffffffff,
  open_time = {tv_sec = 0x4141414141414141, tv_usec = 0x4141414141414141}, access_mask = 0xe2c2dee0,
  kernel_share_modes_taken = 0x3c, update_write_time_triggered = 0x7f, update_write_time_event = 0x7f3ce2c3c610,
  update_write_time_on_close = 0x60, close_write_time = {tv_sec = 0x7ffc3d3a1b80, tv_nsec = 0x7ffc3d3a1b88},
  write_time_forced = 0x0, oplock_type = 0x0, leases_db_seqnum = 0x0, lease_type = 0x0, lease = 0x4141414141414141,
  sent_oplock_break = 0x41414141, oplock_timeout = 0x4141414141414141,
  last_lock_failure = {context = {smblctx = 0x4141414141414141, tid = 0x41414141, pid = {pid = 0x4141414141414141,
        task_id = 0x41414141, vnn = 0x41414141, unique_id = 0x4141414141414141}}, start = 0x4141414141414141,
    size = 0x4141414141414141, fnum = 0x4141414141414141, lock_type = 0x41414141, lock_flav = 0x41414141},
  current_lock_count = 0x41414141, can_lock = 0x41, can_read = 0x41, can_write = 0x41, modified = 0x41,
  is_directory = 0x41, aio_write_behind = 0x41, initial_delete_on_close = 0x41, delete_on_close = 0x41,
  posix_flags = 0x4141414141414141, is_sparse = 0x41, backup_intent = 0x41, use_ofd_locks = 0x41,
  fsp_name = 0x4141414141414141, name_hash = 0x41414141, mid = 0x4141414141414141, vfs_extension = 0x4141414141414141,
  fake_file_handle = 0x4141414141414141, notify = 0x4141414141414141, base_fsp = 0x4141414141414141,
  share_mode_flags_seqnum = 0x41414141, share_mode_flags = 0x4141, brlock_seqnum = 0x41414141,
  brlock_rec = 0x4141414141414141, dptr = 0x4141414141414141, print_file = 0x4141414141414141,
  num_aio_requests = 0x41414141, aio_requests = 0x4141414141414141, closing = 0x41,
  blocked_smb1_lock_reqs = 0x4141414141414141, lock_failure_seen = 0x41, lock_failure_offset = 0x4141414141414141,
  deferred_close = 0x4141414141414141}

If talloc_fill is not enabled, fsp->conn is usually NULL (if the memory has not been reallocated). In that case change_to_user_and_service() returns false (with the error message "Connection not open\n") and smbd later panics on an assert in vfswrap_offload_write_cleanup():

[2021/07/25 17:43:21.838902, 1, pid=15161] ../../source3/smbd/uid.c:446(change_to_user_and_service)
  change_to_user_and_service: Connection not open
[2021/07/25 17:43:21.838954, 0, pid=15161, class=vfs] ../../source3/modules/vfs_default.c:1964(vfswrap_offload_write_cleanup)
  PANIC: assert failed at ../../source3/modules/vfs_default.c(1964): ok
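To show the lifetime problem in isolation, here is a minimal self-contained sketch (plain C with invented names such as fake_fsp and pending_op; this is not Samba code) of the same race: a queued completion holds only a raw pointer to a handle, the connection teardown frees and poisons that handle first, and the late completion then reads the 0x41-filled memory, just like the dump above.

/*
 * Sketch only: invented types, not Samba's files_struct/tevent code.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct fake_conn { int open; };

struct fake_fsp {
        struct fake_conn *conn;
};

/* A queued async completion that only keeps a raw fsp pointer. */
struct pending_op {
        struct fake_fsp *fsp;
        void (*done)(struct pending_op *op);
};

static void offload_write_done(struct pending_op *op)
{
        /* By the time this runs, op->fsp may already be freed. */
        printf("conn open = %d\n", op->fsp->conn->open); /* use-after-free */
}

static void force_close_connection(struct fake_fsp *fsp)
{
        /* Mimics close_cnum() -> ... -> fsp_free(): poison then free,
         * the way talloc_fill poisons freed talloc memory with 0x41. */
        memset(fsp, 0x41, sizeof(*fsp));
        free(fsp);
}

int main(void)
{
        struct fake_conn conn = { .open = 1 };
        struct fake_fsp *fsp = calloc(1, sizeof(*fsp));
        fsp->conn = &conn;

        struct pending_op op = { .fsp = fsp, .done = offload_write_done };

        force_close_connection(fsp);    /* e.g. smbcontrol close-share */
        op.done(&op);                   /* late completion: crash or garbage */
        return 0;
}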
Created attachment 16712 [details]
Raw patch for 4.12.next.

Lev <lev@zadarastorage.com>, can you test this patch for 4.12.next? I have a similar one for master, but I want to see if this fixes it for you first.
(In reply to Jeremy Allison from comment #1)

Thanks, Jeremy, yes, the patch fixes the issue in my environment.
-Lev.
Created attachment 16716 [details]
git-am fix for master.

Passes CI. The MR is: https://gitlab.com/samba-team/samba/-/merge_requests/2103
Created attachment 16719 [details]
git-am fix for master.

Lev, Metze would like a more generic fix that prevents us from forgetting to add the aio to the fsp handle as we make more FSCTL calls async. Here is the generic fix (it applies cleanly to 4.12). Can you give this a test to make sure it still fixes the problem for you whilst I work on adding the tests he requires?

Thanks!
Jeremy.
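For illustration only, here is a rough sketch of the general idea (plain C with invented helper names handle_track_request() and handle_close(); this is not the actual Samba patch, which works on the real fsp and tevent requests): every async request taken out against a file handle is registered on that handle in one central place, so closing the handle can defuse late completions instead of each new async FSCTL path having to remember to do it.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct async_req {
        struct file_handle *owner;   /* cleared when the handle goes away */
        void (*completion)(struct async_req *req);
};

struct file_handle {
        struct async_req **reqs;     /* outstanding async requests */
        size_t num_reqs;
};

/* Central registration helper: one place to get right. */
static bool handle_track_request(struct file_handle *h, struct async_req *req)
{
        struct async_req **tmp =
                realloc(h->reqs, (h->num_reqs + 1) * sizeof(*tmp));
        if (tmp == NULL) {
                return false;
        }
        h->reqs = tmp;
        h->reqs[h->num_reqs++] = req;
        req->owner = h;
        return true;
}

/* Closing the handle detaches every outstanding request, so a
 * completion that fires later sees owner == NULL and bails out. */
static void handle_close(struct file_handle *h)
{
        for (size_t i = 0; i < h->num_reqs; i++) {
                h->reqs[i]->owner = NULL;
        }
        free(h->reqs);
        h->reqs = NULL;
        h->num_reqs = 0;
}

static void offload_write_completion(struct async_req *req)
{
        if (req->owner == NULL) {
                printf("handle already closed, dropping completion\n");
                return;
        }
        printf("completion runs against a live handle\n");
}

int main(void)
{
        struct file_handle h = {0};
        struct async_req req = { .completion = offload_write_completion };

        handle_track_request(&h, &req);
        handle_close(&h);          /* e.g. force-closed share */
        req.completion(&req);      /* late completion is now safe */
        return 0;
}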
Created attachment 16720 [details]
git-am fix for master.

Generic fix requested by Metze, also with tests. The tests probably won't back-port, but the fix will.

In CI at: https://gitlab.com/samba-team/devel/samba/-/pipelines/349475260
as MR: https://gitlab.com/samba-team/samba/-/merge_requests/2103
(In reply to Jeremy Allison from comment #4)

Hi Jeremy,
The latest (generic) patch fixes the issue as well.
Thanks,
Lev.
This bug was referenced in samba master:

6b6770c2ba83bf25da31623443c19a8de34e5ba4
62cd95096a76d5064b105c1b4971fa3eabd5f85d
0f4a8d26888ec156979a00480ed9886dcac7d426
c551d33c6bd2e74ea3a36bec5575a70d6833b98a
7e7ea761a37f46f758582981bc40404ffd815513
c013509680742ff45b2f5965a5564015da7d466b
Created attachment 16729 [details]
git-am fix for 4.15.rcNext.

Cherry-picked from master. Includes the test. Back-ports to 4.14 etc. won't include the test.
Created attachment 16730 [details]
git-am fix for 4.14.next, 4.13.next.

Back-port (without the test) of the fix that went into master and will go into 4.15.rcNext.
Ping, can we get this into 4.15.0?
Reassigning to Jule for inclusion in 4.13, 4.14 and 4.15.
Pushed to autobuild-v4-{15,14,13}-test.
This bug was referenced in samba v4-15-test:

04af36c4916713001a3a8b72b81946eb7a084cd1
69c5ab711066871b82bbb6db67642d808d104775
68ceb6c8f05d6c12aa2e1618ac205a9740126458
24b661c01ef54a01c1e73abfd7628781693a3224
654430f6f6f1a0c300be77e215fdb95fb808bf4e
9a23ff2ca2b101cba5614b238afca0c58658fc77
This bug was referenced in samba v4-14-test: 618fd6c259458f97530fcde0fb6600e70e6f2478
This bug was referenced in samba v4-13-test: 7c9aabe2dd01e77442d95582f17037b006bac9dd
Closing out bug report. Thanks!
This bug was referenced in samba v4-14-stable (Release samba-4.14.7): 618fd6c259458f97530fcde0fb6600e70e6f2478
This bug was referenced in samba v4-15-stable (Release samba-4.15.0rc3):

04af36c4916713001a3a8b72b81946eb7a084cd1
69c5ab711066871b82bbb6db67642d808d104775
68ceb6c8f05d6c12aa2e1618ac205a9740126458
24b661c01ef54a01c1e73abfd7628781693a3224
654430f6f6f1a0c300be77e215fdb95fb808bf4e
9a23ff2ca2b101cba5614b238afca0c58658fc77
This bug was referenced in samba v4-13-stable (Release samba-4.13.11): 7c9aabe2dd01e77442d95582f17037b006bac9dd