Created attachment 16549 [details]
smbd logs

I have an smb share with kernel oplocks enabled. Two clients open/close the same file in a loop:

>>> while True:
...     f = open('s:/file')
...     f.close()

One of the smbd processes panics on a double free:

[2021/03/15 18:11:58.687372, 0, pid=26201, effective(65534, 65534), real(65534, 0)] ../../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
  talloc: access after free error - first free may be at ../../source3/smbd/open.c:2909
[2021/03/15 18:11:58.687839, 0, pid=26201, effective(65534, 65534), real(65534, 0)] ../../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
  Bad talloc magic value - access after free

(gdb) bt
#0  0x00007f2f1c0ebf47 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f2f1c0ed8b1 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f2f1feec915 in dump_core () at ../../source3/lib/dumpcore.c:338
#3  0x00007f2f1fedc240 in smb_panic_s3 (why=0x7f2f1f872680 "Bad talloc magic value - access after free") at ../../source3/lib/util.c:861
#4  0x00007f2f209ce6e0 in smb_panic (why=0x7f2f1f872680 "Bad talloc magic value - access after free") at ../../lib/util/fault.c:174
#5  0x00007f2f1f86dcb1 in talloc_abort (reason=0x7f2f1f872680 "Bad talloc magic value - access after free") at ../../lib/talloc/talloc.c:508
#6  0x00007f2f1f86dcc4 in talloc_abort_access_after_free () at ../../lib/talloc/talloc.c:513
#7  0x00007f2f1f86dd49 in talloc_chunk_from_ptr (ptr=0x7f2f04d05d00) at ../../lib/talloc/talloc.c:530
#8  0x00007f2f1f870171 in _talloc_free (ptr=0x7f2f04d05d00, location=0x7f2f206bd480 "../../source3/smbd/open.c:2811") at ../../lib/talloc/talloc.c:1767
#9  0x00007f2f204d1dac in poll_open_fn (ev=0x7f2f04cb2560, te=0x7f2f04d1cf60, current_time=..., private_data=0x7f2f04da33a0) at ../../source3/smbd/open.c:2811
#10 0x00007f2f1f25d9fc in tevent_common_invoke_timer_handler (te=0x7f2f04d1cf60, current_time=..., removed=0x0) at ../../lib/tevent/tevent_timed.c:370
#11 0x00007f2f1f25dbff in tevent_common_loop_timer_delay (ev=0x7f2f04cb2560) at ../../lib/tevent/tevent_timed.c:442
#12 0x00007f2f1f25f8b9 in epoll_event_loop (epoll_ev=0x7f2f04c2c580, tvalp=0x7fff85b401a0) at ../../lib/tevent/tevent_epoll.c:667
#13 0x00007f2f1f2601cf in epoll_event_loop_once (ev=0x7f2f04cb2560, location=0x7f2f206cd478 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent_epoll.c:937
#14 0x00007f2f1f25c931 in std_event_loop_once (ev=0x7f2f04cb2560, location=0x7f2f206cd478 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent_standard.c:110
#15 0x00007f2f1f2542a4 in _tevent_loop_once (ev=0x7f2f04cb2560, location=0x7f2f206cd478 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent.c:772
#16 0x00007f2f1f2545f6 in tevent_common_loop_wait (ev=0x7f2f04cb2560, location=0x7f2f206cd478 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent.c:895
#17 0x00007f2f1f25c9d3 in std_event_loop_wait (ev=0x7f2f04cb2560, location=0x7f2f206cd478 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent_standard.c:141
#18 0x00007f2f1f254699 in _tevent_loop_wait (ev=0x7f2f04cb2560, location=0x7f2f206cd478 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent.c:914
#19 0x00007f2f205056a1 in smbd_process (ev_ctx=0x7f2f04cb2560, msg_ctx=0x7f2f04cf9320, sock_fd=9, interactive=false) at ../../source3/smbd/process.c:4170
#20 0x0000563b62d082bb in smbd_accept_connection (ev=0x7f2f04cb2560, fde=0x7f2f04cfa9e0, flags=1, private_data=0x7f2f04d656c0) at ../../source3/smbd/server.c:1019

The issue is easily reproducible if I add sleep(1) into poll_open_done() and run the simple script above from two clients.

The first free was in poll_open_done(). I guess the problem is that poll_open_done() frees open_rec (and its child open_rec->watch_req), but not open_rec->te. The deferred open_file() can fail again with NT_STATUS_NETWORK_BUSY, so the open will be deferred a second time. But now the original timer can fire and call poll_open_fn(), which tries to free open_rec->watch_req again.
Attached are logs from both smbd processes.
With which versions are you getting the crashes?
(In reply to Ralph Böhme from comment #1) I was using 4.12.9, but have now reproduced it with 4.14.0 as well. Attached new logs. -Lev.
Created attachment 16550 [details] smbd 4.14.0 logs
Created attachment 16552 [details]
Possible patch for master (and older versions)

Lev, can you test with this patch? It fixes the issue in my environment, where I was able to reproduce the problem.
Created attachment 16554 [details]
Possible patch for master (and older versions)

The previous version was incomplete and did not fix the root cause of the crash; this one should do it. The core fix is making sure we don't leave a dangling pointer behind in poll_open_done() when freeing the sharemode watch subreq. A copy of the pointer we just freed remains in open_rec->watch_req, and the open retry timer function accesses this pointer to free the watch_req again, which will either crash or do something much worse, like freeing a completely unrelated talloc object (we have seen crashes caused by this on another system).
(In reply to Ralph Böhme from comment #5) Thanks, Ralph, I applied the patchset in my environment and it fixed the panic. -Lev.
This bug was referenced in samba master: 065ed088b3d5710c288e46a5bf1e063f9a29c8cc 171a58ff3e8ee07cf5d7af08eabcb4a7379e7ce5 591c9196962b695b01c0d86918b8f8a263e9665c
Created attachment 16557 [details] git-am fix for 4.14.next, 4.13.next, 4.12.next. Cherry-picked from master.
Reassigning to Karolin for inclusion in 4.13 and 4.14.
Pushed to autobuild-v4-{14,13}-test.
This bug was referenced in samba v4-13-test: f8d67bc3d7d4c2cf1a16a67072fdd097044072dd 27cd9103dc68bf5a23026eb1be75127f0bc831cd 42e7b36454db64120b9940c42592f2fa6d668ad7
This bug was referenced in samba v4-14-test: da71738e987310f19d63d2b575c354156dadbf8f ebec84c886e7d7807609a3492e02fd1347898aba 051585ef3616fc845ebbcbcf18c46f9d0cab2b00
Pushed to both branches. Closing out bug report. Thanks!
This bug was referenced in samba v4-14-stable (Release samba-4.14.3): da71738e987310f19d63d2b575c354156dadbf8f ebec84c886e7d7807609a3492e02fd1347898aba 051585ef3616fc845ebbcbcf18c46f9d0cab2b00
This bug was referenced in samba v4-13-stable: f8d67bc3d7d4c2cf1a16a67072fdd097044072dd 27cd9103dc68bf5a23026eb1be75127f0bc831cd 42e7b36454db64120b9940c42592f2fa6d668ad7