I am able to recreate this smb panic with the help of smbtorture test. This issue actually related to alternate data streams. In one of our customer core dumps, we could see the file is having alternate data stream. Here are the steps to recreate the panic: 1) create some file say oplock.txt. 2) create a stream on the file. ex: oplock.txt:stream1 3) Open file oplock.txt with BATCH_OPLOCK. do not close the handle 4) Open the stream file(oplock.txt:stream1) with or without oplock. Do not close the handle here as well. 5) Open the base file (oplock.txt) again. You will see smb panic. Just to remind we are using samba 3.6.12+ stack. I have reproduced this issue on a freenas server as well which is running samba 3.6.12 stack. But did not see much changes in Master code specific to this area. We should be hitting the same issue. Log snippet during panic on freenas: parse_share_modes: owrt: Fri Sep 5 14:53:12 2014 PDT, cwrt: Wed Dec 31 16:00:00 1969 PST, ntok: 0, num_share_modes: 2 [2014/09/05 15:01:38.881370, 10] locking/locking.c:725(parse_share_modes) parse_share_modes: share_mode_entry[0]: pid = 5178, share_access = 0x7, private_options = 0x0, access_mask = 0x120089, mid = 0x4, type= 0x2, gen_id = 1836070363, uid = 0, flags = 0, file _id 1e7aff2c:3616f:0, name_hash = 0x2ab3c7 [2014/09/05 15:01:38.881384, 10] locking/locking.c:725(parse_share_modes) parse_share_modes: share_mode_entry[1]: pid = 5178, share_access = 0x7, private_options = 0x0, access_mask = 0x0, mid = 0x0, type= 0x0, gen_id = 1836070364, uid = 0, flags = 0, file_id 1 e7aff2c:3616f:0, name_hash = 0x2ab3c7 [2014/09/05 15:01:38.881394, 0] lib/util.c:1117(smb_panic) PANIC (pid 5178): Bad no oplock entry. I think this is what its happening.. When we successfully open base file step#3, we register a shared mode lock with BATCH_OPLOCK on base file. When we try to open the stream file in step#4, we are doing an internal open on base file with access mask as zero, oplock as zero(NO_OPLOCK) and req as NULL in create_file_unixpath(). In create_file_unixpath() .... /* Open the base file. */ status = create_file_unixpath(conn, NULL, smb_fname_base, 0, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, base_create_disposition, 0, 0, 0, 0, 0, NULL, NULL, &base_fsp, NULL); TALLOC_FREE(smb_fname_base); ... But we avoid breaking existing batch oplock as we treat this as internal open. In open_file_ntcreate().. /* First pass - send break only on batch oplocks. */ if ((req != NULL) && ===> we skip if req is NULL. delay_for_batch_oplocks(fsp, req->mid, oplock_request, batch_entry)) { schedule_defer_open(lck, request_time, req); TALLOC_FREE(lck); return NT_STATUS_SHARING_VIOLATION; } But this entry is also getting added to share mode lock list for the base file. Now we have a BATCH_OPLOCK and NO_OPLOCK on base file oplock.txt. Next open request on base file is causing to traverse the list and leading to panic situation. I have observed that we are actually setting the flag "INTERNAL_OPEN_ONLY" in create_file_unixpath when we see the req as NULL. ... if (req == NULL) { oplock_request |= INTERNAL_OPEN_ONLY; } ... But in open_file_ntcreate(), we are unconditionally clearing this INTERNAL_OPEN_ONLY flag. ... /* Ensure no SAMBA_PRIVATE bits can be set. */ fsp->oplock_type = (oplock_request & ~SAMBA_PRIVATE_OPLOCK_MASK); .... Due to this, we never treated this shared lock entry as internal. In find_oplock_types() we have the checks.. ... if ((oplock_request & INTERNAL_OPEN_ONLY) || is_stat_open(fsp->access_mask)) { return; } .... if (lck->share_modes[i].op_type == NO_OPLOCK && is_stat_open(lck->share_modes[i].access_mask)) { /* We ignore stat opens in the table - they always have NO_OPLOCK and never get or cause breaks. JRA. */ continue; } ... This particular shared lock is skipping all these checks. It was neither having INTERNAL_OPEN_FLAG nor a stat open(since access mask is zero). I want to know.. a) If its intentional to clear the SAMBA_PRIVATE_OPLOCK_MASK before adding the lock to share mode list. b) Why are we not using access mask as (SYNCHRONIZE_ACCESS|FILE_READ_ATTRIBUTES| FILE_WRITE_ATTRIBUTES) which will mark the open as stat_open and will allow to skip this share lock check which will avoid panic.
stack looks like this but line numbers will not be matched with exact 3.6.12 as we taken few critical fixes in between. (gdb) bt #0 0x0000000802cf3ffc in thr_kill () from /lib/libc.so.7 #1 0x0000000802d8f58b in abort () from /lib/libc.so.7 #2 0x00000000007925d1 in dump_core () at lib/fault.c:414 #3 0x00000000007a246f in smb_panic (why=<optimized out>) at lib/util.c:1133 #4 0x0000000000502276 in find_oplock_types (fsp=<optimized out>, oplock_request=<optimized out>, lck=0x803781e50, pp_batch=0x7fffffffc5d0, pp_ex_or_batch=0x7fffffffc5d8, got_level2=0x7fffffffc62d, got_no_oplock=0x7fffffffc62e) at smbd/open.c:1112 #5 0x0000000000507c5e in open_file_ntcreate (fsp=<optimized out>, pinfo=<optimized out>, private_flags=<optimized out>, oplock_request=0, new_dos_attributes=0, create_options=<optimized out>, create_disposition=<optimized out>, share_access=<optimized out>, access_mask=65664, req=<optimized out>, conn=<optimized out>) at smbd/open.c:1942 #6 create_file_unixpath (conn=0x803753c50, req=0x803f28e00, smb_fname=0x803f28f90, access_mask=65664, share_access=7, create_disposition=1, create_options=<optimized out>, file_attributes=0, oplock_request=0, allocation_size=0, private_flags=0, sd=0x0, ea_list=0x0, result=0x7fffffffc6d8, pinfo=0x7fffffffc6e4) at smbd/open.c:3426 #7 0x0000000000508f4b in create_file_default (conn=0x803753c50, req=0x803f28e00, root_dir_fid=0, smb_fname=0x803f28f90, access_mask=65664, share_access=7, create_disposition=1, create_options=2097216, file_attributes=0, oplock_request=0, allocation_size=0, private_flags=0, sd=0x0, ea_list=0x0, result=0x7fffffffe328, pinfo=0x7fffffffe0d4) at smbd/open.c:3814
Created attachment 10262 [details] SMB2 torture oplock test to recreate the issue.
Created attachment 10285 [details] git-am fix for 4.1.next Fix that went into master - applies cleanly to 4.1.next. Torture test would need to be back-ported to 4.1.x (doesn't apply). Fix would need to be back-ported to 4.0.x also, not sure we need to do that but I'll let Volker decide. Jeremy.
Re-assigning to Karolin for inclusion in 4.1.next. Jeremy.
Pushed to autobuild-v4-1-test.
It seems like this patch breaks make test. Could someone please investigate? Thanks! From stdout: [499/1596 in 40m15s] samba3.smb2.oplock(s3dc) 0: Opening stream: oplock_test\test_stream1.txt:Stream One:$DATA with 9 1: Opening stream: oplock_test\test_stream1.txt::$DATA with 9 2: Opening stream: oplock_test\test_stream1.txt:Stream One:$DATA with 8 3: Opening stream: oplock_test\test_stream1.txt::$DATA with 8 Opening base file: oplock_test\test_stream1.txt with 9 4: Opening stream: oplock_test\test_stream1.txt:Stream One:$DATA with 9 Opening base file: oplock_test\test_stream1.txt with 9 5: Opening stream: oplock_test\test_stream1.txt::$DATA with 9 Acking to level II [0x01] in oplock handler Opening base file: oplock_test\test_stream1.txt with 9 6: Opening stream: oplock_test\test_stream1.txt:Stream One:$DATA with 8 Opening base file: oplock_test\test_stream1.txt with 9 7: Opening stream: oplock_test\test_stream1.txt::$DATA with 8 Acking to level II [0x01] in oplock handler Opening stream: oplock_test\test_stream1.txt:Stream One:$DATA with 8 Opening base file: oplock_test\test_stream1.txt with 9 Opening stream again: oplock_test\test_stream1.txt with 9 Acking to level II [0x01] in oplock handler UNEXPECTED(uxsuccess): samba3.smb2.oplock.stream1(s3dc) REASON: Content-Type: text/plain;charset=utf8 reason 0 FAILED (0 failures, 0 errors and 1 unexpected successes in 0 testsuites) A summary with detailed information can be found in: ./bin/ab/summary ERROR: test failed with exit code 1 From stderr: Traceback (most recent call last): File "/memdisk/kseeger/a41/b901549/samba/selftest/filter-subunit", line 93, in <module> ret = subunithelper.parse_results(msg_ops, statistics, sys.stdin) File "/memdisk/kseeger/a41/b901549/samba/selftest/subunithelper.py", line 183, in parse_results msg_ops.output_msg(l) File "/memdisk/kseeger/a41/b901549/samba/selftest/subunithelper.py", line 263, in output_msg sys.stdout.write(msg) IOError: [Errno 32] Broken pipe /memdisk/kseeger/a41/b901549/samba/selftest/selftest.pl: PID[9659]: Got SIGPIPE teardown environments. teardown_env(plugin_s4_dc) samba: EOF on stdin - terminating samba child process 38790 exited with value 0 teardown_env(dc) teardown_env(member) smbd child process 32474 exited with value 0 nmbd child process 32472 exited with value 0 winbindd child process 32473 exited with value 0 teardown_env(s4member) Exiting pid 33181 on SIGTERM smbd child process 32655 exited with value 0 nmbd child process 32653 exited with value 0 winbindd child process 32654 exited with value 0 teardown_env(maptoguest) smbd child process 34210 exited with value 0 nmbd child process 34208 exited with value 0 winbindd child process 34209 exited with value 0 teardown_env(s3dc) smbd child process 27828 exited with value 0 nmbd child process 27826 exited with value 0 winbindd child process 27827 exited with value 0 teardown_env(simpleserver) smbd child process 31850 exited with value 0 nmbd child process 31848 exited with value 0 winbindd child process 31849 exited with value 0 teardown_env(ktest) smbd child process 34000 exited with value 0 nmbd child process 33998 exited with value 0 winbindd child process 33999 exited with value 0 perl(9659)-+-pstree(61865) |-samba(28187) `-samba(33181) /memdisk/kseeger/a41/b901549/samba/selftest/selftest.pl: PID[9659]: Exiting... make: *** [test] Error 1
Created attachment 10318 [details] Patch
(In reply to comment #7) > Created attachment 10318 [details] > Patch This patch should fix it
Comment on attachment 10318 [details] Patch Ooohh. Unexpected success is good :-). LGTM !
Re-assigning back to Karolin for inclusion in 4.1.next. A test that was previously failing now passing is always good :-).
Thanks a lot, Volker and Jeremy!
Pushed both patches to autobuild-v4-1-test.
*** Bug 10283 has been marked as a duplicate of this bug. ***
Pushed to v4-1-test. Closing out bug report. Thanks!