Bug 14428 - PANIC: assert failed in get_lease_type()
Summary: PANIC: assert failed in get_lease_type()
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.11.6
Hardware: All All
: P5 regression (vote)
Target Milestone: 4.13
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-02 13:02 UTC by Ralph Böhme
Modified: 2020-09-18 08:34 UTC (History)
6 users (show)

See Also:


Attachments
Patch for 4.11 backported from master (8.42 KB, patch)
2020-07-03 09:14 UTC, Ralph Böhme
metze: review+
Details
Patch for 4.12 cherry-picked from master (7.92 KB, patch)
2020-07-03 09:15 UTC, Ralph Böhme
metze: review+
Details
Patches for v4-12-test (12.57 KB, patch)
2020-07-28 07:18 UTC, Stefan Metzmacher
slow: review+
metze: ci-passed+
Details
Patches for v4-11-test (13.49 KB, patch)
2020-07-28 14:06 UTC, Stefan Metzmacher
slow: review+
metze: ci-passed+
Details
Patches for v4-13-test (15.50 KB, patch)
2020-08-31 14:35 UTC, Stefan Metzmacher
vl: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ralph Böhme 2020-07-02 13:02:25 UTC
The following PANIC was seen on a Samba 4.11.6 server

[2020/06/12 10:24:25.432271,  0] ../../source3/smbd/oplock.c:192(get_lease_type)
  PANIC: assert failed at ../../source3/smbd/oplock.c(192): NT_STATUS_IS_OK(status)
[2020/06/12 10:24:25.432333,  0] ../../source3/lib/util.c:824(smb_panic_s3)
  PANIC (pid 29560): assert failed: NT_STATUS_IS_OK(status)
[2020/06/12 10:24:25.433465,  0] ../../lib/util/fault.c:265(log_stack_trace)
  BACKTRACE: 32 stack frames:
   #0 /usr/lib/x86_64-linux-gnu/samba/libsamba-util.so.0(log_stack_trace+0x2d) [0x7fd874a00c8f]
   #1 /usr/lib/x86_64-linux-gnu/samba/libsmbconf.so.0(smb_panic_s3+0x54) [0x7fd8730b97a2]
   #2 /usr/lib/x86_64-linux-gnu/samba/libsamba-util.so.0(smb_panic+0x2c) [0x7fd874a00d88]
   #3 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(get_lease_type+0xbe) [0x7fd8745f723c]
   #4 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(+0x119e06) [0x7fd874597e06]
   #5 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(+0x11cfa0) [0x7fd87459afa0]
   #6 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(+0x11f253) [0x7fd87459d253]
   #7 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(create_file_default+0x477) [0x7fd87459e9e4]
   #8 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(+0x2143da) [0x7fd8746923da]
   #9 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(smb_vfs_call_create_file+0x73) [0x7fd8745a5225]
   #10 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(smbd_smb2_request_process_create+0x1df2) [0x7fd8745d6d34]
   #11 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(smbd_smb2_request_dispatch+0x1676) [0x7fd8745cbbe8]
   #12 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(+0x14f541) [0x7fd8745cd541]
   #13 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(tevent_common_invoke_fd_handler+0x81) [0x7fd873c69d5c]
   #14 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(+0xc1e3) [0x7fd873c701e3]
   #15 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(+0x9eaa) [0x7fd873c6deaa]
   #16 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(_tevent_loop_once+0xa3) [0x7fd873c69582]
   #17 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(tevent_common_loop_wait+0x17) [0x7fd873c69770]
   #18 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(+0x9e5a) [0x7fd873c6de5a]
   #19 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(_tevent_loop_wait+0xa) [0x7fd873c697d8]
   #20 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so(smbd_process+0x984) [0x7fd8745bb372]
   #21 /usr/sbin/smbd(+0xc0c2) [0x55ac5bbf40c2]
   #22 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(tevent_common_invoke_fd_handler+0x81) [0x7fd873c69d5c]
   #23 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(+0xc1e3) [0x7fd873c701e3]
   #24 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(+0x9eaa) [0x7fd873c6deaa]
   #25 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(_tevent_loop_once+0xa3) [0x7fd873c69582]
   #26 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(tevent_common_loop_wait+0x17) [0x7fd873c69770]
   #27 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(+0x9e5a) [0x7fd873c6de5a]
   #28 /usr/lib/x86_64-linux-gnu/samba/libtevent.so.0(_tevent_loop_wait+0xa) [0x7fd873c697d8]
   #29 /usr/sbin/smbd(main+0x1c92) [0x55ac5bbf5df3]
   #30 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fd870621830]

No futher debugging information was available.

The crash in get_lease_type() is triggered by a missing or invalid record in leases_db.tdb.

Not known how exaclty how we end up with a locking.tdb record referencing a non-existing leases_db.tdb record, all we can do for now is

* add a check for stale share_more_entry owner

* add some useful debugging to aid in analysis in the future

Have patch, need bugnumber.
Comment 1 Ralph Böhme 2020-07-03 09:14:32 UTC
Created attachment 16113 [details]
Patch for 4.11 backported from master
Comment 2 Ralph Böhme 2020-07-03 09:15:20 UTC
Created attachment 16114 [details]
Patch for 4.12 cherry-picked from master
Comment 3 Stefan Metzmacher 2020-07-06 13:09:54 UTC
There're will be a few more fixes, see
https://gitlab.com/samba-team/samba/-/merge_requests/1447

I also found that the
smb2.durable-v2-delay.durable_v2_reconnect_delay_msec triggers this
(at least in master).

There's seems to be a bug in share_mode_cleanup_disconnected(),
        data->have_share_modes = false;
        data->modified = true;
is not enough to delete the record in master.

The following naive patch to seems to fix it for
smb2.durable-v2-delay.durable_v2_reconnect_delay_msec:

diff --git a/source3/locking/share_mode_lock.c b/source3/locking/share_mode_lock.c
index a93141304b26..fceb09f7f2d3 100644
--- a/source3/locking/share_mode_lock.c
+++ b/source3/locking/share_mode_lock.c
@@ -664,9 +664,16 @@ static NTSTATUS share_mode_data_store(struct share_mode_data *d)
 
        d->sequence_number += 1;
 
-       status = locking_tdb_data_fetch(key, d, &ltdb);
-       if (!NT_STATUS_IS_OK(status)) {
-               return status;
+       if (d->have_share_modes) {
+               status = locking_tdb_data_fetch(key, d, &ltdb);
+               if (!NT_STATUS_IS_OK(status)) {
+                       return status;
+               }
+       } else {
+               ltdb = talloc_zero(d, struct locking_tdb_data);
+               if (ltdb == NULL) {
+                       return NT_STATUS_NO_MEMORY;
+               }
        }
 
        if (ltdb->num_share_entries != 0) {

But I'm not sure it's correct.
Comment 4 Jeremy Allison 2020-07-06 18:20:03 UTC
Dammit, I think I found this a couple of weeks ago when writing the test for async disconnect but didn't track it down to this issue. My "force disconnect" test ended up leaving a dead record in locking.tdb which I never tracked down fully (I cheated by moving the force-disconnect after the one that was complaining about the dead record). Sorry.
Comment 5 Volker Lendecke 2020-07-07 15:15:00 UTC
(In reply to Stefan Metzmacher from comment #3)
> But I'm not sure it's correct.

During a phone call Stefan, Ralph and I agreed that it would be easier to understand if share_mode_cleanup_disconnected() walked the share mode entries again using share_mode_forall_entries(), setting all disconnected entries to stale. share_mode_forall_entries() would then take care of removing the locking.tdb record. For durable file handles this is no performance penalty, as there will only ever be one of those, and even for a later implementation of persistent file handles this should not cause trouble as this is an asynchronous process independent of normal smbd operations.
Comment 6 Stefan Metzmacher 2020-07-28 07:18:19 UTC
Created attachment 16141 [details]
Patches for v4-12-test
Comment 7 Stefan Metzmacher 2020-07-28 14:06:59 UTC
Created attachment 16144 [details]
Patches for v4-11-test
Comment 8 Ralph Böhme 2020-07-30 15:08:10 UTC
Reassigning to Karolin for inclusion in 4.11 and 4.12.
Comment 9 Karolin Seeger 2020-08-06 11:20:23 UTC
(In reply to Ralph Böhme from comment #8)
Pushed to autobuild-v4-{12,11}-test.
Comment 10 Karolin Seeger 2020-08-11 07:19:26 UTC
(In reply to Karolin Seeger from comment #9)
Pushed to both branches.
Closing out bug report.

Thanks!
Comment 11 Stefan Metzmacher 2020-08-28 13:36:39 UTC
One origin that triggered this bug, is a regression in 4.13.

In 4.13 share_mode_cleanup_disconnected() doesn't delete the
locking.tdb entry.
Comment 12 Stefan Metzmacher 2020-08-31 14:35:02 UTC
Created attachment 16190 [details]
Patches for v4-13-test
Comment 13 Karolin Seeger 2020-09-01 07:22:16 UTC
(In reply to Stefan Metzmacher from comment #12)
Pushed to autobuild-v4-13-test.
Comment 14 Samba QA Contact 2020-09-03 14:56:05 UTC
this bug was referenced in commit df1ff55deb92827eb502c1857a0039743752c6b3 in the samba v4-13-test branch
Comment 15 Samba QA Contact 2020-09-03 15:46:07 UTC
This bug was referenced in samba v4-13-test commit 4aa4f12f5333bd4913989ae1f54027e480535eb4
Comment 16 Samba QA Contact 2020-09-03 15:46:08 UTC
This bug was referenced in samba v4-13-test commit 0a682a1898695839fe4933af61c8a1d4068005a3
Comment 17 Samba QA Contact 2020-09-03 15:46:11 UTC
This bug was referenced in samba v4-13-test commit 431192896a2e31487dcd5155e35ac333bb4dd472
Comment 18 Samba QA Contact 2020-09-03 15:46:14 UTC
This bug was referenced in samba v4-13-test commit df1ff55deb92827eb502c1857a0039743752c6b3
Comment 19 Samba QA Contact 2020-09-03 15:46:16 UTC
This bug was referenced in samba v4-13-test commit b83efaec1981600668fae5f67a483c2c698b6e68
Comment 20 Samba QA Contact 2020-09-18 08:34:01 UTC
This bug was referenced in samba v4-13-stable:

431192896a2e31487dcd5155e35ac333bb4dd472
e62a37e9748839f8f474b5a28c607f6630b66c71
4aa4f12f5333bd4913989ae1f54027e480535eb4
8a8b90eba76b94f3d7291ccc86fe1e80d4423ff3
b83efaec1981600668fae5f67a483c2c698b6e68
0a682a1898695839fe4933af61c8a1d4068005a3
df1ff55deb92827eb502c1857a0039743752c6b3