Bug 15649 - Durable handle is not granted when a previous OPEN exists with NoOplock
Summary: Durable handle is not granted when a previous OPEN exists with NoOplock
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.20.1
Hardware: All All
: P5 major (vote)
Target Milestone: ---
Assignee: Jule Anger
QA Contact: Samba QA Contact
URL: https://gitlab.com/samba-team/samba/-...
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-28 13:32 UTC by Sandeep Kanyal
Modified: 2024-11-19 14:48 UTC (History)
3 users (show)

See Also:


Attachments
Contains packet traces & a synthetic client to repro the issue (237.00 KB, application/x-gzip)
2024-05-28 13:32 UTC, Sandeep Kanyal
no flags Details
Patches for v4-21-test (124.78 KB, text/plain)
2024-10-10 14:07 UTC, Stefan Metzmacher
slow: review+
Details
Patches for v4-20-test (124.78 KB, text/plain)
2024-10-10 14:08 UTC, Stefan Metzmacher
slow: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Sandeep Kanyal 2024-05-28 13:32:51 UTC
Created attachment 18302 [details]
Contains packet traces & a synthetic client to repro the issue

Found a bug in DH2Q handling. Scenario as below.

1. Client Sends an open request with Oplock: No oplock (0x00)
2. Open Succeeds.
3. Client sends another open request for same file but this time with Oplock: Lease (0xff) and also requests for a durable handle.
4. Open succeeds but Durable handle is not granted.

This causes the application to fail in case there is a temporary network outage as the client was not granted a durable handle. Tried the same scenario against windows server & the Durable handle is granted. Attached the packet traces for both windows server & samba server.

I have tried a potential fix. Post that durable handle is granted for above scenario. In the below code, ideally for the second OPEN req we should be going to the below block and set state->got_oplock = true. But instead we are comparing e->op_type != LEASE_OPLOCK and this condition is not satisfied as fist open doesn't have any oplocks. By putting (e->op_type != LEASE_OPLOCK && e->op_type != NO_OPLOCK), issue seems to be fixed.


static bool delay_for_oplock_fn(
struct share_mode_entry *e,
bool *modified,
void *private_data)
{
....

 if (!state->got_oplock &&
   (e->op_type != LEASE_OPLOCK) &&
   !share_entry_stale_pid(e)) {
state->got_oplock = true;
}

....
}

If this fix is sufficient, then i can create a PR.
Comment 1 Stefan Metzmacher 2024-05-28 13:42:33 UTC
(In reply to Sandeep Kanyal from comment #0)

I fear it's more complex than that, because the share mode entries need to be sorted on the durable reconnect otherwise the binary search used to find them
no longer works.

https://bugzilla.samba.org/show_bug.cgi?id=15624
https://gitlab.com/samba-team/samba/-/merge_requests/3583 is also
related to durable handle problems...
Comment 2 Sandeep Kanyal 2024-05-28 14:09:38 UTC
(In reply to Stefan Metzmacher from comment #1)
Thanks Stefan. How easy would it be to fix it ? My application is failing coz of this bug :(
Comment 3 Sandeep Kanyal 2024-05-28 14:16:29 UTC
(In reply to Sandeep Kanyal from comment #2)
Stefan, one question. I was debugging more for a 2nd bug (ill file that shortly) and i saw that we keep the entries in sorted order. But i vaguely remember that
we also free the entries in case it doesn't have oplock. So in this case, during
disconnect, wont the 1st open be removed from the smbXsrv_open_global.tdb as it doesn't have any oplocks ? (if i'm not wrong that the file where we keep the open files info)
Comment 4 Ralph Böhme 2024-05-28 15:03:50 UTC
(In reply to Sandeep Kanyal from comment #0)
The problem is that we should ignore the first open which is a stat open and don't grant the Durable Handle on the second open plus RH lease instead of just R lease.
Comment 5 Sandeep Kanyal 2024-05-28 15:44:18 UTC
(In reply to Ralph Böhme from comment #4)
I see. But then the application would always fail in case of disconnection. Actually i'm using a Microsoft application to write to samba share & it fails in case of network disruption (The server & client are cross geo). The problem is that it writes for 10+ hours (sometime 16+ hours) and then fails (Terabytes of data). And subsequent retries also fail as network is slow & patchy. If we can somehow support the durable handle for this scenario, that would be really great.
Comment 6 Ralph Böhme 2024-05-28 20:12:49 UTC
(In reply to Sandeep Kanyal from comment #5)
I guess you misunderstood what I was trying to say, let me try to rephrase my statement:

We likely should grant a DH for the second open and also grant a RH lease as requested by the client. The reason why the current code doesn't (wrongly) do this, is the first stat open.
Comment 7 Sandeep Kanyal 2024-05-29 03:27:26 UTC
(In reply to Ralph Böhme from comment #6)

Perfect. Yea the POC fix that i did, worked. It granted a DH & a RWH lease and application was happy
Comment 8 Samba QA Contact 2024-10-10 14:00:04 UTC
This bug was referenced in samba master:

e65e1326a0214a7dfff75ea1e528e82c8fc64517
9b2417c2f04857709c25e3665cd783a68edf0cf2
1cc1586d84a65046ab7804f17297c6964bb76c23
8884d617310b47375e38c0386433c5e183703454
0893ae88180137d44f17196234f657d362543ff5
77c7741f39a0a9789bede7c4722bd3f35d4af3fd
9e98cd5c7a180521026b0d73a330bdaf2c8af73a
2869bd1a507e7376f0bb0ec68ed4e045b043cfdb
b1e5f5d8d2852b66ca4c858d14d367ffe228a88d
dd5b9e08c7a98c54b62d3b097c75faa09cd17da7
Comment 9 Stefan Metzmacher 2024-10-10 14:07:38 UTC
Created attachment 18464 [details]
Patches for v4-21-test
Comment 10 Stefan Metzmacher 2024-10-10 14:08:23 UTC
Created attachment 18465 [details]
Patches for v4-20-test
Comment 11 Ralph Böhme 2024-10-14 08:28:18 UTC
Reassigning to Jule for inclusion in 4.20 and 4.21.
Comment 12 Jule Anger 2024-10-14 08:44:48 UTC
Pushed to autobuild-v4-{21,20}-test.
Comment 13 Samba QA Contact 2024-10-14 10:53:11 UTC
This bug was referenced in samba v4-20-test:

e14520172bdb4544137c8d6186767590d4a3e88f
989d0c486e38b225d4b34aa3472d4b3fd80e4899
6ea02f3765930bd793396d57af3ce9504c280774
ad0fb085464d0932944d23330df90d04a964f330
041f15c8a8e75de293beceffacba93feca1e51f8
02a4ccfb32e829f8ba70eecf4da786e4aa5e9acf
11903eb47623e67e5552e329564ee20df6f4863c
22682be22bd8b793e6635ff1d3ab77d7e4da1827
1e9bd54ef066325002587faa01ca3c13795929bc
24e89430b170e8239e9ef50c4e7b729cec12f3d7
Comment 14 Samba QA Contact 2024-10-14 11:10:04 UTC
This bug was referenced in samba v4-21-test:

706b26c88b516fe397358305169f980841bf26f1
7d158ba707f56c159676ab0ba92189ae15fff3cc
1d97e7cc2cf5308a1bd4d2b114ee45a7f8fc4645
ceb5bbc7e306727b3c7d1214b7548614f5dd7444
97542f40947b8815851a2cdb82aee73bb521cfef
710dc5dca507114e0c55c377cd5f5653007bc104
da144e3cf5c627f5da72adb8234bb28caef95ecf
fb406446b9574198cfa90e69981829fe550814cf
bd13b39b6de06e20808425a31a1aace9871c776c
84c6a02adc4e0ee39cd8b02a953592f1a30f3630
Comment 15 Jule Anger 2024-10-14 11:23:12 UTC
Closing out bug report.

Thanks!
Comment 16 Samba QA Contact 2024-10-14 11:33:09 UTC
This bug was referenced in samba v4-21-stable (Release samba-4.21.1):

706b26c88b516fe397358305169f980841bf26f1
7d158ba707f56c159676ab0ba92189ae15fff3cc
1d97e7cc2cf5308a1bd4d2b114ee45a7f8fc4645
ceb5bbc7e306727b3c7d1214b7548614f5dd7444
97542f40947b8815851a2cdb82aee73bb521cfef
710dc5dca507114e0c55c377cd5f5653007bc104
da144e3cf5c627f5da72adb8234bb28caef95ecf
fb406446b9574198cfa90e69981829fe550814cf
bd13b39b6de06e20808425a31a1aace9871c776c
84c6a02adc4e0ee39cd8b02a953592f1a30f3630
Comment 17 Samba QA Contact 2024-11-19 14:48:21 UTC
This bug was referenced in samba v4-20-stable (Release samba-4.20.6):

e14520172bdb4544137c8d6186767590d4a3e88f
989d0c486e38b225d4b34aa3472d4b3fd80e4899
6ea02f3765930bd793396d57af3ce9504c280774
ad0fb085464d0932944d23330df90d04a964f330
041f15c8a8e75de293beceffacba93feca1e51f8
02a4ccfb32e829f8ba70eecf4da786e4aa5e9acf
11903eb47623e67e5552e329564ee20df6f4863c
22682be22bd8b793e6635ff1d3ab77d7e4da1827
1e9bd54ef066325002587faa01ca3c13795929bc
24e89430b170e8239e9ef50c4e7b729cec12f3d7