Created attachment 18302
Contains packet traces & a synthetic client to repro the issue

Found a bug in DH2Q handling. Scenario as below.

1. Client sends an open request with Oplock: No oplock (0x00).
2. Open succeeds.
3. Client sends another open request for the same file, but this time with Oplock: Lease (0xff), and also requests a durable handle.
4. Open succeeds but the durable handle is not granted.

This causes the application to fail if there is a temporary network outage, as the client was not granted a durable handle.

Tried the same scenario against a Windows server and the durable handle is granted. Attached the packet traces for both the Windows server and the Samba server.

I have tried a potential fix. With it, the durable handle is granted for the above scenario.

In the code below, the block that sets state->got_oplock = true is guarded by e->op_type != LEASE_OPLOCK. That comparison is also true for the first open, which doesn't have any oplock at all, so the first open's entry is counted as an oplock when the second OPEN request is processed. By using (e->op_type != LEASE_OPLOCK && e->op_type != NO_OPLOCK) instead, the issue seems to be fixed.

    static bool delay_for_oplock_fn(
            struct share_mode_entry *e,
            bool *modified,
            void *private_data)
    {
            ....
            if (!state->got_oplock &&
                (e->op_type != LEASE_OPLOCK) &&
                !share_entry_stale_pid(e)) {
                    state->got_oplock = true;
            }
            ....
    }

If this fix is sufficient, then I can create a PR.
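To make the difference between the two checks concrete, here is a minimal standalone sketch. It is not Samba code: `struct entry`, the `stale` flag (standing in for share_entry_stale_pid(e)), the helper names and the numeric constant values are all assumptions made for illustration, and it says nothing about what the rest of the open path does with got_oplock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; the real definitions live in the Samba headers. */
enum { NO_OPLOCK = 0x0, LEASE_OPLOCK = 0x100 };

/* Hypothetical, stripped-down stand-in for struct share_mode_entry. */
struct entry {
	int op_type;
	bool stale; /* stands in for share_entry_stale_pid(e) */
};

/* The check as quoted in the report. */
static bool counts_as_oplock_current(const struct entry *e)
{
	return (e->op_type != LEASE_OPLOCK) && !e->stale;
}

/* The check with the proposed extra NO_OPLOCK test. */
static bool counts_as_oplock_proposed(const struct entry *e)
{
	return (e->op_type != LEASE_OPLOCK) &&
	       (e->op_type != NO_OPLOCK) &&
	       !e->stale;
}

/* Walk the existing entries and report whether got_oplock ends up set. */
static bool got_oplock_after_scan(const struct entry *entries, size_t n,
				  bool (*check)(const struct entry *))
{
	bool got_oplock = false;
	size_t i;

	assert(entries != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!got_oplock && check(&entries[i])) {
			got_oplock = true;
		}
	}
	return got_oplock;
}
```

With a single existing entry of type NO_OPLOCK (the first open in the scenario), scanning with counts_as_oplock_current sets got_oplock, while scanning with counts_as_oplock_proposed leaves it unset.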
(In reply to Sandeep Kanyal from comment #0) I fear it's more complex than that, because the share mode entries need to be sorted on the durable reconnect; otherwise the binary search used to find them no longer works. https://bugzilla.samba.org/show_bug.cgi?id=15624 https://gitlab.com/samba-team/samba/-/merge_requests/3583 is also related to durable handle problems...
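As a generic illustration of why a binary search depends on sorted entries, here is a small sketch. It is not the Samba lookup code: `struct sorted_entry`, its single `key` field and `find_entry` are hypothetical names, and the real entries are keyed differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry with a single sort key. */
struct sorted_entry {
	uint64_t key;
};

/*
 * Plain binary search over entries[0..n). Returns the index of the
 * entry with the given key, or -1 if the search does not find it.
 * The result is only reliable if the array is sorted ascending by key.
 */
static long find_entry(const struct sorted_entry *entries, size_t n,
		       uint64_t key)
{
	size_t lo = 0;
	size_t hi = n;

	assert(entries != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (entries[mid].key == key) {
			return (long)mid;
		}
		if (entries[mid].key < key) {
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return -1;
}
```

On the sorted keys {1, 3, 5, 7} a lookup of 7 finds index 3. On the same keys stored as {7, 5, 3, 1} the same lookup returns -1 even though the key is present, because the search discards the half of the array that contains it.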
(In reply to Stefan Metzmacher from comment #1) Thanks Stefan. How easy would it be to fix it? My application is failing because of this bug :(
(In reply to Sandeep Kanyal from comment #2) Stefan, one question. I was debugging a second bug (I'll file that shortly) and I saw that we keep the entries in sorted order. But I vaguely remember that we also free an entry if it doesn't have an oplock. So in this case, during disconnect, won't the first open be removed from smbXsrv_open_global.tdb, as it doesn't have any oplocks? (If I'm not wrong, that's the file where we keep the open files info.)
(In reply to Sandeep Kanyal from comment #0) The problem is that we should ignore the first open, which is a stat open, but currently don't, so we don't grant the Durable Handle on the second open and grant only an R lease instead of an RH lease.
(In reply to Ralph Böhme from comment #4) I see. But then the application would always fail in case of a disconnection. I'm actually using a Microsoft application to write to a Samba share, and it fails in case of network disruption (the server and client are in different geographic regions). The problem is that it writes for 10+ hours (sometimes 16+ hours, terabytes of data) and then fails. Subsequent retries also fail, as the network is slow and patchy. If we can somehow support the durable handle for this scenario, that would be really great.
(In reply to Sandeep Kanyal from comment #5) I guess you misunderstood what I was trying to say, let me try to rephrase my statement: We likely should grant a DH for the second open and also grant an RH lease as requested by the client. The reason why the current code (wrongly) doesn't do this is the first stat open.
(In reply to Ralph Böhme from comment #6) Perfect. Yes, the POC fix that I did worked. It granted a DH and an RWH lease, and the application was happy.