Bug 9722 - Samba does not properly handle Oplock breaks in compound requests
Summary: Samba does not properly handle Oplock breaks in compound requests
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: File services (show other bugs)
Version: unspecified
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-03-14 16:23 UTC by Richard Sharpe
Modified: 2013-05-13 07:29 UTC (History)
2 users (show)

See Also:


Attachments
A patch for the new test (4.99 KB, patch)
2013-03-14 16:25 UTC, Richard Sharpe
no flags Details
A slightly updated test (5.43 KB, patch)
2013-03-14 19:24 UTC, Richard Sharpe
no flags Details
A better patch that has seen more testing (995 bytes, patch)
2013-03-26 18:07 UTC, Richard Sharpe
no flags Details
Raw patch for master. (11.71 KB, patch)
2013-05-02 00:23 UTC, Jeremy Allison
no flags Details
git-am fix for master and 4.0.next. (11.71 KB, patch)
2013-05-02 22:15 UTC, Jeremy Allison
no flags Details
git-am fix for master and 4.0.next. (16.42 KB, patch)
2013-05-02 22:17 UTC, Jeremy Allison
no flags Details
git-am patchset for 4.0.x (17.18 KB, patch)
2013-05-07 19:18 UTC, Jeremy Allison
no flags Details
git-am patchset for 3.6.next (9.78 KB, patch)
2013-05-08 22:32 UTC, Jeremy Allison
rsharpe: review+
jra: review? (ira)
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Richard Sharpe 2013-03-14 16:23:52 UTC
This problem turned up when a customer was trying to use the ExtremeZ-IP AFP gateway.

In Samba 3.6.6 the smbd crashes. In 3.6.12 and above we send back an erroneous STATUS_CANCELLED.

Attached is an smbtorture test for this.
Comment 1 Richard Sharpe 2013-03-14 16:25:29 UTC
Created attachment 8646 [details]
A patch for the new test
Comment 2 Richard Sharpe 2013-03-14 16:25:44 UTC
I will submit a possible, but inadequate patch soon.
Comment 3 Richard Sharpe 2013-03-14 17:27:03 UTC
This patch fixes the problem as far as I can see, however, it does not fully mimic the behavior of Windows. Windows sends a STATUS_PENDING result, while this does not. Also, the line numbers might be off a bit:

--- samba-3.6.12.orig/source3/smbd/smb2_create.c 2012-06-24 10:21:16.000000000 -0700
+++ samba-3.6.12/source3/smbd/smb2_create.c      2013-03-14 01:00:44.575614048 -0700
@@ -244,7 +244,18 @@ NTSTATUS smbd_smb2_request_process_creat
        }
        tevent_req_set_callback(tsubreq, smbd_smb2_request_create_done, smb2req);

-       return smbd_smb2_request_pending_queue(smb2req, tsubreq);
+       /*
+        * If the open was deferred, then do not call request pending queue
+        * because that means an OpLock break has occurred and we want to
+        * wait.
+        */
+       if (smb2req->smb1req && open_was_deferred_smb2(smb2req->smb1req->sconn,
+                                               smb2req->smb1req->mid)) {
+               DEBUG(10, ("Deferred request (OpLock Break) ... doing nothing\n"));
+               return NT_STATUS_OK;
+       }
+       else
+               return smbd_smb2_request_pending_queue(smb2req, tsubreq);
 }
Comment 4 Richard Sharpe 2013-03-14 19:24:37 UTC
Created attachment 8647 [details]
A slightly updated test

This adds the test to the list of tests as well.
Comment 5 Richard Sharpe 2013-03-15 04:05:48 UTC
Currently, in smbd_smb2_request_pending_queue we do the following:

        if (req->in.vector_count > idx + SMBD_SMB2_NUM_IOV_PER_REQ) {
                /*
                 * We're trying to go async in a compound
                 * request chain. This is not allowed.
                 * Cancel the outstanding request.
                 */
                bool ok = tevent_req_cancel(req->subreq);
                if (ok) {
                        return NT_STATUS_OK;
                }
                TALLOC_FREE(req->subreq);
                return smbd_smb2_request_error(req,
                        NT_STATUS_INTERNAL_ERROR);
        }

However, if we have sent an OpLock Break, then we do want to go async and send a pending response if we want to do the same as Windows.

Perhaps the simplest way is to add a flag field to the request stating that we need to go async. Don't we have one of those already?
Comment 6 Richard Sharpe 2013-03-26 18:07:59 UTC
Created attachment 8684 [details]
A better patch that has seen more testing

This is a better patch that has seen more testing.

We do need to allow going async in a compound request if we want to handle this case. Also, as a result, only re-arrange the request if the command causing us to go async is not the first.
Comment 7 Richard Sharpe 2013-03-26 18:08:25 UTC
I should also have a better test soon as well. One that cleans up after itself.
Comment 8 Jeremy Allison 2013-04-30 22:28:08 UTC
Sorry for the delay Richard, I'm working on this now !

Jeremy.
Comment 9 Jeremy Allison 2013-05-02 00:23:57 UTC
Created attachment 8844 [details]
Raw patch for master.

Ok, just to show you where I'm going with this (it's a work in progress :-) here is a raw patch that contains both a slightly modified version of your torture test and a complete fix for master that allows it to pass the new smb2.compound-break test !

It breaks the interim2 test (returns NT_STATUS_CANCELLED - should be NT_STATUS_INTERNAL_ERROR) but it should show how I'm going to fix this.

I'll update further when I've gotten everything back working and broken into a set of micro-patches for master/4.0.x.

The nice thing is this should be pretty simple to back-port to 3.6.x, and addresses the issue fully as per the SMB2 spec + Windows implementation notes.

Cheers,

Jeremy.
Comment 10 Jeremy Allison 2013-05-02 18:34:47 UTC
Ah. The interim2 is a knownfail !

knownfail:^samba3.smb2.compound.interim2 # wrong return code (STATUS_CANCELLED)

That means the raw patch is actually correct...

I'll break it up into a set of micro-changes then submit to master.

Jeremy.
Comment 11 Jeremy Allison 2013-05-02 22:15:05 UTC
Created attachment 8846 [details]
git-am fix for master and 4.0.next.

Ok, I think this is what we need as a separate set of patches. It includes your torture test Richard (listed with you as the author). Once I've gotten everything passing tests I'll submit on the mailing list.

Jeremy.
Comment 12 Jeremy Allison 2013-05-02 22:16:30 UTC
Comment on attachment 8846 [details]
git-am fix for master and 4.0.next.

Bah, uploaded wrong file - sorry :-(.
Comment 13 Jeremy Allison 2013-05-02 22:17:18 UTC
Created attachment 8847 [details]
git-am fix for master and 4.0.next.

Ok, here is the real git-am patchset :-).

Jeremy.
Comment 14 Jeremy Allison 2013-05-07 19:18:32 UTC
Created attachment 8869 [details]
git-am patchset for 4.0.x

Code that was pushed to master (contains cherry-pick info).

Jeremy.
Comment 15 Jeremy Allison 2013-05-07 19:19:26 UTC
Re-assigning to Karolin for inclusion in 4.0.next.

(Patch already reviewed by 2 engineers, jra + realrichardsharpe).
Comment 16 Karolin Seeger 2013-05-08 19:27:02 UTC
Pushed to autobuild-v4-0-test.
Comment 17 Jeremy Allison 2013-05-08 19:41:58 UTC
Re-assigning to me to get the 3.6.next patchset done :-).
Comment 18 Jeremy Allison 2013-05-08 22:32:17 UTC
Created attachment 8873 [details]
git-am patchset for 3.6.next

Backport of the 4.0.x fix for 3.6.

Richard and Ira - please check !

Jeremy.
Comment 19 Richard Sharpe 2013-05-09 03:34:20 UTC
Comment on attachment 8873 [details]
git-am patchset for 3.6.next

Looks good to me.
Comment 20 Jeremy Allison 2013-05-09 16:22:32 UTC
Re-assigning back to Karolin so she can add the last patch to 3.6.next.

(Than we can finally close the thing :-).

Jeremy.
Comment 21 Karolin Seeger 2013-05-13 07:29:36 UTC
Pushed to v3-6-test.
Closing out bug report.

Thanks!