Bug 10431 - STATUS_NO_MEMORY response from Query File Posix Lock request
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.1.3
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-02-07 18:59 UTC by Jeff Layton
Modified: 2014-03-25 09:30 UTC
CC List: 1 user

See Also:


Attachments
capture showing STATUS_NO_MEMORY response (6.35 KB, application/cap)
2014-02-07 18:59 UTC, Jeff Layton
no flags
smbd -d10 -i log while reproducing problem (582.70 KB, text/plain)
2014-02-07 19:02 UTC, Jeff Layton
no flags
tlock.c from cthon test suite (38.47 KB, text/plain)
2014-02-07 23:17 UTC, Jeff Layton
no flags
brlock debug patch (899 bytes, patch)
2014-02-12 18:52 UTC, Jeff Layton
no flags
Fix for master. (1.82 KB, patch)
2014-02-27 00:36 UTC, Jeremy Allison
no flags
git-am fix for 4.1.next and 4.0.next. (3.09 KB, patch)
2014-02-28 00:41 UTC, Jeremy Allison
vl: review+

Description Jeff Layton 2014-02-07 18:59:24 UTC
Created attachment 9664 [details]
capture showing STATUS_NO_MEMORY response

I'm getting the following error when running connectathon lock test 10 on a cifs mount against a samba 4.1.3 server:

$  ./tlocklfs -t 10 /mnt/cifs/rawhide.test/
Creating parent/child synchronization pipes.

Test #10 - Make sure a locked region is split properly.
	Parent: 10.0  - F_TLOCK [               0,               3] PASSED.
	Parent: 10.1  - F_ULOCK [               1,               1] PASSED.
	Child:  10.2  - F_TEST  [               0,               1] PASSED.
	Child:  10.3  - F_TEST  [               2,               1] PASSED.
	Child:  10.4  - F_TEST  [               3,          ENDING] PASSED.
	Child:  10.5  - F_TEST  [               1,               1] PASSED.
	Parent: 10.6  - F_ULOCK [               0,               1] PASSED.
	Parent: 10.7  - F_ULOCK [               2,               1] PASSED.
	Child:  10.8  - F_TEST  [               0,               3] FAILED!
	Child:  **** Expected success, returned errno=121...
	Child:  **** Probably implementation error.


Error 121 is EREMOTEIO, which the cifs client returns because the server sent STATUS_NO_MEMORY in response to the Query File Posix Lock request. Capture attached. I'll attach a -d10 log in a bit.
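
For reference, here is a minimal sketch of the lock sequence that test 10 exercises, replayed on a local scratch file with plain fcntl() POSIX locks rather than on a cifs mount. The helper names (set_range, probe_range, run_split_sequence) and the /tmp path are made up for illustration, and the expected results for the intermediate steps are simply what POSIX byte-range lock semantics imply; this is not code from tlock.c.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set (F_WRLCK) or clear (F_UNLCK) a POSIX lock on [start, start+len).
 * len == 0 means "to end of file", like ENDING in the test output. */
static int set_range(int fd, short type, off_t start, off_t len)
{
    struct flock fl = { .l_type = type, .l_whence = SEEK_SET,
                        .l_start = start, .l_len = len };

    assert(type == F_WRLCK || type == F_UNLCK);
    return fcntl(fd, F_SETLK, &fl);
}

/* Fork a child and have it ask, via F_GETLK, whether a write lock on the
 * range would be blocked by the parent's locks.
 * Returns 1 if blocked, 0 if free, -1 on error. */
static int probe_range(int fd, off_t start, off_t len)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                            .l_start = start, .l_len = len };

        if (fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) <= 1 ? WEXITSTATUS(status) : -1;
}

/* Replay steps 10.0 to 10.8 on a scratch file. Returns 0 if every step
 * behaves as expected, 1 if some step does not, -1 if setup fails. */
static int run_split_sequence(void)
{
    char path[] = "/tmp/splitlockXXXXXX";  /* hypothetical scratch file */
    int fd = mkstemp(path);
    int ok;

    if (fd < 0)
        return -1;
    unlink(path);

    ok = set_range(fd, F_WRLCK, 0, 3) == 0    /* 10.0: lock [0,3)      */
      && set_range(fd, F_UNLCK, 1, 1) == 0    /* 10.1: punch out byte 1 */
      && probe_range(fd, 0, 1) == 1           /* 10.2: still locked     */
      && probe_range(fd, 2, 1) == 1           /* 10.3: still locked     */
      && probe_range(fd, 3, 0) == 0           /* 10.4: never locked     */
      && probe_range(fd, 1, 1) == 0           /* 10.5: freed by 10.1    */
      && set_range(fd, F_UNLCK, 0, 1) == 0    /* 10.6                   */
      && set_range(fd, F_UNLCK, 2, 1) == 0    /* 10.7                   */
      && probe_range(fd, 0, 3) == 0;          /* 10.8: nothing left     */

    close(fd);
    return ok ? 0 : 1;
}
```

Step 10.8 is the one that fails over cifs: by then the parent has released every lock on the file, so the child's query should report the range as free.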
Comment 1 Jeff Layton 2014-02-07 19:02:50 UTC
Created attachment 9665 [details]
smbd -d10 -i log while reproducing problem
Comment 2 Jeff Layton 2014-02-07 19:06:13 UTC
Hrm:

Could not fetch byte range lock record
NT error packet at ../source3/smbd/trans2.c(5589) cmd=50 (SMBtrans2) NT_STATUS_NO_MEMORY

...no indication as to why though, and some earlier queries went through just fine.
Comment 3 Jeremy Allison 2014-02-07 21:00:37 UTC
Is this repeatable?
Comment 4 Jeff Layton 2014-02-07 21:17:57 UTC
Yep. Happens every time...
Comment 5 Jeremy Allison 2014-02-07 21:29:03 UTC
Can you add some debugging here:

NTSTATUS query_lock(files_struct *fsp,
                        uint64_t *psmblctx,
                        uint64_t *pcount,
                        uint64_t *poffset,
                        enum brl_type *plock_type,
                        enum brl_flavour lock_flav)
{
        struct byte_range_lock *br_lck = NULL;

        if (!fsp->can_lock) {
                return fsp->is_directory ? NT_STATUS_INVALID_DEVICE_REQUEST : NT_STATUS_INVALID_HANDLE;
        }

        if (!lp_locking(fsp->conn->params)) {
                return NT_STATUS_OK;
        }

        br_lck = brl_get_locks_readonly(fsp);
        if (!br_lck) {
                return NT_STATUS_NO_MEMORY;
        }

        return brl_lockquery(br_lck,
                        psmblctx,
                        messaging_server_id(fsp->conn->sconn->msg_ctx),
                        poffset,
                        pcount,
                        plock_type,
                        lock_flav);
}

I'm guessing this is where the NT_STATUS_NO_MEMORY is coming from, but we'll need to trace through to be sure.

Can you give me a binary I can use to reproduce?

Jeremy.
Comment 6 Jeff Layton 2014-02-07 23:17:18 UTC
Created attachment 9666 [details]
tlock.c from cthon test suite

I'll try the debugging when I get a chance.

Here's the source for the test program. It's just "tlock.c" from the connectathon testsuite. Compile it with:

    $ gcc  -DLINUX -DGLIBC=22 -DMMAP -DSTDARG -DLF_SUMMIT -o tlocklfs tlock.c -lm

Then mount up a writeable cifs share with a semi-recent kernel:

    # mount -t cifs //server/share /mnt/cifs

and just run this to reproduce:

    $ ./tlocklfs -t 10 /mnt/cifs
Comment 7 Jeff Layton 2014-02-07 23:19:43 UTC
...oh and you might need some mount options for the cifs mount of course, depending on your setup.

Actually, I'm not quite sure what sort of debugging you want there since I'm not that familiar with this code. Mind rolling up a debug patch that I can just apply and collect a log for you to look at?
Comment 8 Jeff Layton 2014-02-10 13:37:03 UTC
Hmmm. Based on the "Could not fetch byte range lock record" log message (without adding any extra debugging), I think the error comes from here in brl_get_locks_internal:

        if (do_read_only) {
                NTSTATUS status;
                status = dbwrap_fetch(brlock_db, br_lck, key, &data);
                if (!NT_STATUS_IS_OK(status)) {
                        DEBUG(3, ("Could not fetch byte range lock record\n"));
                        TALLOC_FREE(br_lck);
                        return NULL;
                }
                br_lck->record = NULL;
        } else {


...so I guess dbwrap_fetch returned an error? I'll see if I can figure out what that error is when I get a chance to re-test.
Comment 9 Jeff Layton 2014-02-12 18:52:07 UTC
Created attachment 9677 [details]
brlock debug patch

brlock debug patch. I applied this, rebuilt samba and reproduced the problem. The -d10 log says:

    Could not fetch byte range lock record: -1073741275

...hrm, in hindsight I probably should have made it print in hex and unsigned...
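
A small standalone sketch of that conversion, assuming the status was logged as a signed 32-bit integer. The function names are made up for illustration and are not Samba APIs; inside Samba itself, nt_errstr() would give the symbolic name directly.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a status that was logged with %d as the unsigned 32-bit
 * value it really is. The conversion is modulo 2^32, so well defined. */
static uint32_t status_from_logged(int32_t logged)
{
    return (uint32_t)logged;
}

/* Low 16 bits: the code part of the status. */
static uint32_t status_code(uint32_t status)
{
    return status & 0xFFFFu;
}

/* Top two bits: the severity; 3 (binary 11) marks an error. */
static uint32_t status_severity(uint32_t status)
{
    return status >> 30;
}

/* Render the logged value as 0x%08x, the form the NT_STATUS_* defines use. */
static void format_status(int32_t logged, char *buf, size_t len)
{
    assert(buf != NULL && len >= 11);  /* "0x" + 8 hex digits + NUL */
    snprintf(buf, len, "0x%08" PRIx32, status_from_logged(logged));
}
```

Passing -1073741275 to format_status() yields the string 0xc0000225, which can then be looked up among the NT_STATUS_* defines.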
Comment 10 Jeff Layton 2014-02-12 18:57:09 UTC
Ok, that's:

    0xc0000225

...which is:

    #define NT_STATUS_NOT_FOUND NT_STATUS(0xC0000000 | 0x0225)

...at this point, I'm beyond my meager samba chops. Jeremy, any thoughts?
Comment 11 Jeremy Allison 2014-02-12 18:58:51 UTC
0xc0000225 is NT_STATUS_NOT_FOUND:

    #define NT_STATUS_NOT_FOUND NT_STATUS(0xC0000000 | 0x0225)
Comment 12 Jeremy Allison 2014-02-20 20:10:37 UTC
FYI. I think the smbd server in master is broken right now. I'm also getting this from smb2.notify tests.

I'll investigate...

Jeremy.
Comment 13 Jeremy Allison 2014-02-26 22:18:43 UTC
OK, reproduced this (finally). Will update with my progress..
Comment 14 Jeremy Allison 2014-02-26 23:54:50 UTC
Ok, it's a change in the semantics of brl_get_locks_readonly(). Fix to follow..
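
To make the failure mode in this thread concrete, here is a toy model of it. This is NOT the actual patch and does not use Samba's data structures: every toy_* name is hypothetical, and the idea that a missing record simply means "no locks on this file" is an assumption drawn from the NT_STATUS_NOT_FOUND result seen in comment 10. It only shows how a "not found" fetch result can surface as NT_STATUS_NO_MEMORY when the caller treats every failure alike, and one way a caller could tell the two apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Numeric values match the NTSTATUS codes of the same name. */
#define TOY_STATUS_OK        0x00000000u
#define TOY_STATUS_NO_MEMORY 0xC0000017u
#define TOY_STATUS_NOT_FOUND 0xC0000225u

struct toy_record {
    uint64_t file_id;
    unsigned num_locks;
};

/* Hypothetical stand-in for the lock database: only file 1 has a record. */
static const struct toy_record toy_db[] = { { 1, 2 } };

/* Look up a record; a file with no record yields NOT_FOUND. */
static uint32_t toy_fetch(uint64_t file_id, const struct toy_record **out)
{
    size_t i;

    for (i = 0; i < sizeof(toy_db) / sizeof(toy_db[0]); i++) {
        if (toy_db[i].file_id == file_id) {
            *out = &toy_db[i];
            return TOY_STATUS_OK;
        }
    }
    *out = NULL;
    return TOY_STATUS_NOT_FOUND;
}

/* Failure mode: any fetch failure is reported as "out of memory". */
static uint32_t toy_query_collapsing(uint64_t file_id, int *locked)
{
    const struct toy_record *rec;

    assert(locked != NULL);
    if (toy_fetch(file_id, &rec) != TOY_STATUS_OK)
        return TOY_STATUS_NO_MEMORY;
    *locked = rec->num_locks > 0;
    return TOY_STATUS_OK;
}

/* Alternative: a missing record is treated as "no locks held". */
static uint32_t toy_query_tolerant(uint64_t file_id, int *locked)
{
    const struct toy_record *rec;
    uint32_t status;

    assert(locked != NULL);
    status = toy_fetch(file_id, &rec);
    if (status == TOY_STATUS_NOT_FOUND) {
        *locked = 0;
        return TOY_STATUS_OK;
    }
    if (status != TOY_STATUS_OK)
        return status;
    *locked = rec->num_locks > 0;
    return TOY_STATUS_OK;
}
```

Querying file 2 (no record) with toy_query_collapsing() returns TOY_STATUS_NO_MEMORY, while toy_query_tolerant() returns TOY_STATUS_OK with the range reported as unlocked.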
Comment 15 Jeremy Allison 2014-02-27 00:36:34 UTC
Created attachment 9728 [details]
Fix for master.

Jeff, here's the fix for master. Once it gets in I'll back-port it to 4.1.next and 4.0.next.

Cheers,

Jeremy.
Comment 16 Jeremy Allison 2014-02-28 00:41:24 UTC
Created attachment 9735 [details]
git-am fix for 4.1.next and 4.0.next.

Note this is a *back-port* of the patch that went into master.

Volker, please re-review this fix. It passes the tlocklfs POSIX tests here for both 4.1.next and 4.0.next.

Thanks,

Jeremy.
Comment 17 Jeremy Allison 2014-02-28 00:41:59 UTC
Comment on attachment 9728 [details]
Fix for master.

Was already reviewed by Volker and merged into master.
Comment 18 Jeremy Allison 2014-02-28 17:36:31 UTC
Karolin, please push to 4.1.next and 4.0.next.

Volker, I have an idea for a torture test for this that I'll try to write and add to smbtorture, but I'll submit that as a separate one-off patch to master on the samba-technical mailing list. That shouldn't block the bug being closed out.

Cheers,

Jeremy.
Comment 19 Karolin Seeger 2014-03-10 15:32:49 UTC
(In reply to comment #18)

Pushed to autobuild-v4-1-test and autobuild-v4-0-test.
Comment 20 Karolin Seeger 2014-03-25 09:30:52 UTC
(In reply to comment #19)

Pushed to v4-1-test and v4-0-test.
Closing out bug report.

Thanks!