Created attachment 9664 [details]
capture showing STATUS_NO_MEMORY response

I'm getting the following error when running connectathon lock test 10 on a cifs mount vs. samba 4.1.3:

$ ./tlocklfs -t 10 /mnt/cifs/rawhide.test/
Creating parent/child synchronization pipes.

Test #10 - Make sure a locked region is split properly.
	Parent: 10.0  - F_TLOCK [ 0, 3] PASSED.
	Parent: 10.1  - F_ULOCK [ 1, 1] PASSED.
	Child:  10.2  - F_TEST  [ 0, 1] PASSED.
	Child:  10.3  - F_TEST  [ 2, 1] PASSED.
	Child:  10.4  - F_TEST  [ 3, ENDING] PASSED.
	Child:  10.5  - F_TEST  [ 1, 1] PASSED.
	Parent: 10.6  - F_ULOCK [ 0, 1] PASSED.
	Parent: 10.7  - F_ULOCK [ 2, 1] PASSED.
	Child:  10.8  - F_TEST  [ 0, 3] FAILED!
	Child:  **** Expected success, returned errno=121...
	Child:  **** Probably implementation error.

Error 121 is -EREMOTEIO, which we're returning because the server sent STATUS_NO_MEMORY in response to the request. Capture attached. I'll do a -d10 log in a bit.
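For readers who do not have the connectathon suite at hand, the sequence in test 10 can be approximated with plain POSIX fcntl() locks. The sketch below is not tlock.c; the helper names (set_region, region_free_for_child, run_split_sequence) are made up for illustration, and the expected results for steps 10.2 and 10.3 (region still locked) are an assumption based on POSIX lock-splitting semantics rather than on the tlock.c source. The parent takes and releases the locks, and a forked child probes each range with F_GETLK, since a process never conflicts with its own POSIX locks.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set (F_WRLCK) or clear (F_UNLCK) a POSIX lock on [start, start+len).
 * len == 0 means "to end of file". Returns 0 on success, -1 on error. */
static int set_region(int fd, short type, off_t start, off_t len)
{
	struct flock fl = { 0 };

	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = start;
	fl.l_len = len;
	return fcntl(fd, F_SETLK, &fl);
}

/* Ask from a separate process whether [start, start+len) could be
 * write-locked. Returns 1 if free, 0 if another process holds a
 * conflicting lock, -1 on error. */
static int region_free_for_child(const char *path, off_t start, off_t len)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		struct flock fl = { 0 };
		int fd = open(path, O_RDWR);

		if (fd < 0)
			_exit(2);
		fl.l_type = F_WRLCK;
		fl.l_whence = SEEK_SET;
		fl.l_start = start;
		fl.l_len = len;
		if (fcntl(fd, F_GETLK, &fl) < 0)
			_exit(2);
		/* F_GETLK rewrites l_type to F_UNLCK when nothing conflicts. */
		_exit(fl.l_type == F_UNLCK ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	if (WEXITSTATUS(status) == 0)
		return 1;
	if (WEXITSTATUS(status) == 1)
		return 0;
	return -1;
}

/* Replay the test 10 sequence against 'path'. Returns the number of
 * steps whose result differs from the expectation, or -1 if the file
 * cannot be opened. */
int run_split_sequence(const char *path)
{
	int bad = 0;
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return -1;
	bad += set_region(fd, F_WRLCK, 0, 3) != 0;     /* 10.0: lock [0,3)    */
	bad += set_region(fd, F_UNLCK, 1, 1) != 0;     /* 10.1: split at 1    */
	bad += region_free_for_child(path, 0, 1) != 0; /* 10.2: still locked  */
	bad += region_free_for_child(path, 2, 1) != 0; /* 10.3: still locked  */
	bad += region_free_for_child(path, 3, 0) != 1; /* 10.4: free to EOF   */
	bad += region_free_for_child(path, 1, 1) != 1; /* 10.5: hole is free  */
	bad += set_region(fd, F_UNLCK, 0, 1) != 0;     /* 10.6                */
	bad += set_region(fd, F_UNLCK, 2, 1) != 0;     /* 10.7                */
	bad += region_free_for_child(path, 0, 3) != 1; /* 10.8: all free      */
	close(fd);
	return bad;
}
```

Calling run_split_sequence() from a small main() on an existing, writeable file on the cifs mount should exercise the same final query as step 10.8; on a local filesystem it is expected to return 0.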
Created attachment 9665 [details] smbd -d10 -i log while reproducing problem
Hrm:

Could not fetch byte range lock record
NT error packet at ../source3/smbd/trans2.c(5589) cmd=50 (SMBtrans2) NT_STATUS_NO_MEMORY

...no indication as to why though, and some earlier queries went through just fine.
Is this repeatable ?
Yep. Happens every time...
Can you add some debugging here:

NTSTATUS query_lock(files_struct *fsp,
			uint64_t *psmblctx,
			uint64_t *pcount,
			uint64_t *poffset,
			enum brl_type *plock_type,
			enum brl_flavour lock_flav)
{
	struct byte_range_lock *br_lck = NULL;

	if (!fsp->can_lock) {
		return fsp->is_directory ? NT_STATUS_INVALID_DEVICE_REQUEST : NT_STATUS_INVALID_HANDLE;
	}

	if (!lp_locking(fsp->conn->params)) {
		return NT_STATUS_OK;
	}

	br_lck = brl_get_locks_readonly(fsp);
	if (!br_lck) {
		return NT_STATUS_NO_MEMORY;
	}

	return brl_lockquery(br_lck,
			psmblctx,
			messaging_server_id(fsp->conn->sconn->msg_ctx),
			poffset,
			pcount,
			plock_type,
			lock_flav);
}

I'm guessing this is where the NT_STATUS_NO_MEMORY is coming from, but we'll need to trace through to be sure.

Can you give me a binary I can use to reproduce ?

Jeremy.
Created attachment 9666 [details]
tlock.c from cthon test suite

I'll try the debugging when I get a chance. Here's the source for the test program. It's just "tlock.c" from the connectathon testsuite. Compile it with:

    $ gcc -DLINUX -DGLIBC=22 -DMMAP -DSTDARG -DLF_SUMMIT -o tlocklfs tlock.c -lm

Then mount up a writeable cifs share with a semi-recent kernel:

    # mount -t cifs //server/share /mnt/cifs

and just run this to reproduce:

    $ ./tlocklfs -t 10 /mnt/cifs
...oh and you might need some mount options for the cifs mount of course, depending on your setup. Actually, I'm not quite sure what sort of debugging you want there since I'm not that familiar with this code. Mind rolling up a debug patch that I can just apply and collect a log for you to look at?
Hmmm. Based on the "Could not fetch byte range lock record" log message (without adding any extra debugging), I think the error comes from here in brl_get_locks_internal:

	if (do_read_only) {
		NTSTATUS status;

		status = dbwrap_fetch(brlock_db, br_lck, key, &data);
		if (!NT_STATUS_IS_OK(status)) {
			DEBUG(3, ("Could not fetch byte range lock record\n"));
			TALLOC_FREE(br_lck);
			return NULL;
		}
		br_lck->record = NULL;
	} else {

...so I guess dbwrap_fetch returned an error? I'll see if I can figure out what that error is when I get a chance to re-test.
Created attachment 9677 [details]
brlock debug patch

brlock debug patch. I applied this, rebuilt samba and reproduced the problem. The -d10 log says:

    Could not fetch byte range lock record: -1073741275

...hrm, in hindsight I probably should have made it print in hex and unsigned...
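The signed decimal value in that log line can be turned into the usual hex form by reinterpreting it as an unsigned 32-bit integer. The following is a minimal standalone sketch of that conversion; the function names are invented for illustration and are not Samba APIs. It also splits out the two severity bits (bits 31-30, where 3 means error) and the low 16-bit code.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a status that was logged as a signed 32-bit decimal.
 * Signed-to-unsigned conversion is defined as reduction modulo 2^32. */
uint32_t status_as_u32(int32_t logged)
{
	return (uint32_t)logged;
}

/* Severity lives in the top two bits of an NT status value (3 == error). */
unsigned status_severity(uint32_t status)
{
	return (unsigned)(status >> 30);
}

/* The low 16 bits carry the individual status code. */
unsigned status_code(uint32_t status)
{
	return (unsigned)(status & 0xFFFFu);
}

/* Write the status as "0x" plus eight lowercase hex digits into buf.
 * Returns the snprintf() result. */
int format_status_hex(int32_t logged, char *buf, size_t len)
{
	return snprintf(buf, len, "0x%08" PRIx32, status_as_u32(logged));
}
```

Passing -1073741275 through format_status_hex() gives a string that can be looked up directly against the NT_STATUS definitions.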
Ok, that's:

    0xc0000225

...which is:

    #define NT_STATUS_NOT_FOUND NT_STATUS(0xC0000000 | 0x0225)

...at this point, I'm beyond my meager samba chops. Jeremy, any thoughts?
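Putting the two quoted snippets together suggests how a NOT_FOUND from the database fetch could reach the client as NO_MEMORY: the read-only branch drops the fetch status and returns NULL, and query_lock reports any NULL as NT_STATUS_NO_MEMORY. The sketch below models only that collapse with toy types. Every toy_ name is hypothetical and nothing here is Samba code; it also assumes brl_get_locks_readonly reaches the quoted do_read_only branch, which the thread implies but does not show.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_status;            /* stand-in for NTSTATUS */
#define TOY_STATUS_OK        0x00000000u
#define TOY_STATUS_NO_MEMORY 0xC0000017u
#define TOY_STATUS_NOT_FOUND 0xC0000225u

struct toy_lock_record {
	int num_locks;
};

/* Stand-in for the database fetch: reports NOT_FOUND when no record
 * is stored for the file (for example after the last lock is removed). */
static toy_status toy_fetch(int have_record, struct toy_lock_record *out)
{
	if (!have_record)
		return TOY_STATUS_NOT_FOUND;
	out->num_locks = 1;
	return TOY_STATUS_OK;
}

/* Mirrors the quoted read-only branch: any fetch failure becomes NULL.
 * The real status is only exposed here through *fetch_status so the
 * caller of this sketch can inspect it. */
static struct toy_lock_record *toy_get_locks_readonly(int have_record,
			struct toy_lock_record *storage,
			toy_status *fetch_status)
{
	*fetch_status = toy_fetch(have_record, storage);
	if (*fetch_status != TOY_STATUS_OK)
		return NULL;
	return storage;
}

/* Mirrors the quoted query_lock: NULL is always reported as NO_MEMORY,
 * whatever the underlying reason was. */
toy_status toy_query_lock(int have_record, toy_status *fetch_status)
{
	struct toy_lock_record storage = { 0 };
	struct toy_lock_record *rec;

	rec = toy_get_locks_readonly(have_record, &storage, fetch_status);
	if (rec == NULL)
		return TOY_STATUS_NO_MEMORY;
	return TOY_STATUS_OK;
}
```

With have_record set to 0, toy_query_lock() returns TOY_STATUS_NO_MEMORY while *fetch_status holds TOY_STATUS_NOT_FOUND, which matches the pair of values seen in the capture and the debug log.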
0xc0000225 == #define NT_STATUS_NOT_FOUND NT_STATUS(0xC0000000 | 0x0225)
FYI. I think the smbd server in master is broken right now. I'm also getting this from smb2.notify tests. I'll investigate... Jeremy.
OK, reproduced this (finally). Will update with my progress..
Ok, it's a change in the semantics of brl_get_locks_readonly(). Fix to follow..
Created attachment 9728 [details]
Fix for master.

Jeff, here's the fix for master. Once it gets in I'll back port to 4.1.next, 4.0.next.

Cheers,

Jeremy.
Created attachment 9735 [details]
git-am fix for 4.1.next and 4.0.next.

Note this is a *back-port* of the patch that went into master. Volker, please re-review this fix. It passes the tlocklfs POSIX tests here for both 4.1.x and 4.0.next.

Thanks,

Jeremy.
Comment on attachment 9728 [details] Fix for master. Was already reviewed by Volker and merged into master.
Karolin please push to 4.1.next and 4.0.next. Volker, I have an idea for a torture test for this I'll try and get to and put into smbtorture, but I'll do that as an additional one-off patch to master on the samba-technical mailing list - that shouldn't block the bug being closed out. Cheers, Jeremy.
(In reply to comment #18)
> Karolin please push to 4.1.next and 4.0.next.
>
> Volker, I have an idea for a torture test for this I'll try and get to and put
> into smbtorture, but I'll do that as an additional one-off patch to master on
> the samba-technical mailing list - that shouldn't block the bug being closed
> out.
>
> Cheers,
>
> Jeremy.

Pushed to autobuild-v4-1-test and autobuild-v4-0-test.
(In reply to comment #19)
> (In reply to comment #18)
> > Karolin please push to 4.1.next and 4.0.next.
> >
> > Volker, I have an idea for a torture test for this I'll try and get to and put
> > into smbtorture, but I'll do that as an additional one-off patch to master on
> > the samba-technical mailing list - that shouldn't block the bug being closed
> > out.
> >
> > Cheers,
> >
> > Jeremy.
>
> Pushed to autobuild-v4-1-test and autobuild-v4-0-test.

Pushed to v4-1-test and v4-0-test.
Closing out bug report.

Thanks!