Bug 13770 - vfs_fruit incorrectly maps NetATalk share modes onto file locks.
vfs_fruit incorrectly maps NetATalk share modes onto file locks.
Status: RESOLVED FIXED
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
4.9.4
All All
: P5 normal
: ---
Assigned To: Karolin Seeger
Samba QA Contact
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2019-01-31 14:42 UTC by Andreas Schneider
Modified: 2019-02-22 08:31 UTC (History)
4 users (show)

See Also:


Attachments
strace output of failing test (314.75 KB, application/gzip)
2019-01-31 14:42 UTC, Andreas Schneider
no flags Details
fixes for master (4.60 KB, patch)
2019-02-01 14:38 UTC, Andreas Schneider
asn: review? (jra)
Details
git-am fix for master (12.62 KB, patch)
2019-02-07 02:07 UTC, Jeremy Allison
no flags Details
git-am fix for master (12.68 KB, patch)
2019-02-07 02:16 UTC, Jeremy Allison
slow: review+
Details
git-am fix for master (13.89 KB, patch)
2019-02-07 23:15 UTC, Jeremy Allison
jra: review? (slow)
asn: review+
Details
git-am fix for 4.10rcNext, 4.9.next, 4.8.next (19.19 KB, patch)
2019-02-11 19:43 UTC, Jeremy Allison
asn: review+
jra: review? (slow)
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Andreas Schneider 2019-01-31 14:42:24 UTC
Created attachment 14811 [details]
strace output of failing test

Samba assumes that it can just replace POSIX record locks with OFD locks. This assumption is wrong and that's the reason why

make -j8 test TESTS="samba3.vfs.fruit_netatalk.nt4"

fails when OFD locks are supported by the system.

OFD locks are different from POSIX locks in that if you set a lock on two different fds for the same file they'll conflict with one another, however that allows them to be used in threaded programs.

This is what happens in that test. We lock the file twice on two different file descriptors.
Comment 1 Jeremy Allison 2019-02-01 08:16:44 UTC
Renamed bug to something less "the sky is falling". On first inclination this may be an incorrect assumption in a test. I'll take a look at this.
Comment 2 Jeremy Allison 2019-02-01 08:42:31 UTC
Andreas, the strace output is not helpful. What *is* helpful on something like this is a debug level 10 log of smbd with the text output of the failing test.

A raw strace dump is too fine-grained to be a good starting point. Can you replace the -d0 with -d10 in the Samba3.pm provision and re-run the failing test and attach the output from the smbd log please ?

Jeremy.
Comment 3 Jeremy Allison 2019-02-01 09:38:01 UTC
Bringing Ralph into the discussion. Andreas, the issue is that when vfs_fruit is loaded and Samba is acting in netatalk compatibility mode, it wants to map open modes such as FILE_SHARE_READ|FILE_SHARE_WRITE etc. into byte range locks that are seen by netatalk processes. That's the long-lived persistent locks you're seeing in your strace logs once the file is opened.

It looks like the vfs_fruit code that does this doesn't (yet) understand these locks will be mapped into OFD locks rather than POSIX ones inside the VFS_LOCK layer (when calling do_lock() directly with a lock-type flag of POSIX_LOCK) so we will need some fixes here.

I don't currently see a generic problem with OFD locks in other parts of Samba (and I'm assuming all the other locking tests pass with OFD locks selected, correct ?) so this hopefully is an isolated issues.
Comment 4 Andreas Schneider 2019-02-01 14:38:04 UTC
Created attachment 14818 [details]
fixes for master

Yes, I'm only seeing the vfs_fruit test failing.

If you want a debug level 10 output, just run the test:

source selftest/devel_env.sh
make -j8 test TESTS="samba3.vfs.fruit_netatalk.nt4" SMBD_OPTIONS="-d10"

I'm travelling to Brussels for FOSDEM, I can't upload big files via mobile phone. 

I think the two fixes in the attached patch are still valid. I appreciate if you look into the vfs_fruit module.
Comment 5 Jeremy Allison 2019-02-06 22:51:30 UTC
Dug deep into this - the good news is the test itself is inherently broken, due to a bug in the server code inside vfs_fruit :-).

Inside vfs_fruit.c we have:

flags = fcntl(fsp->fh->fd, F_GETFL);
if (flags == -1) {
        DBG_ERR("fcntl get flags [%s] fd [%d] failed [%s]\n",
                fsp_str_dbg(fsp), fsp->fh->fd, strerror(errno));
        return map_nt_error_from_unix(errno);
}

if (flags & (O_RDONLY|O_RDWR)) {
        /*
         * Applying fcntl read locks requires an fd opened for
         * reading. This means we won't be applying locks for
         * files openend write-only, but what can we do...
         */
         have_read = true;
}

Firstly, we shouldn't be calling fcntl(fsp->fh->fd, ..) directly. fsp->fh->fd may be a made up number from an underlying VFS module that has no meaning to a system call. This whole thing should be replaced with:

bool have_read = (access_mask & FILE_READ_DATA);

Secondly, in all POSIX systems - O_RDONLY is defined as *zero*. O_RDWR = 2.

Which means flags & (O_RDONLY|O_RDWR) becomes (flags & 2), not what we actually thought.

Still looking at this, but it looks like both the server and test code need fixing here.
Comment 6 Jeremy Allison 2019-02-07 00:31:57 UTC
Oh god - the vfs_fruit code misunderstands the definitions of deny_mode also.

deny_mode is *not* a bitmask, it's a set of discrete values. vfs_fruit has:

if (deny_mode & DENY_READ) and also (deny_mode & DENY_WRITE)

Deny modes are defined as:

/* deny modes */
#define DENY_DOS 0
#define DENY_ALL 1
#define DENY_WRITE 2
#define DENY_READ 3
#define DENY_NONE 4
#define DENY_FCB 7

so if deny_mode = DENY_WRITE, or if deny_mode = DENY_READ then it's going to trigger both the if (deny_mode & DENY_READ) *and* the (deny_mode & DENY_WRITE) conditions.
Comment 7 Jeremy Allison 2019-02-07 02:07:50 UTC
Created attachment 14833 [details]
git-am fix for master

Here is a git-am fix I think is correct for master. We can evaluate Andreas's changes separately I think.
Comment 8 Jeremy Allison 2019-02-07 02:16:36 UTC
Created attachment 14834 [details]
git-am fix for master

Slightly better variable names :-).
Comment 9 Ralph Böhme 2019-02-07 09:08:24 UTC
Comment on attachment 14834 [details]
git-am fix for master

Lgtm! Thanks again! Shall I push?
Comment 10 Jeremy Allison 2019-02-07 23:15:55 UTC
Created attachment 14837 [details]
git-am fix for master

samba3.vfs.fruit_netatalk.nt4 test code needed minor tweak to pass with either POSIX locks or OFD locks.

As we were simulating a NetAtalk process owning a deny-mode and read lock by setting an SMB2Lock to the same smbd process, when running with OFD locks the OS-level VFS_GETLOCK notices the lock is on a different file descriptor and so we return NT_STATUS_SHARING_VIOLATION by getting 'true' back from test_netatalk_lock().

When running with POSIX locks the lock on a different fd doesn't conflict (that's an OS-level VFS_GETLOCK call on the file descriptor) and so  test_netatalk_lock() returns false. Thus we only notice the lock when we try and apply the read lock over the existing write lock, and get NT_STATUS_FILE_LOCK_CONFLICT from the internal NT locking layer instead.

Correct fix is to change samba3.vfs.fruit_netatalk.nt4 to use torture_suite_add_2smb2_test() which uses two connections. That way when we try to open the second file handle on the second connection it detects the existing OS-level file lock whether it be POSIX or OFD locks, as it's coming from a different process - so we always return NT_STATUS_SHARING_VIOLATION.

That is actually a more correct test as a real NetATalk process will always be a separate process to smbd, so we're now emulating an existing NetATalk process in a more correct way.
Comment 11 Andreas Schneider 2019-02-08 15:45:39 UTC
Comment on attachment 14837 [details]
git-am fix for master

LGTM
Comment 12 Jeremy Allison 2019-02-11 19:43:12 UTC
Created attachment 14839 [details]
git-am fix for 4.10rcNext, 4.9.next, 4.8.next

Back-ported from master & BUG: references added. Applies cleanly to 4.10rcNext, 4.9.next, 4.8.next.
Comment 13 Andreas Schneider 2019-02-12 09:43:46 UTC
Karolin, please add the patchset to the relevant branches. Thanks!
Comment 14 Karolin Seeger 2019-02-21 11:21:53 UTC
(In reply to Andreas Schneider from comment #13)
Pushed to autobuild-v4-{10,9,8}-test.
Comment 15 Karolin Seeger 2019-02-22 08:31:04 UTC
(In reply to Karolin Seeger from comment #14)
Pushed to all branches.
Closing out bug report.

Thanks!