Bug 13979 (CVE-2021-43566) - CVE-2021-43566 [SECURITY] mkdir race condition allows share escape in Samba 4.x
Summary: CVE-2021-43566 [SECURITY] mkdir race condition allows share escape in Samba 4.x
Status: RESOLVED FIXED
Alias: CVE-2021-43566
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.10.4
Hardware: All All
: P5 critical (vote)
Target Milestone: ---
Assignee: Jeremy Allison
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on: 14842
Blocks:
  Show dependency treegraph
 
Reported: 2019-05-30 00:30 UTC by Andrew Bartlett
Modified: 2022-12-16 12:07 UTC (History)
19 users (show)

See Also:


Attachments
detail from security researcher (3.11 KB, text/plain)
2019-05-30 00:30 UTC, Andrew Bartlett
no flags Details
git am for master - prototype fix for mkdir (7.49 KB, patch)
2019-05-31 22:38 UTC, Jeremy Allison
no flags Details
Race reproduction script from June 2, 2019 (1.18 KB, text/plain)
2019-06-02 19:42 UTC, hansmi
no flags Details
Minor patch to address use-after free in May 31, 2019 patch from Jeremy (615 bytes, patch)
2019-06-03 21:51 UTC, hansmi
no flags Details
Updated git-am fix for master (8.64 KB, patch)
2019-06-04 01:17 UTC, Jeremy Allison
no flags Details
git-am fix for master v3 (10.29 KB, patch)
2019-06-04 03:14 UTC, Jeremy Allison
no flags Details
Patch for 4.13.next. (2.64 KB, patch)
2021-09-23 18:13 UTC, Jeremy Allison
slow: review+
Details
CVE advisory. (3.35 KB, text/plain)
2021-09-24 18:51 UTC, Jeremy Allison
slow: review+
jmcd: review+
Details
Updated CVE text . (3.35 KB, text/plain)
2021-11-10 17:25 UTC, Jeremy Allison
slow: review+
Details
Updated for for 4.13.next. (2.64 KB, patch)
2021-11-10 17:27 UTC, Jeremy Allison
slow: review+
slow: ci-passed+
Details
Updated CVE text (3.35 KB, text/plain)
2022-01-07 14:32 UTC, Ralph Böhme
jra: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Comment 1 Jeremy Allison 2019-05-30 01:19:56 UTC
Ah, I see the problem. The non_widelink_open() code that
protects us from this in the open/creat code path isn't
being used in the mkdir case.

Rename will suffer from the same problem for the destination
path specifier. I'll try and fix both cases for this bug.

Ultimately we need to modify the VFS to use the syscallAT()
varients of all the system calls, but that's a VFS rewrite
we will have to schedule for another day.
Comment 2 Andrew Bartlett 2019-05-30 02:45:08 UTC
Michael,

Can you please confirm you are happy to be identified as the reporter of this issue (in the same was as your previous reports)?

Thanks!
Comment 3 Jeremy Allison 2019-05-30 03:19:37 UTC
OK, I think I can see a way to fix this (on Linux
at least) without breaking the ABI.

It's really ugly though, but will at least fix
the problems with the syscalls that take
2 pathnames (rename and link are the two I
can think of).
Comment 4 Jeremy Allison 2019-05-30 17:09:59 UTC
I'm planning on creating a patchset that will fix uses of

mkdir
rename
symlink
mknod

which can be done in the same way as we handled the previous 'open' case (chdir).

rename
link

are harder to fix, as the method used in the open case won't fully work here as they take 2 pathnames, both of which can by symlink raced.

To fix them, I'm planning to fix the 'src' component by doing the chdir method we use for open, and then inside vfs_default.c add code in the default rename/link syscall layer to open a temporary file handle to the parent directory of the target (using open(name, O_DIRECTORY|O_RDONLY, 0) - which is guaranteed not to conflict with locks (for the POSIX lock case) or leases, and then use the syscall variants of renameat(AT_CWD, src, fd, dst), linkat(AT_CWD, src, fd, dst) to implement these safely. This way we keep the VFS ABI and can easily back-port these changes.

Going forward we *really* need to remove the SMB1 server code, which will prevent any client instigated symlink races (only SMB1 extensions creates server-side symlinks) and modify the VFS to use the XXXat() syscal versions.

Sound like a plan ?
Comment 5 hansmi 2019-05-30 21:42:03 UTC
> Can you please confirm you are happy to be identified as the reporter of this issue (in the same was as your previous reports)?

Yes, I am happy with that.

Jeremy, I agree that the *at syscalls appear to be the most appropriate way where available.
Comment 6 Jeremy Allison 2019-05-31 22:38:55 UTC
Created attachment 15212 [details]
git am for master - prototype fix for mkdir

OK Michael, can you try to reproduce with this patch added.

I was able to reproduce your race without it, but once I added this patch I can't reproduce the issue anymore.

It adds wrapper functions:

safe_pathname_start()
safe_pthname_end()

which I should be able to use to fix mkdir, symlink, mknod.

As I said, I need to do more work for symlink and link, but I think this shows how it can be done.
Comment 7 Jeremy Allison 2019-05-31 22:58:06 UTC
Oh thank goodness, we still pass TESTS=raw.mkdir and TESTS=samba3.blackbox.shadow_copy2 with this patchset :-).

At SambaXP I'll see about making sure it passes on a private gitlab or catalyst CI run.
Comment 8 Jeremy Allison 2019-05-31 23:27:08 UTC
Ah, hang on a minute. I am getting some crashes with this patch. Investigating..
Comment 9 Jeremy Allison 2019-06-01 05:04:51 UTC
Ah, OK - I think the crashes are the correct thing here.

What is happening is the python thread itself has a race where the "dest" directory sometimes doesn't exist:

    if os.path.islink(dest):
      os.unlink(dest)
    elif os.path.isdir(dest):
      shutil.rmtree(dest)

<---- Here "dest" doesn't exist for a time

    os.mkdir(dest)
    ev.set()
    os.rename(dest, fordel)

<---- Here "dest" doesn't exist for a time

    os.symlink("/tmp", dest)

Wht can happen is that we chdir() into "dest" (storing conn->cwd_fname = "/srv/data/dest") which exists, start doing the operation and then after the mkdir call succeeds call become_root()/unbecome_root() as part of the created directory post processing.

The unbecome_root() code then tries to chdir("/srv/data/dest") on popping the connection context stack (after all, we just created /srv/data/dest/shouldnotexist.timestamp, right !) and at this point "src/data/dest" doesn't exist (it's been deleted or moved etc.).

pop_conn_ctx() treats this as a panic error - see below:

       /*
         * Check if the current context did a chdir_current_service()
         * and restore the cwd_fname of the previous context
         * if needed.
         */
        if (current_user.done_chdir && ctx_p->need_chdir) {
                int ret;

                ret = vfs_ChDir(ctx_p->conn, ctx_p->conn->cwd_fname);
                if (ret != 0) {
                        DBG_ERR("vfs_ChDir() failed!\n");
                        smb_panic("vfs_ChDir() failed!\n");
                }
        }

which IMHO is a reasonable decision. If the chdir() back to where we were we fails we don't know where we are in the filesystem and can't continue to process calls.

Looks like a self-inflicted wound at this point and it's reasonable to panic.

So I think my fix is good for the mkdir case (even though it causes an smbd crash), smbd is simply not designed for this kind of simultaneous manipulation of the tree by the same user.
Comment 10 Jeremy Allison 2019-06-01 05:12:53 UTC
Yep, confirmed my suspicions by adding extra debug in the vfs_ChDir code:

[2019/05/31 22:10:21.639048,  0] ../../source3/smbd/vfs.c:811(vfs_ChDir)
  vfs_ChDir: vfs_ChDir to /srv/data/dest No such file or directory
[2019/05/31 22:10:21.639898,  0] ../../source3/smbd/uid.c:586(pop_conn_ctx)
  pop_conn_ctx: vfs_ChDir() failed!
[2019/05/31 22:10:21.639928,  0] ../../source3/lib/util.c:824(smb_panic_s3)
  PANIC (pid 5224): vfs_ChDir() failed!

As I said above, we're trying to go back to a directory that doesn't exist currently (but did when we chdir()'ed into it originally).
Comment 11 Jeremy Allison 2019-06-01 05:15:20 UTC
It's the lookup_sid code that's wanting the become_root()/unbecome_root() calls as part of the SMB_VFS_GET_NT_ACL() call to see if we should be able to access the directory to create.

0x00007ff492f0526e in smb_panic (why=0x7ff492bf5de2 "vfs_ChDir() failed!\n") at ../../lib/util/fault.c:170
#4  0x00007ff492a0d3ec in pop_conn_ctx () at ../../source3/smbd/uid.c:587
#5  0x00007ff492a0d488 in smbd_unbecome_root () at ../../source3/smbd/uid.c:624
#6  0x00007ff48f681d75 in unbecome_root () at ../../source3/lib/smbd_shim.c:115
#7  0x00007ff49051242d in xid_to_sid (psid=0x7fffdae9c430, xid=0x7fffdae9c2f0) at ../../source3/passdb/lookup_sid.c:1202
#8  0x00007ff4905126a8 in gid_to_sid (psid=0x7fffdae9c430, gid=1000) at ../../source3/passdb/lookup_sid.c:1242
#9  0x00007ff492a37bd2 in create_file_sids (psbuf=0x559c4ba2f050, powner_sid=0x7fffdae9c3e0, pgroup_sid=0x7fffdae9c430) at ../../source3/smbd/posix_acls.c:921
#10 0x00007ff492a3ea12 in posix_get_nt_acl_common (conn=0x559c4ba6eb10, name=0x559c4ba2f150 ".", sbuf=0x559c4ba2f050, pal=0x0, posix_acl=0x559c4ba77f60, 
    def_acl=0x0, security_info=4, mem_ctx=0x559c4ba777b0, ppdesc=0x7fffdae9c600) at ../../source3/smbd/posix_acls.c:3311
#11 0x00007ff492a3f5aa in posix_get_nt_acl (conn=0x559c4ba6eb10, smb_fname_in=0x559c4ba79f20, security_info=4, mem_ctx=0x559c4ba777b0, ppdesc=0x7fffdae9c600)
    at ../../source3/smbd/posix_acls.c:3555
#12 0x00007ff492ba558f in vfswrap_get_nt_acl (handle=0x559c4ba787d0, smb_fname=0x559c4ba79f20, security_info=4, mem_ctx=0x559c4ba777b0, ppdesc=0x7fffdae9c600)
    at ../../source3/modules/vfs_default.c:2881
#13 0x00007ff492a32463 in smb_vfs_call_get_nt_acl (handle=0x559c4ba787d0, smb_fname=0x559c4ba79f20, security_info=4, mem_ctx=0x559c4ba777b0, ppdesc=0x7fffdae9c600)
    at ../../source3/smbd/vfs.c:2563
#14 0x00007ff492a15b5f in check_parent_access (conn=0x559c4ba6eb10, smb_fname=0x559c4ba2f310, access_mask=4) at ../../source3/smbd/open.c:291
#15 0x00007ff492a1f124 in mkdir_internal (conn=0x559c4ba6eb10, smb_dname=0x559c4ba6c9c0, file_attributes=16) at ../../source3/smbd/open.c:4050
Comment 12 Jeremy Allison 2019-06-01 05:23:25 UTC
I'm going to leave this bug until I get to SambaXP to discuss with other Samba smbd engineers to see if they agree with my assessment of it being an acceptable risk. I think the crash is acceptable in this case, as it would happen in the current "open()" code path (which does the same thing). We really don't like our $cwd being deleted from out under us :-).
Comment 13 Jeremy Allison 2019-06-02 00:57:12 UTC
OK, I've thought of a way around the crash inside pop_conn_ctx() when called by unbecome_root(). It depends on become_root()/unbecome_root() pairs only wrapping code that saves/restores any directory changes before returning.

Which luckily I think all of them do. Still looking at the usages.
Comment 14 hansmi 2019-06-02 19:42:01 UTC
Created attachment 15213 [details]
Race reproduction script from June 2, 2019

> OK Michael, can you try to reproduce with this patch added.

Unfortunately the underlying issue doesn't appear to be fully resolved. I've tweaked my script slightly to be more robust (see attachment) and ran it for a while in a loop to account for terminated connections. The loop was easier than making the script even more robust.

---
while sleep .1; do timeout 2s ../race.py ; done
---

Result after a while:

---
# ls -lhd /tmp/shouldnotexist.*
drwxr-xr-x 2 johndoe johndoe 40 Jun  2 19:37 /tmp/shouldnotexist.1559504248.556245
drwxr-xr-x 2 johndoe johndoe 40 Jun  2 19:38 /tmp/shouldnotexist.1559504286.1945086
drwxr-xr-x 2 johndoe johndoe 40 Jun  2 19:39 /tmp/shouldnotexist.1559504368.2661018
---
Comment 15 Jeremy Allison 2019-06-03 17:05:08 UTC
Hmmm. Can't currently see how that can be happening once we've chdir()'ed into the target directory, checked it's under the share export and then do a "mkdir("./shouldnotexist.<timestamp>") only from within the chdir()'ed directory.

We are no longer doing mkdir() calls with anything but a single element path from inside the current directory.

Being inside the current $cwd should prevent a rename and symlink of the path above from having any effect.

Can you think of any way this can occur ?
Comment 16 Jeremy Allison 2019-06-03 17:07:31 UTC
I can add some test code into smbd that checks inside the mkdir syscall wrapper for the string "shouldnotexist", and if it finds it, checks for any "/" component in the given filename and calls abort() if found.

That would confirm we're never doing a multi-path mkdir call.

If we still get escapes after that, isn't it a kernel bug ?
Comment 17 hansmi 2019-06-03 21:51:09 UTC
Created attachment 15214 [details]
Minor patch to address use-after free in May 31, 2019 patch from Jeremy
Comment 18 hansmi 2019-06-03 22:16:41 UTC
> I can add some test code into smbd that checks inside the mkdir syscall wrapper for the string "shouldnotexist", and if it finds it, checks for any "/" component in the given filename and calls abort() if found.

I did just that and can confirm that mkdir is never called with "/" in its path argument. I'm still investigating. Unfortunately running smbd through strace appears to modify timing such that the race no longer occurs.
Comment 19 hansmi 2019-06-03 22:22:24 UTC
And as I was writing my previous comment the reproduction was successful:

22:16:36.567227 chdir("/srv/data/dest") = 0 <0.000038>
22:16:36.567373 stat(".", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=800, ...}) = 0 <0.000030>
22:16:36.567502 stat("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=800, ...}) = 0 <0.000038>
22:16:36.567806 mkdir("shouldnotexist.1559600196.1234279", 0755) = 0 <0.000119>

So what happens is that chdir(2) itself is redirected via the symlink. I've seen other programs essentially walking the whole path from "/" via fchdir(2), though that seems quite excessive.

Could we rely on getcwd(3) and friends to check whether chdir(2) ended up in the correct place?

Alternatively open(2) with flags containing O_DIRECTORY and O_NOFOLLOW followed by fchdir(2) to the returned file descriptor.
Comment 20 Jeremy Allison 2019-06-04 00:32:39 UTC
Don't worry I have found the race in our code. Figured it out trying to get to sleep on the plane :-), so had to buy wifi to report this.

We should not be calling the chdir() just before the mkdir(), the reason we are is a bug in unbecome_root() which is triggering a chdir() which should be skipped.

I actually have a fix for this on my laptop, but will have to wait until I'm on the ground with full connectivity again to upload.

Hang tight, I think my fix is good, just need to eliminate this additional race which I missed.
Comment 21 Jeremy Allison 2019-06-04 00:45:35 UTC
So in my initial patch I wrapped more than the VFS_MKDIR call as I wanted all the inheritance calculations to also be done on the correct parent directory.

That created the new race you found, as the inheritance calculations call a code path that uses become_root()/unbecome_root().

The initial check for being under the share path using realpath inside safe_pathname_start() is actually good, and does catch a symlink racer, but the problem is in the unbecome-root() code it can call chdir(conn->old_cwd) *again*, after the correct check. That then opens up the symlink redirect race again that we just protected ourselves against.

My new patch eliminates that chdir() in the unbecome root code, leaving the initial check as a  direct check.

I think this is the correct fix, as become_root()/unbecome_root() pairs should only be changing credentials, not directories. We only didn't notice it was doing so before as a side effect of code reuse as it was doing a (considered harmless) chdir($cwd).

As I say, I have an updated fix I'll upload once I'm on the ground. Should fix the smbd crashes we've been seeing too.
Comment 22 Jeremy Allison 2019-06-04 00:51:13 UTC
(In reply to hansmi from comment #19)

FYI O_NOFOLLOW is a horrible misnomer as it never does what you need it to :-). It only acts on the last co.ponent of the pathname, not the full path which makes it singularly useless unless you're already in the correct directory :-).

There's a reason I had to write my own symlink following code when I fixed this bug for the open() case :-).
Comment 23 Jeremy Allison 2019-06-04 01:17:06 UTC
Created attachment 15215 [details]
Updated git-am fix for master

Through the magic of usb-on-the-go I got access to my updated patch on the plane. Michael, can you try this one ? I think it should fix the second race you found.
Comment 24 Jeremy Allison 2019-06-04 01:23:15 UTC
ACH. Never mind. Note to self, don't try and rush things on the plane even if you can't sleep. That fix would lose the correct value of current_user.need_chdir in the unbecome_root() call. I'm withdrawing it.

I'll fix it properly when I'm on the ground again. :-). Sorry.
Comment 25 Andrew Bartlett 2019-06-04 01:28:21 UTC
(In reply to Jeremy Allison from comment #24)
I'm pretty sure they call that going "above and beyond" the call of duty ;-)

Thanks for all your work chasing this.
Comment 26 Jeremy Allison 2019-06-04 01:29:07 UTC
Comment on attachment 15215 [details]
Updated git-am fix for master

Withdrawing. Will post correct fix soon.
Comment 27 Jeremy Allison 2019-06-04 01:30:48 UTC
(In reply to Andrew Bartlett from comment #25)

No worries Andrew, this bug has been costing me sleep since it came in. It's only Google money for the in-flight wifi :-).
Comment 28 Jeremy Allison 2019-06-04 01:31:07 UTC
(In reply to Andrew Bartlett from comment #25)

No worries Andrew, this bug has been costing me sleep since it came in. It's only Google money for the in-flight wifi :-).
Comment 29 Jeremy Allison 2019-06-04 03:14:42 UTC
Created attachment 15217 [details]
git-am fix for master v3

Ok, so this version includes the fix for the chdir() race in unbecome_root() as patch #1, the safe_pathname_start()/_end() fix as patch #2 ( including Michael's use-after free fix, thanks) and patch #3 is a fix for the initial pathname walk returning the incomprehensible NT_STATUS_NOT_SUPPORTED from the I itial pathname walk (missing call to nt_status_to_unix() call in OpenDir_safely() when the check_Path() function caught the symlink racer - should be NT_STATUS_ACCESS_DENIED).

In a future version of this patch set I'll fix OpenDir_safely() to call the safe_pathname_start()/_end functions instead of leaving the same logic in two places. Should make it easier to review and decide it's safe.

I've been bashing this on my laptop for an hour or so and it seems to fix the symlink race escapes to /tmp for me.

Michael if you could look this over and test I'd be grateful ! Once you've confirmed the approach is sound I'll fix minor, symlink, rename and link to use the same common code.

Thanks for your patience !
Comment 30 hansmi 2019-06-04 21:21:53 UTC
(In reply to Jeremy Allison from comment #22)
Thank you for reminding me of the pitfalls of O_NOFOLLOW.

(In reply to Jeremy Allison from comment #29)
Good news: No escapes to be found after running my reproduction script against the patch from attachment 15217 [details] for an extended period of time.

Thanks a lot for your work!
Comment 31 hansmi 2019-06-04 23:26:54 UTC
(In reply to Jeremy Allison from comment #1)
> Rename will suffer from the same problem for the destination
> path specifier.

After reading a bit more code I suspect all or most cases where more than a mere filename is passed to system calls is susceptible, e.g.:

* SMB_VFS_CHMOD used by file_set_dosmode calls chmod(smb_fname->base_name, mode).
* Most xattr functions, also used by file_set_dosmode, are similar in that they don't operate on file descriptors.

Do you want me to attempt a proof-of-concept script for these?

vfs_chown_fsp could probably benefit from safe_pathname_start()/safe_pathname_end().
Comment 32 Jeremy Allison 2019-06-05 07:42:26 UTC
Yes please. Keep them coming. We'll have to fix as many as we can find without breaking the VFS ABI.

It's possible there may be some unfixable meta-data ones but let's cross that bridge when we come to it.
Comment 33 Jeremy Allison 2019-06-05 15:26:27 UTC
The VFS ABI doesn't currently allow getting/setting xattrs on symlinks as I think Solaris and FreeBSD do not allow xattrs on symlinks, so we only have VFS_SETXATTR/VFS_GETXATTR - there are *no* syscalls getxattrat()/setxattrat().

We may not be able to fully harden the XATTR calls.
Comment 34 Jeremy Allison 2019-06-06 13:59:14 UTC
OK, I think I have a plan to proceed. I'm planning to remove all use of VFS_CHMOD() calls from Samba, due to it being unsafe to use on Linux (fchmod requires a real file descriptor, which would break the kernel oplocks) by restructuring some code to insist on file handles instead of pathnames.

Also remove the set_acl_XXXX() calls and use fset_acl_XXX() calls instead as we can insist that opening a handle for WRITE_DAC|READ_CONTROL|WRITE_OWNER open a real fd underneath (these access masks already break oplocks, so it's safe for kernel oplocks).

Secondly I think we can change all the get/set/listxattr calls with their lget/lset/llist alternatives *below* the VFS layer (at the syscall layer). The reason for this is that we already don't work on Windows calls made against symlink last-component targets - as we never call VFS_CHOWN, we only *ever* call VFS_LCHOWN which doesn't follow symlinks on last component.

Given that, I think I can start to bleed fixes into the master tree as 'normal' bugfixes to do preparatory work for the security fix.

Eventually, get to the point where I can add in the safe_pathname_start()/safe_pathname_end() calls to wrap the multiple-component paths and really fix this.

This is going to take a while I think, so I'll start to send in the first preparatory patches in the next few weeks, and keep building towards a complete fix.
Comment 35 Jeremy Allison 2019-06-13 16:56:56 UTC
I think I'm going to log bugs for the prerequisite patches, and make this bug a dependency on them. That way I can ensure the preparatory work gets into the supported release branches and I can keep a list of everything that I needed to change to get to a complete fix.

I still have an issue with acl_set_file() having no symlink-safe variant though for setting default directory POSIX ACLs. Might need to raise that with Red Hat security.
Comment 36 Jeremy Allison 2019-06-13 16:57:37 UTC
Adding metze to the CC: list as he's going to have to review some of the preparatory patches/bugs.
Comment 37 Jeremy Allison 2019-06-17 16:21:59 UTC
Created bug:

https://bugzilla.samba.org/show_bug.cgi?id=14000

to track the unbecome_root() part of the fix. Note this doesn't reference this bug as a dependency as I don't want to call attention to this until all my ducks are in a row.

I think I can get the prerequisite patches added as back-ported code cleanups first, then add the real fix on top (which should then be smaller).
Comment 38 Jeremy Allison 2019-06-18 17:08:57 UTC
Metze - had an epiphany in the shower this morning (where I have all my best ideas).

Your instinct is correct, vfs_Chdir() must *always* call check_reduced_name() inside it to ensure we're always within the share.

I'll investigate further and prepare preliminary patches.
Comment 39 Jeremy Allison 2019-06-19 22:13:37 UTC
Merge request:

https://gitlab.com/samba-team/samba/merge_requests/558

is also a prerequisite for the fixes here. It moves the POSIX ACL code to handle-based, not path based calls (except for DEFAULT_ACL on directories, but I have a fix for that to come later).
Comment 40 Jeremy Allison 2019-06-20 22:54:05 UTC
In case I forget, I also need to wrap ntimes()
Comment 41 Jeremy Allison 2019-06-20 22:58:51 UTC
And chflags()
Comment 42 hansmi 2019-06-26 00:06:47 UTC
> * SMB_VFS_CHMOD used by file_set_dosmode calls chmod(smb_fname->base_name, mode).

I spent some time working on this code. The share escape as with mkdir is likely possible (not tested), but I thought I'd found a way to call chmod as root via "source3/smbd/dosmode.c:file_set_dosmode". Fortunately it doesn't appear to be the case as it operates on a file descriptor, or is supposed to at least. My research led to the discovery of bug 14013.
Comment 43 hansmi 2019-06-26 00:27:47 UTC
> * Most xattr functions, also used by file_set_dosmode, are similar in that they don't operate on file descriptors.

There's not even a need to produce a proof-of-concept script for this one. The system calls are sufficient to confirm ("l" could be a symlink by the time of the call):

---
getxattr("l/dummy", "user.foo", "bar", 256) = 3
setxattr("l/dummy", "user.foo", "bar", 3, 0) = 0
---
Comment 44 Jeremy Allison 2019-06-26 00:30:06 UTC
Yes, I'm well aware of all these unfortunately. I'm looking into the best ways to fix this at the moment.
Comment 45 Jeremy Allison 2019-06-26 00:35:28 UTC
Read comment:

https://bugzilla.samba.org/show_bug.cgi?id=13979#c34

for my current evaluation and plans. This may end up being too complicated, and we may have to break the VFS ABI and go directly to a chroot("/root/of/exported/share") solution if it's too hard to fix individually.

That has other complications of course, but may end up being preferred in the long run.
Comment 46 hansmi 2019-06-26 18:55:05 UTC
> Read comment: https://bugzilla.samba.org/show_bug.cgi?id=13979#c34

Sorry, must've overlooked that one. I didn't consider locks at all in my
thinking and they indeed make everything more challenging. I'll have to
leave this to you and the other experts.

> This may end up being too complicated, and we may have to break the VFS
> ABI and go directly to a chroot("/root/of/exported/share") solution if
> it's too hard to fix individually.

Wouldn't that prevent the parallel use of multiple shares?
Comment 47 Jeremy Allison 2019-06-26 19:20:05 UTC
(In reply to hansmi from comment #46)

> Wouldn't that prevent the parallel use of multiple shares?

Only for meta-data ops which we don't currently do in parallel anyway. It doesn't affect the parallel work we already do for preadv, pwritev in a thread-pool.
Comment 48 Jeremy Allison 2019-06-26 23:25:38 UTC
Recording the current evaluation state to allow any vendor security professionals to be able to comment.

----------------------------------------------------------------------

My current estimates are : 1-2 weeks
(it's 4th of July week in the USA) to
determine the correct technical solution
to the problem. The current 3 candidates are:

1). chroot() - which my gut feeling won't
work for us but I'm getting people who
are smarter than I am to evaluate for
suitability.

2). Wrap incoming SMB1/2/3 calls with parse/chdir
to path-last component and only operate on
last component throughout the internal code.
(this requires large logic code changes inside
the fileserver).

3). Wrap individual VFS pathname based calls
with the safe_chdir() mechanism I invented for
the mkdir point fix. This means less logic code
changes in the main server, but has a larger
performance penalty.
Comment 49 Jeremy Allison 2019-06-26 23:26:23 UTC
Note the previous comment gives the time for evaluation *only*. Time to code up the complete solution will of course be much longer.
Comment 50 Ralph Böhme 2019-06-29 13:40:16 UTC
As discussed off-the-record, a candidate 4 to fix the problem looks like this:

* Add [*at variants of all relevant (syscall wrapping) path-based VFS  
  functions to the VFS, eg fstatat() for stat(), renameat() for rename() asf.

* ...thereby breaking the VFS ABI

* For VFS functions where there's no *at replacement, do the following:

    unrootfd = open("/", ...); // likely cached
    chdir(parent_dir);
    check_were_still_in_share();
    chroot(".");
    syscall();
    fchdir(unrootfd);

  As an alternative, chroot() to the share root may be enough, not sure.

* make sure handle based VFS functions are used where available, looking at  
  you file_set_dosmode() ...

* Finally remove remaining path-based VFS functions
Comment 51 Jeremy Allison 2019-07-02 16:19:33 UTC
I had a follow-up thought about the solution
we designed last night, I would appreciate some
comments on the problem.

Essentially the design we came up with was
SMB_VFS_CREATE_FILE() saving us by returning
an fsp for the containing directory for any
pathname that we then pass to all existing pathname-based
calls such as ret = LSTATAT(fsp, smb_fname);

However, SMB_VFS_CREATE_FILE() itself can't take
an fsp for the containing directory, as we then have
a chicken and egg problem - it still has to operate
on paths relative to the current $cwd.

So the pathname calls we use in the codepaths called
from inside SMB_VFS_CREATE_FILE() can't take an fsp, as
we don't have one yet. This isn't a problem with the
code inside there to open the file descriptor, as
it's written to be symlink safe.

So we have to SMB_VFS_CHDIR() (which will be a
remaining pathname based call with no fsp) to
the containing directory, and then use
the fd_open wrapper call (again, not taking an fsp
as it's symlink safe) in order to get a fd on the
"." directory, then fake up an fsp pointer to pass
into the pathname calls.

The POSIX syscalls fix this by having a special
AT_CWD parameter to be used meaning current directory
when an application has no file descriptor to be
relative to.

Should we create the fake-fsp for "." inside
SMB_VFS_CREATE_FILE(), or pass in a special
pointer value (possibly not NULL so we can
better detect programming errors) that is
the equivalent of FSP_AT_CWD ?

This adding complexity which I was hoping to
avoid, but I don't see any other way around
the chicken-and-egg issue with SMB_VFS_CREATE_FILE(),
can you all ?

Jeremy.
Comment 52 Jeremy Allison 2019-07-03 16:17:14 UTC
More dumps from the current discussion. Maybe we just need to move the discussion onto this bugid.

---------------------------------------------------------
If we go the full track, what we would need to do is to create a fake
fsp for the share root and from there step-by-step open the
subdirectories. The share root fsp could stay open, and I would not be
too worried about a share mode entry for that.

Volker
Comment 53 Jeremy Allison 2019-07-03 16:17:34 UTC
> If we go the full track, what we would need to do is to create a fake
> fsp for the share root and from there step-by-step open the
> subdirectories. The share root fsp could stay open, and I would not be
> too worried about a share mode entry for that.

That means walking the pathname component by
component every time, which I was hoping to
avoid. If we have to, we have to though.

It would mean doing the inital stat-cache lookup
given the client passed-in pathname, to try and
avoid any directory searches, then starting from
the fake fsp share root open directories step by
step down the path, no enumerating them unless
an open on the next path component fails, returning
the last parent directory handle fsp as well as the
last component.

Means a rewrite of unix_convert though, but
I'm guessing that's going to be on the cards
whatever we chose :-(.

We could cache the open directory handles
on the connection struct for the share. We'd
have to have some heuristic for closing them
though, otherwise we're going to drown in
opened fsp handles. As most filesystem access
tends to be localized in a single directory
(except for special cases like tree traveral
for backup or ACL modificaion) that shouldn't
be so bad.

That would mean SMB_VFS_CREATE_FILE() takes
a parent_fsp pointer, always. We then pass
in only the last component name to SMB_VFS_CREATE_FILE(),
relative to parent_fsp. To keep the logic around
fsp->fsp_name untouched we'd then need
to store the combination of:

parent_fsp->fsp_name + last_component
name as the fsp_name of the being created
fsp.

Does that work for you ?

Jeremy.
Comment 54 Jeremy Allison 2019-07-03 16:17:53 UTC
> That means walking the pathname component by
> component every time, which I was hoping to
> avoid. If we have to, we have to though.

If we're looking for a clean solution, that might be the way to go.
For performance, we should do caching of course. Regarding your worry
about security: We have something similar already with the vuid cache
taking care of something like the force user and group parameters. It
is if course user+share specific. Not sure how this works with changed
permissions over time.

> It would mean doing the inital stat-cache lookup
> given the client passed-in pathname, to try and
> avoid any directory searches, then starting from
> the fake fsp share root open directories step by
> step down the path, no enumerating them unless
> an open on the next path component fails, returning
> the last parent directory handle fsp as well as the
> last component.

Let me brainstorm about the stat cache: Result from a stat cache
lookup should be a directory fsp (in case we have it cached) plus a
case-normalized name? If either fails, we do the top-down step by step
search, every time returning a directory fsp, right?

> Means a rewrite of unix_convert though, but
> I'm guessing that's going to be on the cards
> whatever we chose :-(.
>
> We could cache the open directory handles
> on the connection struct for the share. We'd
> have to have some heuristic for closing them
> though, otherwise we're going to drown in
> opened fsp handles. As most filesystem access
> tends to be localized in a single directory
> (except for special cases like tree traveral
> for backup or ACL modificaion) that shouldn't
> be so bad.

For the caching heuristic: We might want to keep the full paths to all
concurrently open files open. At least the directories for all open
files, right?

Volker
Comment 55 Jeremy Allison 2019-07-03 16:18:11 UTC
> For performance, we should do caching of course. Regarding your worry
> about security: We have something similar already with the vuid cache
> taking care of something like the force user and group parameters. It
> is if course user+share specific. Not sure how this works with changed
> permissions over time.

I think the fsp struct contains the vuid of the opener,
so we can cache generically and just check it matches
I think.

> > It would mean doing the inital stat-cache lookup
> > given the client passed-in pathname, to try and
> > avoid any directory searches, then starting from
> > the fake fsp share root open directories step by
> > step down the path, no enumerating them unless
> > an open on the next path component fails, returning
> > the last parent directory handle fsp as well as the
> > last component.
>
> Let me brainstorm about the stat cache: Result from a stat cache
> lookup should be a directory fsp (in case we have it cached) plus a
> case-normalized name? If either fails, we do the top-down step by step
> search, every time returning a directory fsp, right?

Oh, that's a nice idea ! I like that a lot.

> For the caching heuristic: We might want to keep the full paths to all
> concurrently open files open. At least the directories for all open
> files, right?

Yes, that makes sense.

Jeremy.
Comment 56 Jeremy Allison 2019-07-03 16:18:26 UTC
> If we go the full track, what we would need to do is to create a fake
> fsp for the share root and from there step-by-step open the
> subdirectories. The share root fsp could stay open, and I would not be
> too worried about a share mode entry for that.

Thinking a little more, to create a fake fsp for
the share root we still need to be able to do:

int fd = SMB_VFS_OPENAT(conn, SMB_AT_CWD, smb_fname, mode);

inside the 'create_fake_fsp' code.

- the SMB_AT_CWD removes the chicken and egg problem
when creating the fake_share_root_fsp->fh->fd value.

But that would then be the only place that needs
this special value.

Jeremy.
Comment 57 Jeremy Allison 2019-07-03 16:18:54 UTC
> So we have to SMB_VFS_CHDIR() (which will be a
> remaining pathname based call with no fsp) to
> the containing directory, and then use
> the fd_open wrapper call (again, not taking an fsp
> as it's symlink safe) in order to get a fd on the
> "." directory, then fake up an fsp pointer to pass
> into the pathname calls.

I think SMB_VFS_CHDIR() should go and we'd only have SMB_VFS_FCHDIR().

SMB_VFS_CONNECT() could return an fsp to the share root.

> The POSIX syscalls fix this by having a special
> AT_CWD parameter to be used meaning current directory
> when an application has no file descriptor to be
> relative to.
>
> Should we create the fake-fsp for "." inside
> SMB_VFS_CREATE_FILE(), or pass in a special
> pointer value (possibly not NULL so we can
> better detect programming errors) that is
> the equivalent of FSP_AT_CWD ?
>
> This adding complexity which I was hoping to
> avoid, but I don't see any other way around
> the chicken-and-egg issue with SMB_VFS_CREATE_FILE(),
> can you all ?

I also think we should have some kind of a lightweight fsp
that won't be stored in locking tdb. that would just represent
a path component.

Then we could split the VFS related stuff of unix_convert()
into a SMB_VFS_RESOLVE_PATH() function that.

I just realized that we can't use chroot if we want to
keep symlinks within the share to work. As the symlinks
may point to something not intended if '/' is something different.

What are the problematic syscalls, where we need to simulate
f*() or *at(NO_FOLLOW) even on Linux?

metze
Comment 58 Jeremy Allison 2019-07-03 16:20:13 UTC
One more thought whilst we hash this out.

If Volker is right, and even if the security
bug becomes known we can do a limited 'disable symlinks'
patch until we rewrite the pathname code in the
fileserver, then we'd be in better shape if
the next supported release (4.11.0) disabled
SMB1 by default.

Metze and I discussed this a little while ago.

What would it take to get this done in the
meantime ? I'm guessing it's mostly changing
the test environment setups to explicitly
enable protocol rev NT and below and then
changing the default server min protocol
to be SMB2.

Is this something one of you guys can do
whilst the complete solution gets worked
on ?

Jeremy.
Comment 59 Jeremy Allison 2019-07-03 16:41:57 UTC
Metze wrote:

> I also think we should have some kind of a lightweight fsp
> that won't be stored in locking tdb. that would just represent
> a path component.

Currently we use INTERNAL_OPEN_ONLY in oplock request to signify an open that won't break oplocks.

From source3/include/smb.h we have:

561 /*
562  * Bits we test with.
563  * Note these must fit into 16-bits.
564  */
565 
566 #define NO_OPLOCK                       OPLOCK_NONE
567 #define EXCLUSIVE_OPLOCK                OPLOCK_EXCLUSIVE
568 #define BATCH_OPLOCK                    OPLOCK_BATCH
569 #define LEVEL_II_OPLOCK                 OPLOCK_LEVEL_II
570 #define LEASE_OPLOCK                    0x100
571 
572 /* The following are Samba-private. */
573 #define INTERNAL_OPEN_ONLY              0x8
574 /* #define FAKE_LEVEL_II_OPLOCK         0x10 */   /* Not used anymore */
575                                 /* Client requested no_oplock, but we have to
576                                  * inform potential level2 holders on
577                                  * write. */
578 /* #define DEFERRED_OPEN_ENTRY          0x20 */   /* Not used anymore */
579 /* #define UNUSED_SHARE_MODE_ENTRY      0x40 */   /* Not used anymore */
580 /* #define FORCE_OPLOCK_BREAK_TO_NONE   0x80 */   /* Not used anymore */
581 
582 /* None of the following should ever appear in fsp->oplock_request. */
583 #define SAMBA_PRIVATE_OPLOCK_MASK (INTERNAL_OPEN_ONLY)
584 
585 #define EXCLUSIVE_OPLOCK_TYPE(lck) ((lck) & ((unsigned int)EXCLUSIVE_OPLOCK|(unsigned int)BATCH_OPLOCK))
586 #define BATCH_OPLOCK_TYPE(lck) ((lck) & (unsigned int)BATCH_OPLOCK)
587 #define LEVEL_II_OPLOCK_TYPE(lck) ((lck) & (unsigned int)LEVEL_II_OPLOCK)

As we have 16-bits here we could always add a PATH_RESOLVE_OPEN flag as 0x200
and change the open code to not create entries in locking.tdb.

Jeremy.
Comment 60 Jeremy Allison 2019-07-08 16:02:05 UTC
Damn. In order to move all directory handling to handle-based, ceph needs to have a working ceph_fdopendir(). Gluster already has this, but ceph does not.

Pinging David Disseldorp to try and get this added.

Jeremy.
Comment 61 David Disseldorp 2019-07-08 17:36:04 UTC
(In reply to Jeremy Allison from comment #60)
> Damn. In order to move all directory handling to handle-based, ceph needs to
> have a working ceph_fdopendir(). Gluster already has this, but ceph does not.
> 
> Pinging David Disseldorp to try and get this added.

Thanks for the ping. I'm also adding Jeff Layton to the CC list, who's the Linux CephFS maintainer.
Regarding the lack of ceph_fdopendir(). We may be able to instead use the "low-level" cephFS API to pin parent directories before child creation. Both ceph_ll_create() and ceph_ll_mkdir() accept a parent inode pointer alongside the child name.
The ceph_ll_walk() AT_SYMLINK_NOFOLLOW flag looks like it also might be helpful, but there's next to no documentation for these low-level functions, so I'll need some time to take a closer look.
Comment 62 Jeremy Allison 2019-07-08 17:43:54 UTC
> Regarding the lack of ceph_fdopendir(). We may be able to instead use the "low-
> level" cephFS API to pin parent directories before child creation.

Unfortunately this isn't just about child creation. We need to fix all pathname useages in the fileserver to be symlink race safe.

I've already changed the open file code to always get a real file descriptor when opening a directory (via the symlink-race safe fd_open(path, O_DIRECTORY) - in my next set of changes I wanted to remove the calls to open_dir_safely() -> OpenDir_internal() inside of source3/smbd/dir.c and leave us with only the file-descriptor versions of reading a directory. open_dir_safely() is symlink-race safe but is costly and has to do vfs_ChDir() calls and adds a lot of complexity to the directory handling.

If you can help me move this inside of the vfs_ceph.c module until ceph gets a working ceph_fdopendir() call then I can simplify our directory handling code *significantly* (I'm already planning to remove the SMB1 reply_search() caching of partially open directory handles insanity). That might make it possible for others to understand this code :-).
Comment 63 Jeremy Allison 2019-07-16 15:42:57 UTC
I had an idea about this bug. I'd like some feedback.

This is going to be a long slog, re-writing the pathname processing in the server. I've nearly finished a patchset moving all directory enumeration to be handle based rather than path based, which is a prerequisite, but there's lots more to be done.

However, we can split this bug into two components.

1). mkdir escape - this is the most important error as it can allow object creation outside of the share definition. The fix for this is similar to the fixes I did for the open escape bug, CVE-2017-2619.

https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-2619

2). meta-data escape. This is long term and insidious as it affects all read/write of object meta-data. We've stared into the abyss on this one, and we can't look away.

We already have a patch for part #1, and we're about to do a security release for a share escape bug:

https://bugzilla.samba.org/show_bug.cgi?id=14035

What do people think about splitting off part #1, asking for another CVE for it and shipping it in conjunction with the bugfix for CVE-2019-10197 (Samba bugid #14035) ? It's slightly risky as any attention in this area may put a spotlight on things we can't yet fix, but if we bundle this with the other CVE that is less likely.

We will still work on part #2 as ongoing fixes - indeed that's my 100% project work right now.

Michael, you are the reporter - you would be listed as discoverer for the first fix, as well as part #2 when we get to fix it.

Any comments ?

Jeremy.
Comment 64 hansmi 2019-07-16 21:34:12 UTC
(In reply to Jeremy Allison from comment #63)
I think splitting the security issues is sensible.
Comment 65 Jeremy Allison 2019-07-17 22:39:49 UTC
I'm finally making some progress on this. I have a patchset that moves all SMB1/2/3 client requested directory handling to file-based calls on a open directory fsp.

It doesn't fix any of the pathname issues relative to that directory handle, so it's safe to add to master, but it does give us an fsp we can use to do relative lookups once the VFS updated.

I kind of needed to do this before deciding if the SMB_VFS_XXXAT(fsp, smb_filename, XXX) approach was going to work.

Hopefully should get through github review soon.
Comment 66 Jeremy Allison 2019-07-18 23:30:00 UTC
(In reply to hansmi from comment #64)

> I think splitting the security issues is sensible.

FYI, we're still discussing this. We won't ship a fix for:

https://bugzilla.samba.org/show_bug.cgi?id=14035

until after I get back from vacation on July 29th so I'm planning to pick this up again after I return.
Comment 67 Jeremy Allison 2019-08-08 23:05:20 UTC
Making progress on this (finally). First handle-based VFS changes should go in shortly.
Comment 68 Jeremy Allison 2019-08-14 17:28:01 UTC
From an email thread on this topic:

--------------------------------------------------------------
We need to rewrite the fileserver
to by arbitrary symlink race change safe on all pathname
operations. This is too large to do in private, so
I'm doing this in public under the guise of "modernising
the VFS to use handle-based operations" (without saying
explicitly *why* I'm doing so).

Table stakes here is to move all our VFS operations
to the XXXXAT() style of interface, so that's what
I'm trying to do as rapidly as possible.

What I need right now is help and review on these
code changes (which are explicitly *not* changing
our current pathname logic, just migrating the
interfaces from things like : SMB_VFS_RENAME -> SMB_VFS_RENAMEAT
whilst using the magic SMB_VFS_AT_FDCWD value as
the srcfsp or dstfsp pointer value to mean the same
as the syscall AT_FDCWD value which means "do this in the
current directory".

Once all our VFS operations are updated to use
this approach, then we have the ability to actually
go on and fix the problem.

But make no mistakes, fixing it is going to be
really hard as there is so much existing code
in existing VFS modules like shadow_copy2,
fruit, virusscanner, recycle, streams_depot
etc. that expects incoming pathnames to be
$CWD + smb_fname_share_relative_path.

Changing this to work and keep all our tests
working to cope with :

cwd_fsp + smb_fname_last_component_path

is going to be really difficult. As smb_fname
already has a value for:

        char *original_lcomp;

I'm thinking of maybe keeping the
existing smb_fname_share_relative_path
in:

        char *base_name;
        char *stream_name;

and using original_lcomp as the
last component relative pointer
to the component in "base_name"
after any last '/' character
in order to avoid all the horrendous
extra string manipulations we'll
need.

But without SMB_VFS_XXXXXAT() we're
not even in the place where we can
start on that yet.

So right now the help I really want
is doing the boilerplate VFS code
updates (or even just rapid code
review, I know what needs doing and
can crank out this code myself if
needed) to get us to SMB_VFS_XXXXXAT().
--------------------------------------------------------------
Comment 69 Jeremy Allison 2019-09-04 00:07:03 UTC
Just an update - migration of the VFS in master to XXXat() calls is ongoing. Once this is done we can start on the security fixes.
Comment 70 hansmi 2019-09-04 17:25:50 UTC
(In reply to Jeremy Allison from comment #69)
Thank you very much for your work!
Comment 71 Jeremy Allison 2019-09-04 17:31:47 UTC
(In reply to hansmi from comment #70)

Unfortunately this is only beginning. Once the XXXAT() migration is finished then I need to rewrite the core fileserver logic to remove all possible symlink races. No end in sight yet :-(.
Comment 72 Jeremy Allison 2019-10-11 21:11:41 UTC
Making slow but steady progress on the VFS rewrite. SMB_VFS_CHOWN (pathname based) removal is the next one awaiting review.

I have one design decision I'd like to make though.

Currently we have a parameter "widelinks = [yes|no]" which we use to allow symlinks to escape from a share definition or not.

Yes I know it's insane, but there are existing customers out there using this, and I'd like to break as little as possible going forward.

What I'd like to do is remove the "widelinks =" parameter and replace it with a non-stackable VFS module vfs_widelinks.

This would then remove the AT_SYMLINK_NOFOLLOW flags on any passed in SMB_VFS_XXXAT() calls which will implicitly make symlinks become invisible to the upper layer code.

I'd need to also change the SMB_VFS_REALPATH call to always return the absolute path created by appending the share definition path with the path passed in to realpath (again making it appear that symlinks don't exist and are always followed).

If I can confine this to one VFS module, then I can change the rest of the server to never follow  symlinks outside the share without having to add lots of 'unless widelinks = yes' logic all over the place (look at the existing code inside the fd_open() code inside source3/smbd/open.c for an example of the what effect this has).

Essentially I can then ignore the little-used widelinks option whilst still allowing people who depend on this to add a VFS module that closely (probably not 100% identically) reproduces the old behavior.

This will simplify the make all pathnames safe logic immensely.

I'd need to (soon, before next release) mark the 'widelinks' parameter as deprecated and soon for removal if everyone agrees with this.

Thoughts ?
Comment 73 Andrew Bartlett 2019-10-12 07:04:16 UTC
(In reply to Jeremy Allison from comment #72)

Why not keep the parameter, but have it simply control the behaviour of the widelinks VFS module? 

That way the number of sites impacted is much less, most will have a default vfs objects and widelinks be in it.

What I propose is that the parameter remain, but would control if that module did anything (or instead, did nothing). 

We would still have a upgrade challenge for sites with a vfs objects value set (so the new default wouldn't help), but the combination of those who also had wide links set will be far fewer.
Comment 74 Jeremy Allison 2019-10-13 02:11:32 UTC
> Why not keep the parameter, but have it simply control the behaviour of the
> widelinks VFS module? 

Because I'm not suggesting that the widelinks vfs module be loaded by default, that would be a very bad idea. The widelinks module I'm proposing will touch the core functionality inside fstatat, realpath and readdirplus which I really don't want in the 'normal' VFS code paths.

So when I move the widelinks behavior into the module and out of the core smbd code (which is the only sane way to fix this thing) then having the parameter as well is an extra switch controling exactly what loading the module already does.

Allowing wide links is a very strange behavior, I wish we'd never allowed it.

The number of sites that use/need wide links IMHO is small enough that on upgrade asking them to add a vfs objects entry to shares that require it instead seems to be a small mandatory change to insist on.

If we leave the parameter alone, then the sites that used it won't understand why the functionality stops working when the parameter is set because they didn't notice they also need to add 'vfs objects = widelinks'.

Does that help explain my reasoning ?
Comment 75 Ralph Böhme 2019-10-13 06:31:04 UTC
(In reply to Jeremy Allison from comment #72)
I like the idea of moving the behaviour to the VFS, but I'd prefer to not add a VFS module and if possible do it in vfs.c with a macro

# define SMB_VFS_AT_FLAGS(handle, flags) \
    if (handle == handle->conn->vfs_handles) { \
Comment 76 Andrew Bartlett 2019-10-13 06:32:39 UTC
(In reply to Jeremy Allison from comment #74)
I understand the reasoning, I just disagree as to the breaking change, because I think there is a way to keep existing configurations working.

So, I think that we should try to is either have the 'vfs defaults' change based on 'wide links' or have the default static but have the widelinks module be a no-op when 'wide links = no'.

This is just a matter of how the module comes to be configured, not the core design you are trying to advance. 

This way we:

 * avoid needing to deprecate the parameter (and all the notice stuff that goes with that)
 * avoid breaking sites that only upgrade Samba when the whole OS is upgraded (and never read our see our release notes)

Is this clearer?
Comment 77 Ralph Böhme 2019-10-13 06:35:24 UTC
Drat, somehow hit save to early... so:

# define SMB_VFS_AT_FLAGS(handle, flags) \
    if (handle == handle->conn->vfs_handles) { \
        if (lp_widelinks(handle->conn)) { \
            flags &= ~AT_SYMLINKS_NO_FOLLOW; \
        } \
    }

Then every smb_vfs_call_* would call the before VFS_FIND().

The only problem I see is that someone forgets to add a call to SMB_VFS_AT_FLAGS() when adding new VFS functions.
Comment 78 Ralph Böhme 2019-10-13 06:39:55 UTC
(In reply to Andrew Bartlett from comment #76)
I guess if we have to make it a module, we can force that it will always be the first in the stack.

Then the behaviour of the module would be controlled by lp_widelinks().

I'd like to avoid adding a module for this if possible, we already have too many to maintain...
Comment 79 Jeremy Allison 2019-10-13 16:57:07 UTC
(In reply to Ralph Böhme from comment #77)

Adding this logic:

# define SMB_VFS_AT_FLAGS(handle, flags) \
    if (handle == handle->conn->vfs_handles) { \
        if (lp_widelinks(handle->conn)) { \
            flags &= ~AT_SYMLINKS_NO_FOLLOW; \
        } \
    }

has to be done as part of *every* VFS call that takes a flags parameter. There are a lot of them. I think this is a bad idea and adds extra complexity, not to mention the extra code / time added for these tests and if statements on every VFS call.

It's extra complexity we simply don't need if we make this an optional module. Complexity is the enemy of security.

That way the default is complete symlink safety.

1). All smbd logic is written such that smbd refuses to allow access outside the share. I can't stress this enough being the correct design in terms of long-term maintainability. Compromise here is not a good idea.

2). For people who want insecurity, all of the logic that allows insecurity is concentrated in one optionally loaded module which causes the upper layer smbd to be blind to the existence of underlying symlinks, vfs_widelinks.

This is (IMHO) much clearer and simpler to code and to maintain in the long run. It isolates the logic on one place, not smearing it out all over the core smbd code.

Backwards parameter compatibility for insane parameters like this will kill us for maintainability and code simplicity in the long run.

We are already changing a lot for this fix, insisting on a new module being added only for people who want insecurity is not IMHO unreasonable.

Adding this module as part of the default stack I also disagree with, again due to the extra code being added into every VFS call.

The ability to compromise security like this isn't something that should be built into the default code path, it really needs to be an optional loaded module.

I really don't want to code this up the way you both have suggested, I think it's a bad idea for maintainability.

It isn't worth it for the number of people affected by this change.

This decision doesn't have to be made now, as the VFS code changes are still going in. But once they are done and the shape of the new VFS is done, at that point we have to make the changes to fix the symlink escapes.

I disagree with keeping the widelinks parameter and logic inside smbd so much I'm unwilling to do further work on this bug if I'm required to maintain it. It's simply costing me too much stress in coming up with a working design. Removing widelinks from the core code paths allowed me to see a way forward to a complete and clean fix in core smbd. It really does have to be in a vfs module simply to isolate the problem. If keeping it is mandatory then others will have to do this work, sorry.
Comment 80 Jeremy Allison 2019-10-13 17:12:51 UTC
One more wrinkle. Causing smbd to not see symlinks will break MS-DFS redirects as they depend on exposing symlinks to the code in source3/smbd/msdfs.c.

One more reason not to have this available in the default codepath, as many more sites depend on msdfs than on widelinks = yes.

Long term we need to migrate ms-dfs link storage to EA's (the same way Micosoft does it).

symlinks really are a cancer inside filesystem design.
Comment 81 Jeremy Allison 2019-10-13 17:34:12 UTC
(In reply to Jeremy Allison from comment #80)

As we're making big VFS changes, we should probably also add 2 new VFS calls:

NTSTATUS SMB_VFS_MSDFS_READLINKAT(struct vfs_handle_struct *handle,
                                  struct files_struct *dirfsp,
                                  const struct smb_filename *smb_fname,
                                  char *buf,
                                  size_t bufsiz);
NTSTATUS SMB_VFS_MSDFS_CREATELINKAT(struct vfs_handle_struct *handle,
                                  struct files_struct *dirfsp,
                                  const struct smb_filename *smb_fname,
                                  char *path);

SMB_VFS_MSDFS_READLINKAT returns NT_STATUS_OBJECT_PATH_NOT_FOUND if called on a non-DFS path (i.e. not a symlink). Then way we move the existing logic that stores dfs paths in symlinks into the underlying filesystem code.

This still won't fix the widelinks MS-DFS case, as for that unix_convert() depends upon returning a SMB_STRUCT_STAT_EX struct with S_ISLNK set in order to the MS-DFS detection logic to kick in later.

Another reason why losing widelinks logic from the main code is a good idea. There's no way I can see to maintain widelinks logic, MSDFS paths stored in symlinks and symlink safety in one single codepath.
Comment 82 Jeremy Allison 2019-10-13 17:43:00 UTC
(In reply to Andrew Bartlett from comment #76)

> This way we:
>
> * avoid needing to deprecate the parameter (and all the notice stuff that goes with that)
>  * avoid breaking sites that only upgrade Samba when the whole OS is upgraded (and never read our see our release notes)
> Is this clearer?

If you don't want to deprecate the parameter, I'm willing to consider keeping it so long as its only function is to issue an error message (or warning message) saying widelinks won't work unless the vfs_widelinks module is loaded for this share.

Is that a compromise you can live with ?
Comment 83 Jeremy Allison 2019-10-13 18:11:04 UTC
(In reply to Jeremy Allison from comment #81)

> This still won't fix the widelinks MS-DFS case, as for that unix_convert() 
> depends upon returning a SMB_STRUCT_STAT_EX struct with S_ISLNK set in order to 
> the MS-DFS detection logic to kick in later.

Aha ! Looking carefully the above is not correct. In MS-DFS it depends on unix_convert() returning ENOENT -> NT_STATUS_OBJECT_PATH_NOT_FOUND to trigger the MS-DFS component search. That means vfs_widelinks *would* still work with MS-DFS so long as we have the extra VFS call SMB_VFS_MSDFS_READLINKAT() that has an  implementation inside vfs_widelinks that does the LSTAT()/READLINK() calls internally to check for MS-DFS objects stored in links.

The initial pathname conversion on a MS-DFS symlink foo where (foo -> msdfs:\\dfs\redirect\path) does an unconditional STAT() call, gets ENOENT which then triggers the DFS pathname walk using SMB_VFS_MSDFS_READLINKAT() to check for MS-DFS components.

The vfs_widelinks module design is looking better and better :-). Seems like we can still have our cake and eat it :-).
Comment 84 Andrew Bartlett 2019-10-13 22:40:05 UTC
(In reply to Jeremy Allison from comment #82)
I think fundamentally changing a parameter into a warning message is the same as removing it, so I don't think our historical practice permits that. 

The AD DC sets different default VFS objects based on a parameter, why can't we do that?

But better still would be having this be an implicit top-level module (or such) that becomes a no-op if the option is unset.
Comment 85 Jeremy Allison 2019-10-13 23:02:33 UTC
(In reply to Andrew Bartlett from comment #84)

> The AD DC sets different default VFS objects based on a parameter, why can't we > do that?

I will take a look, that might be possible.

> But better still would be having this be an implicit top-level module (or such) > that becomes a no-op if the option is unset.

No, sorry I fundamentally disagree with that. Having "break all security" as an implicit top-level module is *NOT* a good idea going forward. It makes far more sense to have it as a completely optional loaded module for the small number of users who depend on it.

I don't understand why I can't just remove this parameter - deprecating it first as I suggested. This is going to be a massive security release - the biggest one we've ever had to do. It's a fileserver *rewrite*. Please think about the implications of that.

As I said, I'm coming to the conclusion that I need to do this, and just break backwards compatibility.

This is difficult enough to do as it is, please don't make it worse by adding additional hoops to jump through to keep old insane options working without change. I'm getting to the point of just walking away from this I'm afraid.
Comment 86 Jeremy Allison 2019-10-13 23:03:28 UTC
Separate email from Volker:

Haven't read all of the discussion in every detail, but you have my
full support for removing the wide links parameter and putting that
into a module. Even without deprecating it first, this is just a big
security release that everybody will be aware of.

Volker
Comment 87 Ralph Böhme 2019-10-14 08:28:05 UTC
(In reply to Jeremy Allison from comment #85)
>> The AD DC sets different default VFS objects based on a parameter,
>> why can't we > do that?
>
>I will take a look, that might be possible.

I don't think this will work if the user already has vfs objects set.

I'm also ok with making this an optional module that has to be explicitly enabled. And then it indeed makes no sense to keep the "wide links" option as it won't work as advertized unless the module is in the stack.

So +1 on deprecating and the module.
Comment 88 Andrew Bartlett 2019-10-21 06:08:21 UTC
I attended Kawaiicon this year and watched as a "safe" limited shell was escaped via a symlink from the restricted directory to the root.

Not directly Samba, but the same idea.

It was enough to make me rethink my position here.  

I think that the use case for 'wide links' can be reimplemented with bind mounts if really needed, so we should deprecate 'wide links' and strongly discourage even the use of the replacement module.
Comment 89 Björn Jacke 2019-10-21 11:27:57 UTC
On 21.10.19 08:08, samba-bugs@samba.org wrote:
> I think that the use case for 'wide links' can be reimplemented with bind
> mounts

have in mind that bind mounts are a Linux feature and there are
definetely other Unix-like flavours which do not have something like
that. So discouraging the use of symlinks is fine but we should not drop
support for it.
Comment 90 Jeremy Allison 2019-10-22 06:26:57 UTC
(In reply to Björn Jacke from comment #89)

Backwards compatibility for insane semantics will be the death of the project :-).

I'll try and keep this working via the VFS module method, but if I end up with the choice between security and widelinks, you can guess which is going to win that argument :-).
Comment 91 Jeremy Allison 2020-12-14 20:00:04 UTC
Yearly update. The smbd VFS rewrite proceeds apace migrating to handle-based calls. Major changes about to go in. I still have hopes of fully closing this out in 2021.
Comment 92 Jeremy Allison 2021-05-19 21:36:48 UTC
Update. VFS rewrite is still ongoing. Once this is done and everything is handle-based, this bug will finally be fixed.
Comment 93 Jeremy Allison 2021-07-14 15:53:48 UTC
With the master commit e168a95c1bb1928cf206baf6d2db851c85f65fa9, I believe all race conditions on meta-data are now fixed in the default paths. The async DOS attribute read still uses path-based getxattr, and some of the VFS modules are not symlink safe, but out of the box Samba I believe will no longer be vulnerable to this in 4.15.0.
Comment 94 Jeremy Allison 2021-09-21 00:19:35 UTC
4.15.0 is now shipped. All known symlink races are fixed. I'll start to see about crafting a CVE text for this.
Comment 95 Jeremy Allison 2021-09-21 00:24:04 UTC
(In reply to Jeremy Allison from comment #93)

> The async DOS attribute read still uses path-based getxattr,

This is no longer true. We no longer have any path-based xattr calls in 4.15.0 (the supporting VFS calls have been removed).
Comment 96 Jeremy Allison 2021-09-22 00:43:44 UTC
Comment on attachment 15217 [details]
git-am fix for master v3

This patch is no longer needed. The bug only occurs now in 4.13, for which I have a much simpler patch.
Comment 97 Jeremy Allison 2021-09-23 18:13:38 UTC
Created attachment 16812 [details]
Patch for 4.13.next.

Needs testing against the race reproducer. Adding it here to keep a record of it.
Comment 98 Jeremy Allison 2021-09-23 19:44:17 UTC
Confirmed by walking through by hand that 4.13.next as-is is vulnerable to the race.
Comment 99 Jeremy Allison 2021-09-23 21:01:58 UTC
Confirmed by walking through by hand that 4.13.next + https://bugzilla.samba.org/attachment.cgi?id=16812
is *NOT* vulnerable to the race !
Comment 100 Jeremy Allison 2021-09-23 21:02:59 UTC
Comment on attachment 16812 [details]
Patch for 4.13.next.

Ralph, please review. Please note this patch is *ONLY* to fix the mkdirat() race, nothing else. For that we need 4.15.x and bug: https://bugzilla.samba.org/show_bug.cgi?id=14842
Comment 101 Ralph Böhme 2021-09-24 03:24:01 UTC
Comment on attachment 16812 [details]
Patch for 4.13.next.

Lgtm. Thanks!
Comment 102 Jeremy Allison 2021-09-24 18:51:43 UTC
Created attachment 16815 [details]
CVE advisory.
Comment 103 Jeremy Allison 2021-09-24 18:53:40 UTC
Once Ralph has reviewed the CVE this is ready to be sent to vendors. Note we need to coordinate with:

https://bugzilla.samba.org/show_bug.cgi?id=14842

to ensure we release both advisories at the same time.
Comment 108 Jeremy Allison 2021-10-21 15:56:44 UTC
Comment from Jim @ SuSE.

We're going to do like RH and mark earlier versions as unsafe with SMB1
(or in our oldest case, unix extensions/NFS), and update our latest two
releases (SLES12SP5 and SLES15SP3) to 4.15.
Comment 109 Jeremy Allison 2021-10-21 15:58:17 UTC
OK, as agreed upon, I'm now opening this bug up to Samba vendors.
Comment 110 Jeremy Allison 2021-10-27 16:05:29 UTC
Proposed release date - January 10th 2022.
Comment 111 Marcus Meissner 2021-11-03 09:36:02 UTC
fwiw Mitre has rejected the CVE -.. perhaps it needs an update

** REJECT ** DO NOT USE THIS CANDIDATE NUMBER. ConsultIDs: none. Reason: This candidate was in a CNA pool that was not assigned to any issues during 2019. Notes: none.
Comment 112 Jeremy Allison 2021-11-03 16:02:38 UTC
So I learned something new - CVE's can time out :-(. I'll get another one, thanks !
Comment 113 Jeremy Allison 2021-11-10 17:21:45 UTC
New CVE assigned - CVE-2021-43566.
Comment 114 Jeremy Allison 2021-11-10 17:25:11 UTC
Created attachment 16987 [details]
Updated CVE text .

NB. Only change is the CVE number from CVE-2019-10151 -> CVE-2021-43566.

No other changes. Should be an easy re-review :-).
Comment 115 Jeremy Allison 2021-11-10 17:27:53 UTC
Created attachment 16988 [details]
Updated for for 4.13.next.

Only change is CVE number CVE-2019-10151 -> CVE-2021-43566.

No other changes.
Comment 116 Ralph Böhme 2022-01-07 14:32:58 UTC
Created attachment 17072 [details]
Updated CVE text

Updated CVE text with correct version 4.13.16.
Comment 117 Jeremy Allison 2022-01-07 17:51:04 UTC
Comment on attachment 17072 [details]
Updated CVE text

LGTM. Thanks !
Comment 118 Arvid Requate 2022-01-09 12:43:36 UTC
The CVSS score in the advisory doesn't conform to v3.1:

> CVSS:2.2/AV:A/AC:H/PR:L/UI:N/S:U/C:L/I:N/A:N/E:P/RL:O/RC:C/CR:L/IR:L/AR:L/MAV:N/MAC:H/MPR:L/MUI:N/MS:U/MC:L/MI:N/MA:N

The spec https://www.first.org/cvss/specification-document writes:

The CVSS v3.1 vector string begins with the label “CVSS:” and a numeric representation of the current version, “3.1”.
Comment 119 Samba QA Contact 2022-01-10 12:29:23 UTC
This bug was referenced in samba v4-13-stable (Release samba-4.13.16):

9c2e3c72c0cdde31a2a5c2e58ce508070ec151d0
Comment 120 Samba QA Contact 2022-01-10 12:35:44 UTC
This bug was referenced in samba v4-13-test:

9c2e3c72c0cdde31a2a5c2e58ce508070ec151d0
Comment 121 Jule Anger 2022-01-10 15:00:30 UTC
Removing vendor CC (so that any public comments don't need to be broadcast so widely) and opening these bugs to the public.  
If you wish to continue to be informed about any changes here please CC individually.