Bug 15583 - set_nt_acl sometimes fails with NT_STATUS_INVALID_PARAMETER - openat() EACCES
Summary: set_nt_acl sometimes fails with NT_STATUS_INVALID_PARAMETER - openat() EACCES
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.20.0rc2
Hardware: All All
: P5 regression (vote)
Target Milestone: 4.20
Assignee: Jule Anger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-02-15 22:32 UTC by Björn Baumbach
Modified: 2024-03-27 17:13 UTC (History)
8 users (show)

See Also:


Attachments
patch that fixes the behavior (5.17 KB, patch)
2024-02-15 22:32 UTC, Björn Baumbach
no flags Details
strace - fail (7.30 MB, text/plain)
2024-02-15 22:38 UTC, Björn Baumbach
no flags Details
strace - success (12.40 MB, text/plain)
2024-02-15 22:39 UTC, Björn Baumbach
no flags Details
log - fail (1.50 MB, text/x-log)
2024-02-15 22:40 UTC, Björn Baumbach
no flags Details
log - ok (2.91 MB, text/x-log)
2024-02-15 22:40 UTC, Björn Baumbach
no flags Details
Patch for 4.20 backported from master (18.45 KB, patch)
2024-03-27 14:51 UTC, Ralph Böhme
no flags Details
Patch for 4.20 cherry-picked from master (19.12 KB, patch)
2024-03-27 15:15 UTC, Ralph Böhme
bjacke: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Björn Baumbach 2024-02-15 22:32:17 UTC
Created attachment 18248 [details]
patch that fixes the behavior

Sometimes setting a NTACL via set_nt_acl_conn() fails with NT_STATUS_INVALID_PARAMETER. The strace shows an openat() with EACCES (Permission denied).

I can reproduce this with "samba-tool ntacl sysvolreset" or "samba-tool domain provision". When I run the same command again it fails again or sometime succeeds. I didn't reproduce it with "samba-tool ntacl set".

The attached patch reverts a commit, which solved this on my test setup.

I'll attach strace and log files which show the successful and failing "ntacl sysvolreset".
Comment 1 Björn Baumbach 2024-02-15 22:38:10 UTC
Created attachment 18249 [details]
strace - fail
Comment 2 Björn Baumbach 2024-02-15 22:39:15 UTC
Created attachment 18250 [details]
strace - success
Comment 3 Björn Baumbach 2024-02-15 22:40:03 UTC
Created attachment 18251 [details]
log - fail
Comment 4 Björn Baumbach 2024-02-15 22:40:39 UTC
Created attachment 18252 [details]
log - ok
Comment 5 Andreas Schneider 2024-02-16 15:57:16 UTC
Björn, can you create a merge request please?
Comment 6 Andreas Schneider 2024-02-16 16:00:28 UTC
Reproducer:
- install Fedore 39
- dnf install samba-dc acl # (samba 4.19)
- rm /etc/samba/smb.conf
- samba-tool domain provision --interactive
- check extra ACLs with
  getfacl /var/lib/samba/sysvol/
  feel free to check sub-directories as well
- remove extra ACLs with
  setfacl -b -R /var/lib/samba/sysvol
- check that extra ACLs are removed with
  getfacl /var/lib/samba/sysvol/
- restore extra ACLs by calling
  samba-tool ntacl sysvolreset
- check that extra ACLs are present again
  getfacl /var/lib/samba/sysvol/
- add Fedora 40 repository to /etc/yum.repos.d
- update Samba to Fedora 40 version
  dnf update samba-common # (samba 4.20.0rc2)
- check that new samba-tool is working if ACLs are present
  samba-tool ntacl sysvolreset
- remove extra ACLs with
  setfacl -b -R /var/lib/samba/sysvol
- calling samba-tool again will fail:
  samba-tool ntacl sysvolreset
set_nt_acl_conn: init_files_struct failed: NT_STATUS_INVALID_PARAMETER
ERROR(runtime): uncaught exception - (3221225485, 'An invalid parameter was passed to a service or function.')
  File "/usr/lib64/python3.12/site-packages/samba/netcmd/__init__.py", line 285, in _run
    return self.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/site-packages/samba/netcmd/ntacl.py", line 449, in run
    provision.setsysvolacl(samdb, sysvol,
  File "/usr/lib64/python3.12/site-packages/samba/provision/__init__.py", line 1754, in setsysvolacl
    _setntacl(os.path.join(root, name))
  File "/usr/lib64/python3.12/site-packages/samba/provision/__init__.py", line 1743, in _setntacl
    return setntacl(
           ^^^^^^^^^
  File "/usr/lib64/python3.12/site-packages/samba/ntacls.py", line 229, in setntacl
    smbd.set_nt_acl(
- a second run will pass and fix the ACLs
Comment 7 Andreas Schneider 2024-02-16 16:03:49 UTC
See also https://gitlab.com/samba-team/samba/-/commit/697d41420b4f4830396acfbc96bd1f1c1f0531f4 which partially reveted the code already with an explanation:

With capabilities preferred over become_root() we failed to achieve
the basic goal of storing NT ACLs in xattrs using vfs_acl_xattr. This
is due to the fact that apart from CAP_DAC_OVERRIDE it is manadatory
to have CAP_SYS_ADMIN for write access to xattrs from security
namespace[1]. Despite the option to configure the xattr name within
the module we should not anticipate and miss to consider xattrs from
security namespace which is far more protected even with our default
name "security.NTACL".

Theorotically we could make it work by adding another capability on
top of existing ones. But given the functions designed around this
area we may not be able to come up with a cleaner change which can
handle the fallback mechanism to become_root(). Any failure to set
the very first capability would put us in become_root() path where
further capabilities are mostly not required. Thus reverting to old
behaviour to always become_root() until we have a cleaner approach
to handle the fallback while modifying multiple capabilities at once.

[1] https://www.man7.org/linux/man-pages/man7/xattr.7.html
Comment 8 Ralph Böhme 2024-02-16 16:12:33 UTC
(In reply to Andreas Schneider from comment #7)
Until someone can explain why the openat() as root fails with EACCES there's still some unresolved mystery here to be understood. If you revert anything, we should probably for now revert *all* the commits from the changeset that introduced this regression.
Comment 9 Christof Schmitt 2024-02-16 22:06:12 UTC
openat only does a permission check based on the file system permissions. Can you maybe print the stat and ACL information and the current process uid and capabilities? There should be an obvious enough explanation.
Comment 10 Ralph Böhme 2024-02-17 14:18:35 UTC
I guess the problem is that we drop the DAC_OVERRIDE capability unconditionally. As a result we run as uid=0 but without DAC_OVERRIDE, so subsequently and code that relies on being run as root means you can do everything, eg during provisions, may fail with EACCESS as root is not root anymore.

We need some kind of logic in the set_effective_capability()/drop_effective_capability()  to make sure we don't drop the capability if it was present in the initial capabilites before calling set_effective_capability().
Comment 11 Stefan Metzmacher 2024-02-27 11:27:04 UTC
(In reply to Ralph Böhme from comment #10)

So the problem is that we call

set_effective_capability(DAC_OVERRIDE_CAPABILITY)
while we already have CAP_DAC_OVERRIDE in the kernel.
The drop_effective_capability(DAC_OVERRIDE_CAPABILITY),
really drops it and while we want to restore the
state we had before set_effective_capability(DAC_OVERRIDE_CAPABILITY).

I guess we need to:
- remove PR_SET_KEEPCAPS,
- move set_effective_capability() to source3/lib/smbd_shim.c
- implement or re-use the stack that's also used by become_root()
- maybe push_effective_capabilities()/pop_effective_capabilities()
  would be a better api and the pop function just resets to
  the state before the push, that would means we won't
  drop DAC_OVERRIDE_CAPABILITY if it was already there before
  the push.
Comment 12 Stefan Metzmacher 2024-02-27 13:21:14 UTC
(In reply to Stefan Metzmacher from comment #11)

The simplest regression fix would be this:

diff --git a/source3/lib/system.c b/source3/lib/system.c
index 74d5dae5446..10ea98f64d8 100644
--- a/source3/lib/system.c
+++ b/source3/lib/system.c
@@ -637,7 +637,7 @@ static bool set_process_capability(enum smbd_capability capability,
 ****************************************************************************/
 
 #if defined(HAVE_POSIX_CAPABILITIES) && defined(CAP_DAC_OVERRIDE)
-static bool have_cap_dac_override = true;
+static bool have_cap_dac_override = false; /* TODO: fix bug #15583 */
 #else
 static bool have_cap_dac_override = false;
 #endif
Comment 13 Jeremy Allison 2024-02-27 19:04:06 UTC
+1 on this API.

push_effective_capabilities()/pop_effective_capabilities()

Makes much more sense and matches our current code.
Comment 14 Jule Anger 2024-03-11 14:24:29 UTC
Any updates? Who will go on here so that we can get a fix for this into 4.20.0?
Comment 15 Andreas Schneider 2024-03-20 10:17:40 UTC
Can we get the regression fix from comment #12 as a MR?
Comment 16 Samba QA Contact 2024-03-27 10:48:03 UTC
This bug was referenced in samba master:

58ea952fd0c716f94b1b79b8ed1829bb72732ccc
87479544381e103ee2b1def574a5865a3f6a93d9
88eb58af6783ad23d2e2b602ee9fdbbdf556b354
7f19afbd40d3ad3c8d186d0a2a64d07a2a8bd00a
10c7a3e47c62dcb1dfe7e384960d60cafcb9e44e
52ad635b2705bcfc8166bd90b1ad35ebb9cbc986
af7b930e2bfe2275cee14dc2154f2aea8875fa63
33e88911ee7a8974d52021632ca25c1ddfcb6f45
32aa11e9b570ce1c0bec889b699bc4897c9d9843
0dec2ef188a93504da873d927ca2b26f8c491fb8
Comment 17 Ralph Böhme 2024-03-27 14:51:45 UTC
Created attachment 18277 [details]
Patch for 4.20 backported from master
Comment 18 Ralph Böhme 2024-03-27 15:15:12 UTC
Created attachment 18278 [details]
Patch for 4.20 cherry-picked from master
Comment 19 Jule Anger 2024-03-27 15:24:26 UTC
Pushed to autobuild-v4-20-test.
Comment 20 Samba QA Contact 2024-03-27 16:51:04 UTC
This bug was referenced in samba v4-20-test:

bb68b730290c20fe0ecc6e72580fbab6ea674692
dc1616263034a78b6a43f9f89d59a8090f881a1b
4f38859f5d861a5f77c223ad720416b719e8e2f8
d0c295e5344d7858cf75e19184e3842de06f27ab
f6d549de47c463905c5d95bc6556e2c7c4a25540
6e0986b2c30e78e0c9ffec62fb0666cd85dad316
52b1d9d7cb8d70fc1137c26c4a38c530116802c4
6ca9461a1dbee5762220f0ae9e0b67c846d4feae
f7491b2994157615032e80b5f10df5953ae0543a
5cedf3b5eb02c3050cb2e82d4602d63c565d4a7f
Comment 21 Jule Anger 2024-03-27 17:05:30 UTC
Closing out bug report.

Thanks!
Comment 22 Samba QA Contact 2024-03-27 17:13:16 UTC
This bug was referenced in samba v4-20-stable (Release samba-4.20.0):

bb68b730290c20fe0ecc6e72580fbab6ea674692
dc1616263034a78b6a43f9f89d59a8090f881a1b
4f38859f5d861a5f77c223ad720416b719e8e2f8
d0c295e5344d7858cf75e19184e3842de06f27ab
f6d549de47c463905c5d95bc6556e2c7c4a25540
6e0986b2c30e78e0c9ffec62fb0666cd85dad316
52b1d9d7cb8d70fc1137c26c4a38c530116802c4
6ca9461a1dbee5762220f0ae9e0b67c846d4feae
f7491b2994157615032e80b5f10df5953ae0543a
5cedf3b5eb02c3050cb2e82d4602d63c565d4a7f