Bug 7457 - Domain users and Domain admin ACLs corrupting
Status: RESOLVED INVALID
Alias: None
Product: Samba 3.5
Classification: Unclassified
Component: User & Group Accounts
Version: 3.5.4
Hardware: x64 Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Michael Adam
QA Contact: Samba QA Contact
Reported: 2010-05-26 12:51 UTC by andrew
Modified: 2014-07-24 07:23 UTC

Description andrew 2010-05-26 12:51:57 UTC
I am running RHEL 5.4 servers, fully patched as of last month, with SerNet Samba 3.5.2. They are integrated into a mixed 2003/2008 AD and point at a 2008 SP2 DC.

This issue is occurring on both boxes, both are at the same patch level, are dedicated samba boxes, and have very similar smb.conf files.

Upon a restart of nmbd, smbd and winbind (as on a machine reboot), ACL resolution seems to become corrupted: instead of showing "domain\040users" I get "BUILTIN+users", and instead of "domain\040admins" I get "BUILTIN+administrators".

[root@Bubbles data]# getfacl G_drive
# file: G_drive
# owner: root
# group: BUILTIN+administrators
user::rwx
group::rwx
group:BUILTIN+administrators:rwx
group:BUILTIN+users:r-x
group:r_g_drive:r-x
mask::rwx
other::---

But notice that the other domain groups resolve properly (like r_g_drive).

Users experience flaky connectivity, with some folders intermittently becoming unreadable or unwritable. Rebooting the client seems to fix the issue temporarily, but the random disconnects return.

Shutting down winbind, smb and nmb, deleting all files in /var/lib/samba/, restarting nmb, smb and winbind, then running wbinfo -g and wbinfo -u seems to fix the problem.

sample smb.conf:

[global]
        workgroup = DOMAIN
        realm = DOMAIN.LOCAL
        server string = %h
        security = ADS
        allow trusted domains = No
        password server = zeus dione
        log file = /var/log/samba/%m
        smb ports = 445
        deadtime = 15
        load printers = No
        printcap name = cups
        local master = No
        domain master = No
        kernel oplocks = No
        idmap uid = 100000-200000
        idmap gid = 100000-200000
        template shell = /bin/bash
        winbind separator = +
        winbind enum users = Yes
        winbind enum groups = Yes
        winbind use default domain = Yes
        winbind expand groups = 5
        idmap config DOMAIN:range = 100000-200000
        idmap config DOMAIN:base_rid = 500
        idmap config DOMAIN:backend = rid
        admin users = "@DOMAIN+domain admins", DOMAIN+Administrator
        inherit owner = Yes
        kernel change notify = No
        use sendfile = Yes
        veto oplock files = /*.mdb/*.MDB/*.mde/*.MDE/*.accdb/*.ACCDB/*.ldb/*.LDB/
        browseable = No
        oplocks = No
        level2 oplocks = No

[Groups]
        comment = G_Groups on Bubbles
        path = /data/G_drive
        valid users = "@DOMAIN+domain admins", DOMAIN+Administrator, @DOMAIN+r_g_drive
        read only = No
        force create mode = 0770
        force directory mode = 0770
        inherit permissions = Yes
        inherit acls = Yes
        hide unreadable = Yes
Comment 1 andrew 2010-06-10 17:28:10 UTC
After two weeks of solid operation, this bug has returned (no reboot - so perhaps my previous assumption was incorrect - perhaps it is a time lapse issue - two weeks?).

Additional observations:
- getent group no longer returns domain groups, only local ones (it was working yesterday, but not today)
- wbinfo -g still returns domain groups
- other domain groups still resolve properly in getfacl and wbinfo; only domain users and domain admins are broken
- the following errors started appearing in the logs this morning, just when the domain users and domain admins resolution went screwy:

Jun 10 06:03:09 Bubbles winbindd[4263]: [2010/06/10 06:03:09.064013,  0] winbindd/idmap.c:201(smb_register_idmap_alloc)
Jun 10 06:03:09 Bubbles winbindd[4263]:   idmap_alloc module ldap already registered!
Jun 10 06:03:09 Bubbles winbindd[4263]: [2010/06/10 06:03:09.068660,  0] winbindd/idmap.c:201(smb_register_idmap_alloc)
Jun 10 06:03:09 Bubbles winbindd[4263]:   idmap_alloc module tdb already registered!
Jun 10 06:03:09 Bubbles winbindd[4263]: [2010/06/10 06:03:09.068695,  0] winbindd/idmap.c:149(smb_register_idmap)
Jun 10 06:03:09 Bubbles winbindd[4263]:   Idmap module passdb already registered!
Jun 10 06:03:09 Bubbles winbindd[4263]: [2010/06/10 06:03:09.068759,  0] winbindd/idmap.c:149(smb_register_idmap)
Jun 10 06:03:09 Bubbles winbindd[4263]:   Idmap module nss already registered!
Comment 2 andrew 2010-08-09 13:11:56 UTC
Samba was updated to sernet 3.5.4, both boxes fully patched as of July 3, 2010.

The above corruption still occurred on one box after several weeks, and necessitated a restart of samba to clear up.

The other box has not encountered any issues.

Users do not, however, seem to be encountering flaky share or file access even with Domain Users and Domain Admins not resolving properly in winbind.
Comment 3 andrew 2010-09-07 10:58:06 UTC
Started tracking dates.  Full samba flush was done on 2010/08/24.

----------------------------

getfacl on 08/30 returns:

getfacl: Removing leading '/' from absolute path names
# file: data/G_drive
# owner: root
# group: domain\040admins
user::rwx
group::rwx
group:domain\040admins:rwx
group:domain\040users:r-x
group:r_g_drive:r-x
mask::rwx
other::---

----------------------------

getfacl on 08/31 returns:

getfacl: Removing leading '/' from absolute path names
# file: data/G_drive
# owner: root
# group: BUILTIN+administrators
user::rwx
group::rwx
group:BUILTIN+administrators:rwx
group:BUILTIN+users:r-x
group:r_g_drive:r-x
mask::rwx
other::---
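
The by-hand comparison above could be automated so that the date of the flip is recorded without checking manually. The following is a minimal sketch, assuming the getfacl text is captured by some other means (for example a scheduled job); the function name builtin_group_entries and the SAMPLE constant are hypothetical, and the "+" default simply mirrors the "winbind separator" in the smb.conf above.

```python
def builtin_group_entries(getfacl_output, separator="+"):
    """Return group names in getfacl output that start with 'BUILTIN<separator>'."""
    prefix = "BUILTIN" + separator
    found = []
    for line in getfacl_output.splitlines():
        line = line.strip()
        if line.startswith("# group:"):
            # Header line naming the owning group.
            name = line[len("# group:"):].strip()
        elif line.startswith("group:"):
            # ACL entries look like "group:<name>:<perms>"; the owning-group
            # entry "group::rwx" has an empty name and is skipped below.
            name = line.split(":")[1]
        else:
            continue
        if name.startswith(prefix) and name not in found:
            found.append(name)
    return found


# Raw string so that sequences such as "\040" would be kept literally.
SAMPLE = r"""# file: data/G_drive
# owner: root
# group: BUILTIN+administrators
user::rwx
group::rwx
group:BUILTIN+administrators:rwx
group:BUILTIN+users:r-x
group:r_g_drive:r-x
mask::rwx
other::---
"""

if __name__ == "__main__":
    print(builtin_group_entries(SAMPLE))
```

An empty list means no group has resolved to a BUILTIN name; a non-empty list on a share that should only carry domain groups marks the point at which the mapping changed.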
Comment 4 Michael Adam 2010-09-28 03:43:36 UTC
Andrew,

The range of idmap uid/gid must be DISJOINT from "idmap config DOMAIN:range".
Overlapping ranges explain the faulty "reverse-resolution" of a unix id to the builtin
group instead of the DOMAIN group. r_g_drive probably just has no match in BUILTIN,
so it is not mapped back by the default range.

I am 99% sure that this is the reason for your issue.

Could you modify the config accordingly and check whether the problem persists?

Cheers - Michael
Comment 5 andrew 2010-10-06 09:26:14 UTC
Thanks Michael - that seems to have fixed it.

Can I make a suggestion: testparm checks for many inconsistent conditions within smb.conf - perhaps it would be a good idea to get it to check for overlapping/conflicting UID and GID ranges.

I'm sure I'm not the first person to make this mistake, and I probably won't be the last.
Comment 6 Björn Jacke 2014-07-24 07:23:19 UTC
testparm is not meant to make such high-level checks.
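
Since testparm does not do this, the overlap described in Comment 4 can be checked with a small standalone script instead. The following is a minimal sketch and not part of Samba: the function names parse_ranges and find_overlaps are hypothetical, it only looks at "idmap uid", "idmap gid" and "idmap config <DOMAIN>:range" lines, and it ignores sections, includes and the rest of smb.conf syntax. The DISJOINT sample uses illustrative values for the default range; they are an assumption, not a recommendation made in this thread.

```python
import re

# Matches "idmap uid = A-B", "idmap gid = A-B" and "idmap config NAME:range = A-B".
RANGE_LINE = re.compile(
    r"^\s*(idmap\s+(?:uid|gid)|idmap\s+config\s+[^\s:]+\s*:\s*range)"
    r"\s*=\s*(\d+)\s*-\s*(\d+)\s*$",
    re.IGNORECASE,
)


def parse_ranges(conf_text):
    """Return {parameter name: (low, high)} for the idmap range settings found."""
    ranges = {}
    for line in conf_text.splitlines():
        match = RANGE_LINE.match(line)
        if match:
            # Collapse repeated whitespace inside the parameter name.
            name = " ".join(match.group(1).split())
            ranges[name] = (int(match.group(2)), int(match.group(3)))
    return ranges


def find_overlaps(ranges):
    """Return (default parameter, per-domain parameter) pairs whose ranges intersect."""
    defaults = [n for n in ranges if not n.lower().startswith("idmap config")]
    domains = [n for n in ranges if n.lower().startswith("idmap config")]
    overlaps = []
    for default in defaults:
        low, high = ranges[default]
        for domain in domains:
            dom_low, dom_high = ranges[domain]
            # Two inclusive ranges intersect when each starts before the other ends.
            if low <= dom_high and dom_low <= high:
                overlaps.append((default, domain))
    return overlaps


# Range settings as posted in the Description.
REPORTED = """
        idmap uid = 100000-200000
        idmap gid = 100000-200000
        idmap config DOMAIN:range = 100000-200000
"""

# Illustrative values only: a default range moved below the DOMAIN range.
DISJOINT = """
        idmap uid = 10000-99999
        idmap gid = 10000-99999
        idmap config DOMAIN:range = 100000-200000
"""

if __name__ == "__main__":
    print(find_overlaps(parse_ranges(REPORTED)))
    print(find_overlaps(parse_ranges(DISJOINT)))
```

With the settings from the Description, both "idmap uid" and "idmap gid" are reported as overlapping "idmap config DOMAIN:range"; with the disjoint sample the result is an empty list.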