Bug 10194 - Offline logon cache not updating for cross child domain group membership
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.6
Classification: Unclassified
Component: Winbind
Version: 3.6.16
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
Reported: 2013-10-10 07:55 UTC by Andreas Schneider
Modified: 2013-10-15 07:02 UTC

Attachments
Patch for master (9.89 KB, patch), 2013-10-10 08:34 UTC, Andreas Schneider, vl: review-
Patch for master v2 (7.72 KB, patch), 2013-10-10 16:10 UTC, Andreas Schneider, vl: review-
Patch for master v3 (7.72 KB, patch), 2013-10-10 18:15 UTC, Andreas Schneider, vl: review+
v4-1-test patch (8.29 KB, patch), 2013-10-11 12:52 UTC, Andreas Schneider, vl: review+
v4-0-test patch (8.30 KB, patch), 2013-10-11 12:52 UTC, Andreas Schneider, vl: review+
v3-6-test patch (8.29 KB, patch), 2013-10-11 12:53 UTC, Andreas Schneider, vl: review+

Description Andreas Schneider 2013-10-10 07:55:40 UTC
The offline logon cache doesn't get updated for cross child domain group membership.

Details will follow.
Comment 1 Andreas Schneider 2013-10-10 08:33:22 UTC
Domain Forest
=============

Consider the following domain forest, with a DC for each domain.

               discworld.site
                /          \
               /            \
 level1.discworld.site   level2.discworld.site


Winbind is joined as an ADS member to LEVEL1.

Config
=======


[global]  
        workgroup = LEVEL1
        realm = LEVEL1.DISCWORLD.SITE
        security = ADS

        winbind separator = +
        winbind cache time = 30
        winbind enum users = Yes
        winbind enum groups = Yes
        winbind offline logon = Yes

        idmap config * : range = 1000000-1999999
        idmap config * : backend = tdb

        idmap config LEVEL1 : range = 100000000-199999999
        idmap config LEVEL1 : backend = rid

        idmap config LEVEL2 : range = 200000000-299999999
        idmap config LEVEL2 : backend = rid


Example of a reproducer:

We have a user:
LEVEL2+joe1

We have a domain local group:
LEVEL1+alicegroup_domlocal

joe1 is a group member of alicegroup_domlocal.


We remove all tdb files for all caches and start winbind!


asn@samba:~> ssh LEVEL2+joe1@localhost
LEVEL2+joe1@localhost's password:
Last login: Mon Oct  7 15:03:35 2013 from localhost
LEVEL2+joe1@samba:~> id
uid=200001104(LEVEL2+joe1) gid=200000513(LEVEL2+domain users) groups=200000513(LEVEL2+domain users),100001107(LEVEL1+alicegroup_domlocal)
LEVEL2+joe1@samba:~> logout

Now remove joe1 from alicegroup_domlocal.
Wait 30 seconds for the cache to expire (winbind cache time = 30).

asn@samba:~> ssh LEVEL2+joe1@localhost
LEVEL2+joe1@localhost's password:
Last login: Mon Oct  7 15:03:35 2013 from localhost
LEVEL2+joe1@samba:~> id
uid=200001104(LEVEL2+joe1) gid=200000513(LEVEL2+domain users) groups=200000513(LEVEL2+domain users),100001107(LEVEL1+alicegroup_domlocal)


joe1 is still part of alicegroup_domlocal!!!
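
As a side note on reading the id output above: the numeric IDs are consistent with the rid backend ranges from the config. The following is a minimal sketch, not Samba code, of the mapping described in the idmap_rid documentation (ID = RID - BASE_RID + LOW_RANGE_ID). It assumes base_rid is left at its default of 0; the struct and function names are hypothetical, and the RIDs used in the note below (1104, 513, 1107) are inferred from the output rather than stated in this report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: one "idmap config DOMAIN : range = low-high" entry. */
struct rid_range {
    uint32_t low;   /* lower bound of the configured range */
    uint32_t high;  /* upper bound of the configured range */
};

/*
 * Hypothetical helper: map a RID to a Unix ID as low + rid, assuming
 * base_rid = 0. Returns 0 if the result falls outside the range.
 */
static uint32_t rid_to_unix_id(struct rid_range r, uint32_t rid)
{
    uint64_t id;

    assert(r.low <= r.high);
    id = (uint64_t)r.low + rid;
    return (id <= r.high) ? (uint32_t)id : 0;
}
```

With the LEVEL2 range starting at 200000000, RID 1104 maps to 200001104 (joe1) and RID 513 maps to 200000513 (domain users). With the LEVEL1 range starting at 100000000, RID 1107 maps to 100001107 (alicegroup_domlocal).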




How does this happen?
======================

When joe1 logs in for the first time, we get the information from the DCs and store it in the cache. The fact that joe1 is a member of alicegroup_domlocal is stored in the NDR cache by the winbind parent.

samba:~ # wbinfo --online-status
BUILTIN : online
SAMBA : online
LEVEL1 : online
DISCWORLD : offline
LEVEL2 : offline

If you check the online status of the domains, you can see that the winbind parent thinks that LEVEL2 is offline. In source3/winbindd/winbindd_cache.c, the function wcache_fetch_ndr() contains the following code:

        if (!is_domain_offline(domain)) {
                uint32_t entry_seqnum, dom_seqnum, last_check;
                uint64_t entry_timeout;

                if (!wcache_fetch_seqnum(domain->name, &dom_seqnum,
                                         &last_check)) {
                        goto fail;
                }
                entry_seqnum = IVAL(data.dptr, 0);
                if (entry_seqnum != dom_seqnum) {
                        DEBUG(10, ("Entry has wrong sequence number: %d\n",
                                   (int)entry_seqnum));
                        goto fail;
                }
                entry_timeout = BVAL(data.dptr, 4);
                if (time(NULL) > entry_timeout) {
                        DEBUG(10, ("Entry has timed out\n"));
                        goto fail;
                }
        }

As the domain is marked as offline, both the sequence number check and the timeout check are skipped, so the cache entry we got during the first connection will never expire. The domain is always offline and stays offline.
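
To make the effect of the check quoted above concrete, here is a simplified model of it. This is only a sketch: the types and the function name are hypothetical and are not Samba's, and the sequence number and timeout are passed in directly instead of being read from the tdb record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical model of what the winbind parent knows about a domain. */
struct model_domain {
    bool offline;     /* true if the parent believes the domain is offline */
    uint32_t seqnum;  /* current sequence number of the domain */
};

/* Hypothetical model of one cached NDR entry. */
struct model_entry {
    uint32_t seqnum;  /* sequence number stored with the entry */
    time_t timeout;   /* absolute time at which the entry expires */
};

/* Returns true if the cached entry would be served to the caller. */
static bool model_entry_usable(const struct model_domain *d,
                               const struct model_entry *e, time_t now)
{
    assert(d != NULL && e != NULL);

    if (!d->offline) {
        if (e->seqnum != d->seqnum) {
            return false;  /* entry has the wrong sequence number */
        }
        if (now > e->timeout) {
            return false;  /* entry has timed out */
        }
    }
    /* Offline: neither check ran, so the entry is served regardless. */
    return true;
}
```

In this model, an entry that is past its timeout and has a stale sequence number is rejected for an online domain but is still served for a domain flagged offline, which matches the behaviour seen in the reproducer.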

So the bug here is actually that a forked winbind child for a trusted domain never informs the parent that it is connected to the DC and working correctly.

If you add more debug messages, then you notice that there are more problems. If you issue a 'smbcontrol winbindd online', then the LEVEL1 child tries to connect to DISCWORLD and checks whether it can contact the DC. We really need to rewrite the code here and have a much cleaner message flow.

However, I will attach a patchset which is more or less a hack to fix this, at least for now.
Comment 2 Andreas Schneider 2013-10-10 08:34:17 UTC
Created attachment 9267
Patch for master

If the patchset is fine for you, please add your review and push to master.
Comment 3 Volker Lendecke 2013-10-10 16:01:34 UTC
Comment on attachment 9267
Patch for master

As discussed on irc: winbind_parent_pid can be replaced by getppid() calls.
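
For illustration, a forked child can always ask the kernel for its parent's PID, so the PID does not need to be stored in a separate variable. The sketch below is not taken from the patch; the function name is hypothetical and it only uses the POSIX calls getpid(), fork(), getppid() and waitpid().

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: fork a child and let it compare getppid() with the
 * PID the parent recorded before fork(). Returns 1 if they match, 0 if
 * they do not, and -1 if fork() or waitpid() failed.
 */
static int child_sees_parent_pid(void)
{
    pid_t parent = getpid();
    pid_t child;
    int status;

    assert(parent > 0);

    child = fork();
    if (child < 0) {
        return -1;
    }
    if (child == 0) {
        /* In the child: exit code 0 means getppid() matched. */
        _exit(getppid() == parent ? 0 : 1);
    }
    /* In the parent: stay alive until the child has reported. */
    if (waitpid(child, &status, 0) < 0) {
        return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}
```

Note that getppid() only returns the original parent while that parent is still running. If the parent exits first, the child is reparented and getppid() returns a different PID.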
Comment 4 Andreas Schneider 2013-10-10 16:10:35 UTC
Created attachment 9271
Patch for master v2
Comment 5 Volker Lendecke 2013-10-10 18:02:53 UTC
Comment on attachment 9271
Patch for master v2

Next round: Please no DEBUG(0). This typically goes into syslog. Please increase the level to at least 3.
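
The reason the level matters: a debug message is only emitted when its level is less than or equal to the configured log level, so a level 0 message is always printed. A minimal sketch of that filtering follows. It is not Samba's DEBUG macro; the variable and function names are hypothetical and the message goes to stderr.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical configured log level, comparable to "log level = 1". */
static int model_debug_level = 1;

/*
 * Hypothetical logger: prints the message only if its level does not
 * exceed the configured level. Returns the number of characters
 * written, or 0 if the message was filtered out.
 */
static int model_debug(int level, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(fmt != NULL);
    if (level > model_debug_level) {
        return 0;
    }
    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}
```

With the configured level at 1, a call at level 0 is printed and a call at level 3 is suppressed until the administrator raises the log level.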
Comment 6 Andreas Schneider 2013-10-10 18:15:11 UTC
Created attachment 9272
Patch for master v3
Comment 7 Andreas Schneider 2013-10-11 12:52:32 UTC
Created attachment 9275
v4-1-test patch
Comment 8 Andreas Schneider 2013-10-11 12:52:51 UTC
Created attachment 9276
v4-0-test patch
Comment 9 Andreas Schneider 2013-10-11 12:53:09 UTC
Created attachment 9277
v3-6-test patch
Comment 10 Volker Lendecke 2013-10-12 09:11:45 UTC
Andreas, now it's up to you to discuss with Karo where this goes :-)
Comment 11 Andreas Schneider 2013-10-12 15:41:27 UTC
Karolin, could you please add these patches to 4.1 and 4.0. Could you also add it to 3.6 as there are still patches waiting for a release and it would be great to have this one in too.
Comment 12 Karolin Seeger 2013-10-14 08:11:07 UTC
(In reply to comment #11)
> Karolin, could you please add these patches to 4.1 and 4.0. Could you also add
> it to 3.6 as there are still patches waiting for a release and it would be
> great to have this one in too.

Pushed to autobuild-v4-1-test, autobuild-v4-0-test and v3-6-test.
Comment 13 Karolin Seeger 2013-10-15 07:02:20 UTC
Pushed to v4-1-test and v4-0-test.
Closing out bug report.

Thanks!