Bug 9125 - The winbindd -n switch should not be obeyed for samlogon cache access
Summary: The winbindd -n switch should not be obeyed for samlogon cache access
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: Winbind (show other bugs)
Version: 4.0.0rc3
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks: 8622
  Show dependency treegraph
 
Reported: 2012-08-29 13:55 UTC by David Disseldorp
Modified: 2020-12-11 08:00 UTC (History)
7 users (show)

See Also:


Attachments
3-6-test patch (3.54 KB, patch)
2012-08-29 13:58 UTC, David Disseldorp
vl: review-
Details
Revert "s3-winbindd: make sure we obey the -n switch also for samlogon cache access." (3.79 KB, patch)
2012-11-06 11:57 UTC, David Disseldorp
kseeger: review? (gd)
asn: review+
Details
doc: describe samlogon cache caveat for winbindd -n (1.38 KB, patch)
2012-11-06 11:58 UTC, David Disseldorp
asn: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Disseldorp 2012-08-29 13:55:17 UTC
Günther has already fixed this bug in master with the commit: 

ae6a779bf9f816680e724ede37324b7f5355996b
Author: Günther Deschner <gd@samba.org>
Date:   Wed Jun 8 12:56:10 2011 +0200

    s3-winbindd: make sure we obey the -n switch also for samlogon cache access.


I've raised this bug in order to see the fix merged to the 3.6 branch.
Comment 1 David Disseldorp 2012-08-29 13:58:30 UTC
Created attachment 7847 [details]
3-6-test patch

The commit from master merges cleanly to the 3-6-test branch.
Comment 2 Volker Lendecke 2012-08-29 14:00:12 UTC
Comment on attachment 7847 [details]
3-6-test patch

I think this is a really, really bad idea. I missed the upstream patch, apologies for that.
Comment 3 David Disseldorp 2012-08-29 14:49:19 UTC
(In reply to comment #2)
> Comment on attachment 7847 [details]
> 3-6-test patch
> 
> I think this is a really, really bad idea. I missed the upstream patch,
> apologies for that.

Changing the summery accordingly, following discussion on IRC. This bug will now track the removal of this behaviour from master.
Comment 4 Guenther Deschner 2012-08-29 17:14:13 UTC
Oh well, the old behaviour of the -n mode was broken. 

It did not reliably avoid using the samlogon cache (which one most probably only ever will run for testing purposes) when it was requested. I know that the interrelation of some calls with the samlogon cache is subtle but I could not identify cases where the patch in master really broke these calls in -n mode.

I think it is important to have a way to consistently and reliably test direct network calls, bypassing possible caching effects, if that patch does something wrong we need to fix it.
Comment 5 Volker Lendecke 2012-08-29 19:47:01 UTC
(In reply to comment #4)
> Oh well, the old behaviour of the -n mode was broken. 
> 
> It did not reliably avoid using the samlogon cache (which one most probably
> only ever will run for testing purposes) when it was requested. I know that the
> interrelation of some calls with the samlogon cache is subtle but I could not
> identify cases where the patch in master really broke these calls in -n mode.
> 
> I think it is important to have a way to consistently and reliably test direct
> network calls, bypassing possible caching effects, if that patch does something
> wrong we need to fix it.

The problem is that there are no network calls possible at all that would do what the samlogon cache does for us. There is just no way to retrieve the group membership in a complex trusted environment. If you have just a single domain with Samba as domain controller it might be possible, but even within a single domain it is not possible to correctly retrieve all group memberships using LDAP calls due to ACLs on directory objects. The call to get that is called NetSamLogon on the NETLOGON pipe. But this call requires user credentials and might trigger updating counts on the server. So to correctly implement wbinfo -r after a user has logged in, you have two alternatives: Save the info3 struct or the PAC in the netsamlogon cache. If you insist on doing network calls, you need to cache the user credentials somewhere to re-do the NetSamLogon call every time the wbinfo -r is requested. Metze has once tested S4U2Self, but even that fails in a one-way trust scenario where a Kerberos login and a SamLogon call succeeds.

I can see your point, but doing away with the netsamlogon cache will make winbind give incomplete and wrong results in the majority of all our use cases.

I am happy to be proven wrong, but on irc Andrew Bartlett, who is pretty familiar with the authentication side of Active Directory, confirmed my view.

If you are willing to live with incorrect results, feel free to add a separate switch with a scary name like --i-do-really-want-to-test-without-netsamlogon-cache.
Comment 6 Volker Lendecke 2012-08-29 19:51:12 UTC
(In reply to comment #4)
> Oh well, the old behaviour of the -n mode was broken. 
> 
> It did not reliably avoid using the samlogon cache (which one most probably
> only ever will run for testing purposes) when it was requested. I know that the
> interrelation of some calls with the samlogon cache is subtle but I could not
> identify cases where the patch in master really broke these calls in -n mode.
> 
> I think it is important to have a way to consistently and reliably test direct
> network calls, bypassing possible caching effects, if that patch does something
> wrong we need to fix it.

Ah, if you want a reliable test case: Set up a one-way trust. Join the trusting domain with winbind. Log in as trusted user, member of some trusted groups. Try wbinfo -r after the wbinfo -a. Try that with and without the netsamlogon_cache.tdb.

Sorry to be so vocal about this, but that patch has serious security sensitive flaws that I would be very, very reluctant to deliberately put into a formal release.
Comment 7 Stefan Metzmacher 2012-10-22 13:50:07 UTC
I don't see why this was marked as blocker for 4.0.0
Comment 8 Volker Lendecke 2012-10-22 14:16:55 UTC
(In reply to comment #7)
> I don't see why this was marked as blocker for 4.0.0

Well, I see this as a blocker bug because this will result in invalid group memberships in the domain member case. It does not affect the Active Directory Domain Controller piece, sure, but the domain member will break with this patch still in.
Comment 9 Stefan Metzmacher 2012-10-22 14:21:27 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > I don't see why this was marked as blocker for 4.0.0
> 
> Well, I see this as a blocker bug because this will result in invalid group
> memberships in the domain member case. It does not affect the Active Directory
> Domain Controller piece, sure, but the domain member will break with this patch
> still in.

Ah, ok.

So the fix would be to revert the patch in master?
Comment 10 Volker Lendecke 2012-10-22 14:33:00 UTC
Either that or solve the problem that we currently do not have a way to retrieve the PAC information with different means. You are saying that this should not block 4.0, and I am not in the position to block anything in the way to 4.0 final. So I think you are just better informed than I am about the new ways this information can be retrieved in all the trust scenarios.
Comment 11 Volker Lendecke 2012-10-22 15:26:41 UTC
Günther, the original patch is yours. Can you comment why this is necessary for RedHat?

Thanks,

Volker
Comment 12 Michael Adam 2012-11-06 11:00:21 UTC
I think we should revert the patch to restore the old behaviour for now.

The desire to have a mode for skipping the netsamlogon cache is IMHO a nice-to-have thing for testing but should not block 4.0.

Comments? - Michael
Comment 13 David Disseldorp 2012-11-06 11:21:55 UTC
(In reply to comment #12)
> I think we should revert the patch to restore the old behaviour for now.

Agreed, with an additional change to the man page documenting the caveat.
Comment 14 David Disseldorp 2012-11-06 11:55:54 UTC
The following changes since commit 6073d214aa8bfeff8dae8cf151357f890dd37a48:

  ldb_secrets_tdb_sync: Add dependency on gssapi. (2012-11-06 05:12:28 +0100)

are available in the git repository at:

  git://git.samba.org/ddiss/samba.git bso9125-revert-with-doc

for you to fetch changes up to 4d9c0230f21802a6847d6995614d0f8ef51e2c63:

  doc: describe samlogon cache caveat for winbindd -n (2012-11-06 12:53:51 +0100)

----------------------------------------------------------------
David Disseldorp (2):
      Revert "s3-winbindd: make sure we obey the -n switch also for samlogon cache access."
      doc: describe samlogon cache caveat for winbindd -n

 docs-xml/manpages/winbindd.8.xml  |    6 ++++--
 source3/winbindd/winbindd_ads.c   |    2 +-
 source3/winbindd/winbindd_cache.c |    4 ----
 source3/winbindd/winbindd_creds.c |    4 ----
 source3/winbindd/winbindd_msrpc.c |    6 ++----
 5 files changed, 7 insertions(+), 15 deletions(-)
Comment 15 David Disseldorp 2012-11-06 11:57:38 UTC
Created attachment 8158 [details]
Revert "s3-winbindd: make sure we obey the -n switch also for samlogon cache access."
Comment 16 David Disseldorp 2012-11-06 11:58:45 UTC
Created attachment 8159 [details]
doc: describe samlogon cache caveat for winbindd -n
Comment 17 Volker Lendecke 2012-11-07 08:58:22 UTC
Comment on attachment 8158 [details]
Revert "s3-winbindd: make sure we obey the -n switch also for samlogon cache access."

I think Günther must be given a chance to comment here before we pull his code without consulting him.
Comment 18 Volker Lendecke 2012-11-07 08:58:34 UTC
Comment on attachment 8159 [details]
doc: describe samlogon cache caveat for winbindd -n

I think Günther must be given a chance to comment here before we pull his code without consulting him.
Comment 19 Andreas Schneider 2012-11-09 12:14:33 UTC
Alexander, Sumit or Simo, could you please comment on -n also disabling the samlogon cache. I think sssd does the caching here.
Comment 20 Volker Lendecke 2012-11-09 12:17:20 UTC
The question is -- does sssd also intercept the schannel connection the the NETLOGON pipe and also decode the PAC?
Comment 21 Michael Adam 2012-11-09 12:30:02 UTC
(In reply to comment #20)
> The question is -- does sssd also intercept the schannel connection the the
> NETLOGON pipe and also decode the PAC?

But this is the question only in order to understand RedHat's motivation. From the pure samba-perspective, "-n" was a means to run winbindd in a mode without using the winbindd_cache.tdb and the idmap cache, but still using the netsamlogon cache, hence staying correct and usable in member/trust situations (see comment #6).

For me "-n/--no-cache" reads as: disable winbindd's own caches. Don't disable the netsamlogon cache which is not exclusively owned by winbindd and can't be filled at will.

We have several options:

1. restore -n to its original behaviour (as suggested by the revert patch) and create a new option in the spirit of "--do-not-use-any-caches-at-all"

2. keep the "-n" at the "--do-not-use-any-caches-at-all" meaning and
   a. create a new option like "--disable-own-caches" or
   b. create several options for the various caches (winbindd_cache, idmap cache...) or
   c. create config options to turn on/off the various caches

I personally would prefer to not change the semantics of the "-n" options, since it has been around like this for years and might be in use in production in this meaning.

Cheers - Michael
Comment 22 Volker Lendecke 2012-11-09 12:36:32 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > The question is -- does sssd also intercept the schannel connection the the
> > NETLOGON pipe and also decode the PAC?
> 
> But this is the question only in order to understand RedHat's motivation. From
> the pure samba-perspective, "-n" was a means to run winbindd in a mode without
> using the winbindd_cache.tdb and the idmap cache, but still using the
> netsamlogon cache, hence staying correct and usable in member/trust situations
> (see comment #6).
> 
> For me "-n/--no-cache" reads as: disable winbindd's own caches. Don't disable
> the netsamlogon cache which is not exclusively owned by winbindd and can't be
> filled at will.
> 
> We have several options:
> 
> 1. restore -n to its original behaviour (as suggested by the revert patch) and
> create a new option in the spirit of "--do-not-use-any-caches-at-all"
> 
> 2. keep the "-n" at the "--do-not-use-any-caches-at-all" meaning and
>    a. create a new option like "--disable-own-caches" or
>    b. create several options for the various caches (winbindd_cache, idmap
> cache...) or
>    c. create config options to turn on/off the various caches
> 
> I personally would prefer to not change the semantics of the "-n" options,
> since it has been around like this for years and might be in use in production
> in this meaning.
> 
> Cheers - Michael

It might be that sssd brings its own libwbclient that intercepts the traffic to winbind (the PAM_AUTH_CRAP response). But this does not solve smbd directly saving the PAC in the netsamlogon_cache. I think the patch is incomplete in that we need to open up a channels from smbd into sssd to pass the PAC information that sssd can not see if we unpack it in smbd. One theoretical option is also that sssd bring its own Kerberos libs that pull the PAC information while smbd lets the kerberos libs decrypt the ticket.

All speculation, if possible someone from sssd should comment how it achieves the questions raised here.
Comment 23 Sumit Bose 2012-11-09 13:00:56 UTC
There was an old plan to use winbind with sssd but we dropped this some time ago. So there is no requirement from the sssd side here.
Comment 24 Volker Lendecke 2012-11-09 13:15:39 UTC
(In reply to comment #23)
> There was an old plan to use winbind with sssd but we dropped this some time
> ago. So there is no requirement from the sssd side here.

Does this mean that winbind is not used in RHEL anymore?
Comment 25 Michael Adam 2012-11-09 13:17:13 UTC
(In reply to comment #23)
> There was an old plan to use winbind with sssd but we dropped this some time
> ago. So there is no requirement from the sssd side here.

Ok, does this mean that we can safely restore the original meaning of "-n" (and use Karolin's patch to clarify the docs)?

We can still add a separate option to tell winbindd to not use the netsamlogon_cache if this is deemed useful.

Cheers - Michael

PS: Out of curiosity: If sssd does not use winbindd, does it do trust and membership relations with windows domains on its own, i.e. re-implement larger parts of what winbindd does for samba? Thinking this further, are you running samba (smbd) with sssd instead of winbindd?
Comment 26 Andreas Schneider 2012-11-09 13:22:27 UTC
@Michael: Yes, we can restore the old behavior.
Comment 27 Sumit Bose 2012-11-09 14:04:01 UTC
@Volker winbind is still used in RHEL

@Michael trust with sssd are work in progress, for group memberships tokenGroups are used. The plan is to use the global catalog as much as possible.
Comment 28 Andreas Schneider 2012-11-09 14:46:05 UTC
I've added Reviewed-by and pushed them to autobuild. Karolin please pick for v4-0-test.
Comment 29 Volker Lendecke 2012-11-09 14:59:28 UTC
(In reply to comment #27)
> @Volker winbind is still used in RHEL
> 
> @Michael trust with sssd are work in progress, for group memberships
> tokenGroups are used. The plan is to use the global catalog as much as
> possible.

Oh, that's great info! I did not know that there are LDAP queries possible that retrieve the group list. I understood the comments in winbind code such that this is not reliable. Can you give some pointers at the relevant documentation?

Thanks,

Volker
Comment 30 Karolin Seeger 2012-11-12 08:21:31 UTC
Pushed to autobuild-v4-0-test.
Comment 31 Karolin Seeger 2012-11-12 10:47:14 UTC
Pushed to v4-0-test.
Closing out bug report.

Thanks!
Comment 32 Matthieu Patou 2012-11-13 05:32:29 UTC
(In reply to comment #29)
> (In reply to comment #27)
> > @Volker winbind is still used in RHEL
> > 
> > @Michael trust with sssd are work in progress, for group memberships
> > tokenGroups are used. The plan is to use the global catalog as much as
> > possible.
> 
> Oh, that's great info! I did not know that there are LDAP queries possible that
> retrieve the group list. I understood the comments in winbind code such that
> this is not reliable. Can you give some pointers at the relevant documentation?
> 
I think I remember andrew bartlett saying that tokengroup is not the silver bullet for getting groups but I might be wrong.
Comment 33 Volker Lendecke 2012-11-13 05:37:14 UTC
(In reply to comment #32)
> I think I remember andrew bartlett saying that tokengroup is not the silver
> bullet for getting groups but I might be wrong.

That's exactly why I am so delighted that things might have changed. RedHat, please comment!
Comment 34 Sumit Bose 2012-11-13 07:53:49 UTC
Currently I'm not aware of any docs more specific than e.g. http://msdn.microsoft.com/en-us/library/cc223395(v=prot.20).aspx .

Günther mentioned, that tokenGroups might not be accessible if you are bind with a machine account. But at least this does not seem the case anymore at least with w2k8.  We got feedback from couple of sssd users that it is working well even in large productive environments.

If one you you is aware of any docs explaining the access control for the tokenGroups attribute I would be happy to learn about it too.