Bug 5489 - Winbind in Samba 3.0.29 doesn't work on PDC
Summary: Winbind in Samba 3.0.29 doesn't work on PDC
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.2
Classification: Unclassified
Component: Winbind
Version: 3.2.3
Hardware: x86 Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Samba Bugzilla Account
QA Contact: Samba QA Contact
Duplicates: 5499 5518
Reported: 2008-05-26 13:39 UTC by Alexander Bokovoy
Modified: 2009-05-08 07:06 UTC (History)

Attachments
logs and smb.conf (70.00 KB, application/octet-stream)
2008-05-26 13:42 UTC, Alexander Bokovoy
Patch (738 bytes, patch)
2008-05-28 18:41 UTC, Jeremy Allison
Patch (746 bytes, patch)
2008-05-28 18:42 UTC, Jeremy Allison

Description Alexander Bokovoy 2008-05-26 13:39:34 UTC
Winbind doesn't work on samba PDC setup.
Below is transcript and logs from bare setup:

[root@kolotun samba]# wbinfo -D boids
Name              : BOIDS
Alt_Name          :
SID               : S-1-5-21-1174443546-694929673-1347602479
Active Directory  : No
Native            : No
Primary           : Yes
Sequence          : -1
[root@kolotun samba]# smbpasswd -a root
New SMB password:
Retype new SMB password:
Added user root.
[root@kolotun samba]# smbpasswd -a crew
New SMB password:
Retype new SMB password:
Added user crew.
[root@kolotun samba]# smbclient -L kolotun -Ucrew%smb
Domain=[BOIDS] OS=[Unix] Server=[Samba 3.0.29-alt]

        Sharename       Type      Comment
        ---------       ----      -------
        netlogon        Disk
        profiles        Disk
        IPC$            IPC       IPC Service (Samba 3.0.29-alt)
Domain=[BOIDS] OS=[Unix] Server=[Samba 3.0.29-alt]

        Server               Comment
        ---------            -------
        KOLOTUN              Samba 3.0.29-alt

        Workgroup            Master
        ---------            -------
        BOIDS                KOLOTUN
[root@kolotun samba]# wbinfo -u
Error looking up domain users
[root@kolotun samba]# wbinfo -g
BUILTIN+administrators
BUILTIN+users
[root@kolotun samba]# wbinfo -t
checking the trust secret via RPC calls failed
error code was NT_STATUS_DOMAIN_CONTROLLER_NOT_FOUND (0xc0000233)
Could not check secret
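
For scripted checks, the failure shown above can be picked out of the wbinfo output. The following is only a minimal sketch in Python: the helper name parse_wbinfo_error is made up for illustration (it is not part of Samba), and the line format is assumed from this transcript alone.

```python
import re

# Matches lines such as:
#   error code was NT_STATUS_DOMAIN_CONTROLLER_NOT_FOUND (0xc0000233)
# The format is an assumption based on the transcript in this report.
_ERROR_LINE = re.compile(r"error code was (NT_STATUS_\w+) \((0x[0-9a-fA-F]+)\)")


def parse_wbinfo_error(output):
    """Return (status_name, status_code) found in wbinfo output, or None.

    Hypothetical helper for illustration only.
    """
    match = _ERROR_LINE.search(output)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16)


# Output of the failing "wbinfo -t" call, copied from the description.
sample = """checking the trust secret via RPC calls failed
error code was NT_STATUS_DOMAIN_CONTROLLER_NOT_FOUND (0xc0000233)
Could not check secret"""

print(parse_wbinfo_error(sample))
```

Output that contains no such error line yields None, so a wrapper script could treat None as "no error reported" and anything else as a failed trust check.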
Comment 1 Alexander Bokovoy 2008-05-26 13:42:23 UTC
Created attachment 3312
logs and smb.conf
Comment 2 Jeremy Allison 2008-05-28 18:41:14 UTC
Created attachment 3317
Patch

This should fix it I think. If we're running winbindd on a DC we need to contact our local smbd for auth and check secret, so that means the domain can't be marked internal.
Jeremy.
Comment 3 Jeremy Allison 2008-05-28 18:42:43 UTC
Created attachment 3318
Patch

Fix the comment in the earlier patch to make this clearer.
Jeremy.
Comment 4 Jeremy Allison 2008-05-28 19:29:30 UTC
Ok, I've tracked down the fix that broke this. Simo, it was your diff here:

git-diff 83b04c60fac76ccd2d5aecb14f8896a07d488b1f..6e66512d5beb256a44c6703cdb8c7fa7e0fd8537

You hack the main winbindd to set the primary domain on a DC as "internal", then in the child try to hack it back. Problem is, once you've marked the domain internal it never forks, so you never get to your new code in the child. Did you ever test this?

This means my patch isn't complete - I need to repair the code you added in the child winbindd, and I also need to understand why the main winbindd should never contact smbd.

Jeremy.
Comment 5 Jeremy Allison 2008-05-28 19:41:14 UTC
Nope, my mistake - that's not the problem with your patch. It will fork, but it then does not connect to the local smbd. That patch is definitely the problem, but I need to spend more time studying what it broke first.
Jeremy.
Comment 6 Simo Sorce 2008-05-29 00:06:30 UTC
(In reply to comment #5)
> Nope, my mistake - that's not the problem with your patch. It will fork, but
> it then does not connect to the local smbd. That patch is definitely the
> problem, but I need to spend more time studying what it broke first.
> Jeremy.

That patch fixed another problem: loops and deadlocks that occur if we allow winbindd to contact smbd (this was seen and analyzed in the wild).

The problem is that most of the code is built with the idea that you can always contact the domain controller, even if it is the local machine, and we do hacks like invalidating a netlogon connection just to make winbindd retry to connect and test if the trust password is ok.

We have a few options:
a) never allow winbindd to connect to smbd. Find all the cases where we try to do so, test the IS_DC flag and, in that case, handle the call in winbindd (winbindd can access anything smbd can).

b) move all connections to smbd into a child and make sure any connection back from smbd (direct, via nsswitch, or by any other indirect method) can never end up in the same child, or we deadlock again there.

c) keep stuff in the main daemon but make sure smbd never contacts winbindd back on a PDC.

I listed them in the order I prefer them, the last one probably being the path least likely to succeed.
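
Option (a) amounts to a small dispatch rule, sketched below as a toy model in Python. This is not Samba code: the Domain class, the pick_backend function and the "local"/"remote" labels are all hypothetical, and is_dc merely stands in for the IS_DC test mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    """Hypothetical stand-in for winbindd's per-domain state."""
    name: str
    primary: bool  # True for the domain this server itself belongs to


def pick_backend(domain, is_dc):
    """Decide how a lookup for `domain` would be served under option (a).

    "local"  - answer inside winbindd from the local account database
    "remote" - make an RPC connection to a domain controller
    """
    # On a DC, the controller of the primary domain is this very machine.
    # An RPC call would therefore land on the local smbd, which may itself
    # be waiting on winbindd. Answering locally avoids that round trip.
    if is_dc and domain.primary:
        return "local"
    return "remote"


print(pick_backend(Domain("BOIDS", primary=True), is_dc=True))
print(pick_backend(Domain("TRUSTED", primary=False), is_dc=True))
```

In this model only the primary domain on a DC is served locally; trusted domains, and every domain on a member server, still go over RPC.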

Comment 7 Jeremy Allison 2008-05-29 10:38:58 UTC
*** Bug 5499 has been marked as a duplicate of this bug. ***
Comment 8 Jeremy Allison 2008-05-29 10:54:54 UTC
I'm sorry, but I think we have to revert 6e66512d5beb256a44c6703cdb8c7fa7e0fd8537; it simply breaks too much. If we get loops and deadlocks in the wild we have to analyse and fix them as detected. The current code simply breaks winbindd on a PDC completely.

I agree the correct fix is to make winbindd do internal enumeration of its primary domain when it's on a DC - I'll start coding this up. But in the meantime we have to revert this for 3.2 and 3.0.x.

Simo - do you have a list of, or any references to, bugs logged showing any loops? That's a good place to start examining problems.

Jeremy.
Comment 9 Simo Sorce 2008-05-29 11:53:29 UTC
Here is part of the history that led to this bugfix:
https://bugzilla.redhat.com/show_bug.cgi?id=429024

We need to fix this before reverting, or we just trade one issue for another.
Comment 10 Jeremy Allison 2008-05-29 12:31:40 UTC
This bug :

https://bugzilla.redhat.com/show_bug.cgi?id=429024

is a corner case. I agree we need to address it, but not at the cost of breaking all winbindd+PDC setups, which is what we have now.

Jeremy.
Comment 11 Alexander Bokovoy 2008-05-29 13:12:08 UTC
I'd keep this bug open to track progress on winbind+PDC. I also have reports on winbind+PDC against 3.0.28 where similar simple setups did not work.
Comment 12 Simo Sorce 2008-05-29 13:25:37 UTC
(In reply to comment #10)
> This bug :
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=429024
> 
> is a corner case. I agree we need to address it, but not at the cost of
> breaking all winbindd+PDC setups, which is what we have now.

I am not sure that non-working trusts are a corner case. Don't get fooled by the title of the bug: we discovered more breakage than just wbinfo -u.


Comment 13 Jeremy Allison 2008-05-29 13:28:25 UTC
Non-working trusts are all that is documented in the bug. If there is other breakage you need to document it.
Jeremy.
Comment 14 Simo Sorce 2008-05-29 13:45:20 UTC
No, I just wanted to point out to bystanders that they should read deep into the bug.
Comment 15 Jeremy Allison 2008-05-30 11:08:20 UTC
FYI: I'm working on creating a "sam" backend in winbindd which will be separate from the builtin one. I'll check in soon.
Jeremy.
Comment 16 Simo Sorce 2008-05-30 14:11:35 UTC
Thanks Jeremy, we tried to lobby for something like that for some time a few years ago. I am glad you decided to proceed this way; I think it is the approach most likely to succeed.
Comment 17 Alexander Bokovoy 2008-05-31 03:40:41 UTC
Tried 3-0-test with Jeremy's patches and it still doesn't work. Now 'wbinfo -u' properly shows users, but 'wbinfo -t' and ntlm_auth still don't work:

# ntlm_auth  --username=crew --password=crew
NT_STATUS_CANT_ACCESS_DOMAIN_INFO: NT_STATUS_CANT_ACCESS_DOMAIN_INFO (0xc00000da)

This is due to winbindd trying to do cm_connect_netlogon() to local winbind (which it successfully performs) and then failing get_trust_pw_hash() in winbindd_cm.c:2070:

        if (!get_trust_pw_hash(domain->name, mach_pwd, &account_name,
                               &sec_chan_type))
        {
                cli_rpc_pipe_close(netlogon_pipe);
                return NT_STATUS_CANT_ACCESS_DOMAIN_INFO;
        }

Unfortunately, I see nothing in the logs from get_trust_pw_hash() and get_trust_pw_clear().
Comment 18 Jeremy Allison 2008-05-31 12:21:40 UTC
I successfully made this work (wbinfo -t, wbinfo --authenticate=user%password) in the 3.0.x tree joined to a Samba PDC running on the same box.
Please give more details on what is failing for you. Have you joined winbindd to its own (PDC) domain? From the description of your error it looks like you haven't joined winbindd to the PDC.

Jeremy.
Comment 19 Alexander Bokovoy 2008-05-31 12:36:52 UTC
Jeremy, you're right. As I do clean tests but set them up manually, I forgot to join the PDC to its own domain, and that was the reason for the failure.

I think this bug is fixed now. This absolutely needs to be ported to 3.2/3.3, and please also consider a 3.0.30a release in the near future.

Thanks!
Comment 20 Jeremy Allison 2008-06-03 19:49:18 UTC
*** Bug 5518 has been marked as a duplicate of this bug. ***
Comment 21 Dmitry Vagin 2008-10-09 05:42:39 UTC
Is it fixed already? It prevents us from upgrading Samba. We still use 3.0.28a. When will it be fixed?
Comment 22 Karolin Seeger 2008-10-09 05:55:39 UTC
Dmitry, as you can see in comment #19, everything is fine with Jeremy's patches and a proper join. It should be fixed in 3.0.31 and higher.

Closing out bug report.
Please re-open if it's still an issue.
Comment 23 Dmitry Vagin 2008-10-10 05:54:58 UTC
It isn't fixed. I tried to install 3.2.3.
Comment 24 Karolin Seeger 2008-10-10 06:02:58 UTC
Re-opening bug report, changing product to 3.2.
Comment 25 Jeremy Allison 2008-10-10 12:12:52 UTC
Don't be so hasty to re-open. The reporter gave no details other than "it isn't fixed" :-). Remember Alexander had problems setting this up, so let's get some details on his config first. This bug is *most definitely* fixed. It's hard to set up, I'll grant that...
Jeremy.
Comment 26 Alexander Bokovoy 2008-10-10 18:47:52 UTC
This has been fixed in 3.0 for sure. I didn't check 3.2 in my environment though. I'll only be able to check it in three to five days as I'm travelling.
Comment 27 Dmitry Vagin 2008-10-13 08:10:09 UTC
I use apt-get for installation on my Debian Etch box. It works fine on 3.0.28a. When I upgrade it using apt-get upgrade, the authorization in squid is broken. The config:
[global]
   workgroup = XXX
   netbios name = XXX
   server string = %h server (Samba %v)
   wins support = yes
   dns proxy = no
   interfaces = eth3 lo
   hosts allow = 192.168.18. 127.
   log level = 5 auth:10 smb:10 idmap:10 rpc_parse:10
   log file = /var/log/samba/log.%m
   max log size = 1000
   syslog = 0
   panic action = /usr/share/samba/panic-action %d
   security = user
   encrypt passwords = true
   passdb backend = tdbsam
   obey pam restrictions = no
   invalid users = root
   admin users = ntadmin
   socket options = TCP_NODELAY

   domain master = yes
   local master = yes
   preferred master = yes
   os level = 255
   domain logons = yes
   logon path =
   logon drive = h:
   idmap uid = 500-10000000
   idmap gid = 500-10000000
   template shell = /bin/bash
   winbind use default domain = yes
   winbind enum users = yes
   winbind enum groups = yes
   password server = 192.168.18.127
   client schannel = no
   client use spnego = no
   server signing = auto

[homes]
   comment = Home Directories
   browseable = no
   writable = yes
   create mask = 0644
   directory mask = 0755

[netlogon]
   comment = Network Logon Service
   path = /usr/lib/samba/netlogon
   guest ok = yes
   writable = no
   share modes = no
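
Whether a host falls into the winbindd-on-PDC case discussed in this bug can be guessed from a config like the one above. The sketch below is a heuristic in Python, not an official check: the function names are made up, the parser ignores includes and line continuations, and treating "security = user" plus "domain logons = yes" plus "domain master = yes" as "this is a PDC" is an assumption (an unset "domain master" is counted as "no" here).

```python
def parse_smb_conf(text):
    """Parse smb.conf-style text into {section: {option: value}}.

    Simplified, hypothetical parser: no includes, no line continuations.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1].strip().lower(), {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            # Option names ignore case and whitespace, so normalise both.
            current["".join(key.lower().split())] = value.strip()
    return sections


def is_yes(value):
    """True for the spellings smb.conf accepts for an enabled boolean."""
    return value.strip().lower() in ("yes", "true", "1")


def looks_like_pdc(sections):
    """Heuristic only: the usual Samba 3 PDC settings are all present."""
    g = sections.get("global", {})
    return (
        g.get("security", "user").lower() == "user"
        and is_yes(g.get("domainlogons", "no"))
        and is_yes(g.get("domainmaster", "no"))
    )


# A few lines taken from the config in this comment.
sample = """
[global]
   workgroup = XXX
   security = user
domain master = yes
domain logons = yes
winbind use default domain = yes
"""

conf = parse_smb_conf(sample)
print(looks_like_pdc(conf))
```

A positive result only says the config resembles a PDC; as the earlier comments show, winbindd on such a host also has to be joined to its own domain before 'wbinfo -t' can succeed.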
Comment 28 Debian samba package maintainers (PUBLIC MAILING LIST) 2009-04-04 07:51:07 UTC
Re-reading this bug report just makes me think that...it should be closed

--
Christian Perrier
Comment 29 Dmitry Vagin 2009-05-08 04:34:43 UTC
Upgraded to 3.3.4. It works fine now.
Comment 30 Karolin Seeger 2009-05-08 07:06:35 UTC
Closing out bug report.

Thanks for reporting!