Bug 15495 - Domain Join Fails With Samba to Domain Running in Server 2022 Insider Preview Build (25951)
Status: RESOLVED INVALID
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Winbind
Version: 4.19.1
Hardware: All Linux
Importance: P5 major
Target Milestone: ---
Assignee: Andrew Bartlett
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-10-13 07:49 UTC by Akshatha Baliga
Modified: 2024-02-29 20:31 UTC
CC List: 3 users

See Also:


Attachments
Domain Join Logs (64.13 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-10-13 07:49 UTC, Akshatha Baliga
Samba domain join logs (plain text) (133.01 KB, text/plain)
2023-10-17 09:03 UTC, Akshatha Baliga
ldbsearch Domain Name Logs (222.69 KB, text/plain)
2023-10-17 09:03 UTC, Akshatha Baliga
ldbsearch Domain Name Logs (227.36 KB, text/plain)
2023-10-17 10:38 UTC, Akshatha Baliga

Description Akshatha Baliga 2023-10-13 07:49:13 UTC
Created attachment 18157
Domain Join Logs

Microsoft recently published Windows Server Insider Preview (Build 25951), which includes improvements to Active Directory Domain Services (AD DS) and Active Directory Lightweight Directory Services (AD LDS).

When we try to join a Linux machine running Samba to a domain hosted on the Server 2022 preview build, the domain join fails with the following errors:
 
gensec_gse_unwrap: GSS UnWrap failed:  A token was invalid: unknown mech-code 0 for mech 1 2 840 113554 1 2 2
Failed while searching for: <WKGUID=AA312825768811D1ADED00C04FD8D5CD,dc=DC2022,dc=TEST>
libnet_DomainJoin: Failed to pre-create account in OU cn=Computers,dc=DC2022,dc=TEST: Time limit exceeded
 
The detailed log is attached for your reference. The versions used for the test were:
Linux Distro: Ubuntu 20.04.4 LTS
Samba: 4.13.17-Ubuntu (Also tested after upgrading to 4.19.1 which resulted in same error)
Command used to join to domain: net ads join -d 10 --no-dns-updates -U <user>%<pass> -k
smb.conf is in the attached log.

Could you please get a fix for this issue?
Comment 1 Andrew Bartlett 2023-10-17 02:37:43 UTC
Thanks for the heads up about the pre-release Windows changes.

Please provide any logs in plain text format.

Are you able to use ldbsearch to search this domain with --use-kerberos set?
Comment 2 Akshatha Baliga 2023-10-17 09:03:20 UTC
Created attachment 18162
Samba domain join logs (plain text)
Comment 3 Akshatha Baliga 2023-10-17 09:03:55 UTC
Created attachment 18163
ldbsearch Domain Name Logs
Comment 4 Akshatha Baliga 2023-10-17 10:38:02 UTC
Created attachment 18164
ldbsearch Domain Name Logs
Comment 5 Akshatha Baliga 2023-10-17 10:41:40 UTC
Thank you, Andrew, for taking a look at this issue. I have re-attached the domain join logs in plain-text format and have also attached the ldbsearch output. With kerberos=No the LDAP search works fine, but with kerberos=Yes it fails with NT_STATUS_INVALID_PARAMETER (more in the attachment).

Kindly check and let me know if any further information is needed.
Comment 6 Andrew Bartlett 2023-10-17 18:54:49 UTC
I don't actually need to see the contents of your domain, just if the ldbsearch works or not.

But it must be run with --use-kerberos=required (sorry, I forgot to look up the syntax), which means specifying the full DNS name in the -H parameter.

Run both:

ldbsearch -H ldap://dns.name.of.dc -Uadministrator --use-kerberos=required --client-protection=encrypt -s base

ldbsearch -H ldap://dns.name.of.dc -Uadministrator --use-kerberos=required --client-protection=sign -s base

to get us that useful info

What I'm trying to show is the difference between the OpenLDAP client libs used in the libads code and our internal LDAP client in libcli/ldap, because if there is a difference, that makes it easier for us to fix these things.
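
The two runs requested above differ only in the --client-protection value, so they can be generated from one template. The following is a minimal sketch that only builds the argument lists and does not execute anything; the helper name ldbsearch_argv and the host name dc1.example.test are hypothetical, while the options themselves are the ones quoted in this comment.

```python
import shlex


def ldbsearch_argv(dc_dns_name, protection, user="administrator"):
    """Build the ldbsearch argument list for one protection level.

    dc_dns_name must be the registered DNS name of a specific DC,
    not the domain name, because Kerberos is required.
    """
    if protection not in ("sign", "encrypt"):
        raise ValueError("protection must be 'sign' or 'encrypt'")
    return [
        "ldbsearch",
        "-H", f"ldap://{dc_dns_name}",
        f"-U{user}",
        "--use-kerberos=required",
        f"--client-protection={protection}",
        "-s", "base",
    ]


# Print both command lines so they can be pasted into a shell.
for level in ("encrypt", "sign"):
    print(shlex.join(ldbsearch_argv("dc1.example.test", level)))
```

Running both variants against the same DC keeps everything except the protection level constant, which is what makes any difference between the two results meaningful.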

I'm not likely to be the person working to fix this long-term, but I wanted to kickstart the process by getting some detail.
Comment 7 Akshatha Baliga 2023-10-18 08:49:25 UTC
Hi Andrew

ldbsearch with --kerberos=Yes fails, whereas --kerberos=No works. Also, I don't see a client-protection option in the ldbsearch tool, so I tried passing the --signing and --encrypt options instead; both work with --kerberos=No and fail with --kerberos=Yes.

root@host:~# ldbsearch -H ldap://<DC's DNS Name> -b '<domain>' -U <user> --kerberos=Yes --signing=on
Password for [<domain>\<user>]:
Failed to bind - LDAP client internal error: NT_STATUS_INVALID_PARAMETER
Failed to connect to 'ldap://<DC>' with backend 'ldap': LDAP client internal error: NT_STATUS_INVALID_PARAMETER
Failed to connect to ldap://<DC> - LDAP client internal error: NT_STATUS_INVALID_PARAMETER

root@host:~# ldbsearch -H ldap://<DC's DNS Name> -b '<domain>' -U <user> --kerberos=Yes --encrypt
Password for [<domain>\<user>]:
Failed to bind - LDAP client internal error: NT_STATUS_INVALID_PARAMETER
Failed to connect to 'ldap://<DC>' with backend 'ldap': LDAP client internal error: NT_STATUS_INVALID_PARAMETER
Failed to connect to ldap://<DC> - LDAP client internal error: NT_STATUS_INVALID_PARAMETER
root@host:~#

root@host:~# ldbsearch -H ldap://<DC's DNS Name> -b '<domain>' -U <user> --kerberos=No
<Dumps all the search results successfully>
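
The commands in this comment use older option spellings than the ones requested in comment 6, which would explain why --client-protection was not found. A rough sketch of how the two spellings might line up is shown below. The correspondence is an assumption drawn from the options quoted in this thread, not from a Samba reference, and the function name is hypothetical.

```python
# Assumed correspondence between the option spellings used in this
# comment (left) and those requested in comment 6 (right).
ASSUMED_EQUIVALENTS = {
    "--kerberos=yes": "--use-kerberos=required",
    "--kerberos=no": "--use-kerberos=off",
    "--signing=on": "--client-protection=sign",
    "--encrypt": "--client-protection=encrypt",
}


def modernize_ldbsearch_options(args):
    """Return a copy of args with older option spellings replaced.

    Matching is case-insensitive; unknown arguments pass through unchanged.
    """
    return [ASSUMED_EQUIVALENTS.get(arg.lower(), arg) for arg in args]


old = ["ldbsearch", "-H", "ldap://dc1.example.test", "--kerberos=Yes", "--signing=on"]
print(modernize_ldbsearch_options(old))
```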
Comment 8 Akshatha Baliga 2023-10-18 12:39:51 UTC
I wanted to share this info in case it helps with debugging. I see that the Time Limit Exceeded error comes from the fix added in https://attachments.samba.org/attachment.cgi?id=4075

The OpenLDAP API ldap_search_ext_s returns -1, and the if condition in the snippet above overwrites it with LDAP_TIMELIMIT_EXCEEDED.
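
To illustrate why the join log reports a timeout when the search actually failed for another reason, here is a small sketch of the masking effect. The exact condition in the referenced patch is not reproduced here; the remap rule below is an assumption based only on the description in this comment, and both function names are hypothetical. The numeric values are the OpenLDAP result codes.

```python
LDAP_SUCCESS = 0x00
LDAP_TIMELIMIT_EXCEEDED = 0x03
LDAP_SERVER_DOWN = -1  # OpenLDAP API-level error, not an LDAP protocol result


def search_with_remap(raw_result):
    """Assumed behaviour: a raw -1 is reported as a time-limit error."""
    if raw_result == LDAP_SERVER_DOWN:
        return LDAP_TIMELIMIT_EXCEEDED
    return raw_result


def search_keep_cause(raw_result):
    """Variant that returns (reported, original) so logs keep the real cause."""
    return search_with_remap(raw_result), raw_result


reported, original = search_keep_cause(LDAP_SERVER_DOWN)
print(f"reported={reported} original={original}")
```

Under this assumption, "Time limit exceeded" in the join output is a secondary symptom, and the GSS UnWrap failure logged before it is the error worth chasing.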
Comment 9 Andrew Bartlett 2023-10-19 19:33:27 UTC
Thanks.  That detail brings this into my area of responsibility (I work on the AD DC and the associated tooling, including ldbsearch), so I'll spend some time on this soon.
Comment 10 Andrew Bartlett 2023-10-19 19:34:47 UTC
(In reply to Akshatha Baliga from comment #7)
Can you confirm that the DC name you specify in the -H parameter is a registered DNS name of a specific DC (not the domain name)?
Comment 11 Akshatha Baliga 2023-10-20 09:06:50 UTC
Actually, the DC I was using had a different registered DNS name. Sorry for the confusion. I tested this again with the correct name, and ldbsearch now gives the same error (unknown mech-code) that we get during net ads join, as follows:


root@<host>:/# ldbsearch -H ldap://<DC DNS Name> -b 'DC=DC2022,DC=TEST' -U <user> --kerberos=Yes --signing=on -d 10
<snipped>
 subreq: 0x56052d68b9b0
gensec_update_send: spnego[0x56052d686ab0]: subreq: 0x56052d68de30
gensec_update_done: gssapi_krb5[0x56052d67a330]: NT_STATUS_MORE_PROCESSING_REQUIRED tevent_req[0x56052d68b9b0/../../source4/auth/gensec/gensec_gssapi.c:1056]: state[2] error[0 (0x0)]  state[struct gensec_gssapi_update_state (0x56052d68bb60)] timer[(nil)] finish[../../source4/auth/gensec/gensec_gssapi.c:1067]
gensec_update_done: spnego[0x56052d686ab0]: NT_STATUS_MORE_PROCESSING_REQUIRED tevent_req[0x56052d68de30/../../auth/gensec/spnego.c:1631]: state[2] error[0 (0x0)]  state[struct gensec_spnego_update_state (0x56052d68dfe0)] timer[(nil)] finish[../../auth/gensec/spnego.c:2116]
gensec_gssapi: NO credentials were delegated
GSSAPI Connection will be cryptographically signed
gensec_update_send: gssapi_krb5[0x56052d67a330]: subreq: 0x56052d68b9b0
gensec_update_send: spnego[0x56052d686ab0]: subreq: 0x56052d68de30
gensec_update_done: gssapi_krb5[0x56052d67a330]: NT_STATUS_OK tevent_req[0x56052d68b9b0/../../source4/auth/gensec/gensec_gssapi.c:1056]: state[2] error[0 (0x0)]  state[struct gensec_gssapi_update_state (0x56052d68bb60)] timer[(nil)] finish[../../source4/auth/gensec/gensec_gssapi.c:1074]
gensec_update_done: spnego[0x56052d686ab0]: NT_STATUS_OK tevent_req[0x56052d68de30/../../auth/gensec/spnego.c:1631]: state[2] error[0 (0x0)]  state[struct gensec_spnego_update_state (0x56052d68dfe0)] timer[(nil)] finish[../../auth/gensec/spnego.c:2116]
gensec_gssapi_unwrap: GSS UnWrap failed:  A token was invalid: unknown mech-code 0 for mech 1 2 840 113554 1 2 2
search failed - connection to remote LDAP server dropped?
root@<host>:/#
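
When comparing the ldbsearch and net ads join logs, it can help to pull the failure line apart mechanically. The sketch below is one way to do that, assuming the line always has the shape seen in this log; the function and field names are hypothetical. The space-separated OID 1 2 840 113554 1 2 2 is the GSS-API Kerberos V5 mechanism (1.2.840.113554.1.2.2).

```python
import re

# Well-known GSS-API mechanism OIDs in dotted form.
KNOWN_MECHS = {
    "1.2.840.113554.1.2.2": "Kerberos V5",
    "1.3.6.1.5.5.2": "SPNEGO",
}

UNWRAP_FAILURE = re.compile(
    r"GSS UnWrap failed:\s+(?P<status_text>.+?): "
    r"unknown mech-code (?P<mech_code>\d+) for mech (?P<oid>[\d ]+)"
)


def parse_unwrap_failure(line):
    """Return the parts of a 'GSS UnWrap failed' log line, or None."""
    match = UNWRAP_FAILURE.search(line)
    if match is None:
        return None
    oid = ".".join(match.group("oid").split())
    return {
        "status_text": match.group("status_text"),
        "mech_code": int(match.group("mech_code")),
        "oid": oid,
        "mech_name": KNOWN_MECHS.get(oid, "unknown"),
    }


sample = ("gensec_gssapi_unwrap: GSS UnWrap failed:  A token was invalid: "
          "unknown mech-code 0 for mech 1 2 840 113554 1 2 2")
print(parse_unwrap_failure(sample))
```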
Comment 12 Akshatha Baliga 2023-10-25 12:11:02 UTC
Hello Andrew

Do you have any opinion about the above issue? Any tips or workaround to get the domain join to work on Ubuntu with Kerberos?

Thank you in advance
Akshatha
Comment 13 Andrew Bartlett 2023-10-25 21:13:40 UTC
(In reply to Akshatha Baliga from comment #12)
Thanks for asking.

From here, as far as I can see, this will require developer time, starting by reproducing the behaviour locally.  

We should work with Microsoft to see if we can get the server behaviour change reverted (as it appears, from your notes, to impact existing deployed clients), and regardless we should work with them to understand exactly what changed and why (it may be part of tightening up the security posture).

I don't have any particular suggestions about any workarounds, except to ensure you are not using arcfour-hmac-md5 in any way.
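
One quick way to check for explicit arcfour-hmac-md5 use on the client is to scan the Kerberos configuration text for it. The sketch below is a rough check under stated assumptions: it looks only at the enctype settings default_tkt_enctypes, default_tgs_enctypes and permitted_enctypes, treats arcfour-hmac and rc4-hmac as aliases of arcfour-hmac-md5, and says nothing about library defaults or server-side settings. The function name is hypothetical.

```python
import re

RC4_NAMES = ("arcfour-hmac-md5", "arcfour-hmac", "rc4-hmac")
ENCTYPE_KEYS = ("default_tkt_enctypes", "default_tgs_enctypes", "permitted_enctypes")


def rc4_enctype_lines(krb5_conf_text):
    """Return (line_number, line) pairs where an enctype setting lists RC4."""
    hits = []
    for number, line in enumerate(krb5_conf_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip comments and anything that is not a key = value setting.
        if stripped.startswith(("#", ";")) or "=" not in stripped:
            continue
        key, value = (part.strip() for part in stripped.split("=", 1))
        if key not in ENCTYPE_KEYS:
            continue
        names = re.split(r"[\s,]+", value.lower())
        if any(name in RC4_NAMES for name in names):
            hits.append((number, stripped))
    return hits


sample = """[libdefaults]
  permitted_enctypes = aes256-cts-hmac-sha1-96 arcfour-hmac-md5
  default_realm = DC2022.TEST
"""
for number, line in rc4_enctype_lines(sample):
    print(f"line {number}: {line}")
```

An empty result only means RC4 is not requested explicitly in the scanned text.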

Sorry I don't have better news.

Andrew Bartlett
Comment 14 Stefan Metzmacher 2023-11-14 09:36:28 UTC
The name for 10.5.139.71 is WIN-841PV4HEK0O.dc2022.test. Why are you using
oak-vcs1288.dc2022.test as a name instead? It's clear that
ldap/oak-vcs1288.dc2022.test@DC2022.TEST is not a valid service...

That's why https://bugzilla.samba.org/attachment.cgi?id=18164 doesn't show the
same problem as https://bugzilla.samba.org/attachment.cgi?id=18162
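
The reasoning here is that the Kerberos service name is derived from the host given to -H, so a host name that is not the DC's registered name yields a principal the KDC does not know. A minimal sketch of that derivation follows; the helper names are hypothetical and the check is a plain string comparison, not a DNS or KDC lookup.

```python
from urllib.parse import urlsplit


def ldap_service_principal(ldap_url, realm):
    """Derive the ldap/<host>@<REALM> principal implied by an ldap:// URL."""
    host = urlsplit(ldap_url).hostname
    if not host:
        raise ValueError(f"no host in URL: {ldap_url!r}")
    return f"ldap/{host}@{realm.upper()}"


def targets_registered_name(ldap_url, registered_dns_name):
    """True if the URL host equals the DC's registered DNS name (case-insensitive)."""
    host = urlsplit(ldap_url).hostname or ""
    return host.lower() == registered_dns_name.lower().rstrip(".")


url = "ldap://oak-vcs1288.dc2022.test"
print(ldap_service_principal(url, "DC2022.TEST"))
print(targets_registered_name(url, "WIN-841PV4HEK0O.dc2022.test"))
```

With the names from this comment, the derived principal is the invalid one quoted above and the comparison against the DC's registered name fails, which matches the difference between the two attachments.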

Along with the net ads join -d100 and/or ldbsearch -d100 outputs, we need a network capture (ideally together with a keytab that contains every password (hash) of the whole domain, generated by 'net rpc vampire keytab -I 10.5.139.71 -Uadministrator /path/to/keytab'). NOTE: this must be a test-only domain without any confidential passwords in use!
Comment 15 Andrew Bartlett 2024-02-29 20:31:11 UTC
My understanding is that Microsoft did not keep the changed behaviour in later pre-releases, so this problem has gone away.