Bug 15870 - intermittent winbind authentication failures: suspected incorrect nonce bitmask in get_cred.c
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Winbind
Version: 4.19.5
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
Reported: 2025-06-13 06:54 UTC by James Dingwall
Modified: 2025-08-15 09:17 UTC
CC List: 3 users

Description James Dingwall 2025-06-13 06:54:37 UTC
This is a bug report relating to the following thread on samba@lists.samba.org: https://lists.samba.org/archive/samba/2025-June/251595.html.  The client is Ubuntu 24.04 with default Samba 4.19.5 packages joined to a Windows 2012 domain with two DCs.

The initial message reported in the winbind log was:

    ads_krb5_mk_req: smb_krb5_get_credentials failed for HOST$@DOMAIN.LOCAL (Preauthentication failed)

After tracing through the code it was found that the request would fail whenever the random nonce in third_party/heimdal/lib/krb5/get_in_tkt.c had all of its highest 8 bits set, i.e. `(nonce & 0xff000000) == 0xff000000` is true.  To confirm that, the following change was made:

diff --git a/third_party/heimdal/lib/krb5/get_cred.c b/third_party/heimdal/lib/krb5/get_cred.c
index 6e48846bcb3..1317f33515d 100644
--- a/third_party/heimdal/lib/krb5/get_cred.c
+++ b/third_party/heimdal/lib/krb5/get_cred.c
@@ -551,6 +551,7 @@ get_cred_kdc(krb5_context context,
 
     krb5_generate_random_block(&nonce, sizeof(nonce));
     nonce &= 0xffffffff;
+    nonce |= 0x7fffffff;
 
     if(flags.b.enc_tkt_in_skey && second_ticket == NULL){
        ret = decode_Ticket(in_creds->second_ticket.data,

With this change no authentication was possible via winbind: the error was reported on every attempt.

Checking the code base, the `nonce &= 0xffffffff;` bit mask appears unusual; in most other places it is 0x7fffffff (assuming the e vs f is for the purposes of the test script):

$ git grep nonce | grep ffffff
python/samba/tests/krb5/as_canonicalization_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/as_canonicalization_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/compatability_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/compatability_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/kdc_base_test.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/kdc_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/raw_testcase.py:        nonce_max = 0x7fffffff
python/samba/tests/krb5/s4u_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/s4u_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/s4u_tests.py:                                  nonce=0x7ffffffe,
python/samba/tests/krb5/simple_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/simple_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/simple_tests.py:                                  nonce=0x7ffffffe,
python/samba/tests/krb5/xrealm_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/xrealm_tests.py:                                 nonce=0x7fffffff,
python/samba/tests/krb5/xrealm_tests.py:                                  nonce=0x7ffffffe,
third_party/heimdal/lib/krb5/get_cred.c:    nonce &= 0xffffffff;
third_party/heimdal/lib/krb5/get_in_tkt.c:    nonce &= 0xffffffff;
third_party/heimdal/lib/krb5/init_creds_pw.c:    ctx->nonce &= 0x7fffffff;
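
For illustration, here is a minimal sketch of what the two masks in the grep output do to a value. It assumes the nonce is held in a 32-bit unsigned variable (modelled here as `uint32_t`); the helper names are hypothetical and are not part of Heimdal or Samba.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the mask used in get_cred.c and get_in_tkt.c.
 * On a 32-bit value every bit is kept, so the mask changes nothing. */
static uint32_t mask_all_32(uint32_t nonce)
{
    return nonce & 0xffffffffu;
}

/* Hypothetical helper: the mask used in init_creds_pw.c.
 * Bit 31 is cleared, so the result also fits a signed 32-bit integer. */
static uint32_t mask_low_31(uint32_t nonce)
{
    return nonce & 0x7fffffffu;
}

/* Small self-check of the two properties described above. */
static void check_masks(void)
{
    assert(mask_all_32(0xff123456u) == 0xff123456u); /* unchanged */
    assert(mask_low_31(0xff123456u) == 0x7f123456u); /* top bit cleared */
    assert((int32_t)mask_low_31(0xffffffffu) >= 0);  /* never negative */
}
```

This only shows the arithmetic; whether the top bit is what the Windows DCs object to is still the open question of this report.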


Is it possible this change should be made (and possibly also relevant in get_in_tkt.c)?

diff --git a/third_party/heimdal/lib/krb5/get_cred.c b/third_party/heimdal/lib/krb5/get_cred.c
index 6e48846bcb3..81c1c42e1b1 100644
--- a/third_party/heimdal/lib/krb5/get_cred.c
+++ b/third_party/heimdal/lib/krb5/get_cred.c
@@ -550,7 +550,7 @@ get_cred_kdc(krb5_context context,
     padata.len = 0;
 
     krb5_generate_random_block(&nonce, sizeof(nonce));
-    nonce &= 0xffffffff;
+    nonce &= 0x7fffffff;
 
     if(flags.b.enc_tkt_in_skey && second_ticket == NULL){
        ret = decode_Ticket(in_creds->second_ticket.data,


If this should be filed upstream at https://github.com/heimdal/heimdal I can do that.
Comment 1 James Dingwall 2025-06-13 07:25:23 UTC
I did note that in third_party/heimdal/lib/asn1/pkinit.asn1 a nonce is signed in some cases and unsigned in others, but by this point I've got a bit lost trying to match what is happening against the documentation.  As RP reported being unable to reproduce this, perhaps there is a bug in an external library, or something else about the Ubuntu build or our environment is wrong.
Comment 2 James Dingwall 2025-06-13 13:58:29 UTC
The original test patch to prove the issue was wrong; it should have looked like:

diff --git a/third_party/heimdal/lib/krb5/get_cred.c b/third_party/heimdal/lib/krb5/get_cred.c
index 6e48846bcb3..1317f33515d 100644
--- a/third_party/heimdal/lib/krb5/get_cred.c
+++ b/third_party/heimdal/lib/krb5/get_cred.c
@@ -551,6 +551,7 @@ get_cred_kdc(krb5_context context,
 
     krb5_generate_random_block(&nonce, sizeof(nonce));
     nonce &= 0xffffffff;
+    nonce |= 0xff000000;
 
     if(flags.b.enc_tkt_in_skey && second_ticket == NULL){
        ret = decode_Ticket(in_creds->second_ticket.data,
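
As a rough sketch of what this corrected test patch forces, the snippet below spells out the suspected failure condition and the effect of the `|= 0xff000000` line. It assumes a 32-bit unsigned nonce (`uint32_t`); all function names are hypothetical and exist only for this illustration. Note that the condition needs parentheses in C, because `==` binds more tightly than `&`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when all of the highest 8 bits are set,
 * i.e. the condition suspected of triggering the failure. */
static bool top_byte_all_set(uint32_t nonce)
{
    return (nonce & 0xff000000u) == 0xff000000u;
}

/* Hypothetical helper: what the corrected test patch does to the nonce. */
static uint32_t force_top_byte(uint32_t nonce)
{
    return nonce | 0xff000000u;
}

/* Hypothetical helper: the mask from the proposed change. */
static uint32_t clear_top_bit(uint32_t nonce)
{
    return nonce & 0x7fffffffu;
}

/* Self-check: forcing always meets the condition, masking never does. */
static void check_condition(void)
{
    assert(top_byte_all_set(force_top_byte(0x00000000u)));
    assert(top_byte_all_set(force_top_byte(0x12345678u)));
    assert(!top_byte_all_set(clear_top_bit(0xffffffffu)));
}
```

With the test patch every nonce meets the condition, while a nonce masked with 0x7fffffff can never meet it because bit 31 is always clear.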
Comment 3 Rowland Penny 2025-06-14 07:58:19 UTC
(In reply to James Dingwall from comment #1)
No, I didn't say I couldn't reproduce it. I said that it didn't happen on my correctly set up Unix domain member, where the default domain '*' uses tdb and the 'DOMAIN' domain uses rid, unlike your setup, which uses 'autorid' (with 'ignore builtin = yes') for the default domain and 'rid' for the 'DOMAIN' domain.
Comment 4 Rowland Penny 2025-06-15 09:39:36 UTC
(In reply to Rowland Penny from comment #3)
I can now say that I cannot reproduce this error: using the bug reporter's smb.conf and a script based on his, it just works, potentially forever. The big difference is that I used the latest Samba against Samba AD DCs.
Comment 5 David Fillingham 2025-08-10 10:06:36 UTC
I am experiencing this issue as well.
Linux domain member (Arch), Samba and winbind version 4.22.3, connected to a Windows domain.

Testing with Rowland Penny's script from the mailing list thread results in it dying off around here:

```
TRY: 57
Sun 10 Aug 2025 19:59:48 AEST
[sudo] password for david: sudo success
TRY: 58
Sun 10 Aug 2025 19:59:49 AEST
[sudo] password for david: sudo success
TRY: 59
Sun 10 Aug 2025 19:59:50 AEST
[sudo] password for david: Sorry, try again.
[sudo] password for david:
sudo: no password was provided
sudo: 1 incorrect password attempt
```
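
As a back-of-the-envelope check of how often such a loop would be expected to fail, the sketch below computes the chance of at least one hit in a given number of tries. It rests on assumptions that are not confirmed in this report: that the failure is triggered exactly when the highest 8 bits of a uniformly random 32-bit nonce are all set (1 in 256), and that each attempt draws one such nonce independently. The function name is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: probability that at least one of `tries`
 * independent, uniformly random 32-bit nonces has all of its highest
 * 8 bits set. Each nonce hits with probability 1/256 (assumption). */
static double prob_at_least_one_hit(unsigned tries)
{
    double p_no_hit = 1.0;
    for (unsigned i = 0; i < tries; i++)
        p_no_hit *= 255.0 / 256.0; /* this try misses the condition */
    return 1.0 - p_no_hit;
}

/* Self-check of the two trivial cases. */
static void check_probability(void)
{
    assert(prob_at_least_one_hit(0) == 0.0);
    assert(prob_at_least_one_hit(1) == 1.0 / 256.0);
}
```

Under those assumptions, `prob_at_least_one_hit(59)` evaluates to about 0.21, so a first failure around try 59 would not be surprising, though it does not by itself confirm the hypothesis.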

My smb.conf is this:

```
[global]
    workgroup = FILLINGHAM
    security = ADS
    realm = AD.FILLINGHAM.AU

    winbind refresh tickets = Yes
    vfs objects = acl_xattr
    map acl inherit = Yes
    store dos attributes = Yes

    # Allow a single, unified keytab to store obtained Kerberos tickets
    dedicated keytab file = /etc/krb5.keytab
    kerberos method = secrets and keytab

    # Do not require that login usernames include the default domain
    winbind use default domain = yes
    winbind offline logon = yes

    # Default ID mapping configuration for local BUILTIN accounts
    # and groups on a domain member. The default (*) domain:
    # - must not overlap with any domain ID mapping configuration!
    # - must use a read-write-enabled back end, such as tdb.
    idmap config * : backend = tdb
    idmap config * : range = 10000-49999

    # Domain mappings
    idmap config FILLINGHAM : backend = rid
    idmap config FILLINGHAM : range = 65536-99999

    template shell = /bin/bash
    template homedir = /home/%U
```

When running winbindd with -d 3, I am able to see these lines when the failure occurs:

```
Aug 10 19:59:50 bigchungus winbindd[6256]: winbindd_interface_version: [PAM_WINBIND[sudo] (6799)]: request interface version (version = 33)
Aug 10 19:59:50 bigchungus winbindd[6256]: process_request_send: [PAM_WINBIND[sudo] (6799)] Handling async request: PAM_AUTH
Aug 10 19:59:50 bigchungus winbindd[6256]: [PAM_WINBIND[sudo] (6799)] Winbind external command PAM_AUTH start.
Aug 10 19:59:50 bigchungus winbindd[6256]: Authenticating user 'david'.
Aug 10 19:59:50 bigchungus winbindd[6261]: [6799]: dual pam auth FILLINGHAM\david
Aug 10 19:59:50 bigchungus winbindd[6261]: ads_krb5_mk_req: smb_krb5_get_credentials failed for BIGCHUNGUS$@AD.FILLINGHAM.AU (Preauthentication failed)
Aug 10 19:59:50 bigchungus winbindd[6261]: failed to get ticket for BIGCHUNGUS$@AD.FILLINGHAM.AU: Preauthentication failed
Aug 10 19:59:50 bigchungus winbindd[6261]: _wbint_PamAuth: Plain-text authentication for user FILLINGHAM\david returned NT_STATUS_LOGON_FAILURE (PAM: 7)
Aug 10 19:59:50 bigchungus winbindd[6261]: Auth: [winbind,PAM_AUTH, PAM_WINBIND[sudo], 6799] user [FILLINGHAM]\[david] at [Sun, 10 Aug 2025 19:59:50.861090 AEST] with [Plaintext] status [NT_STATUS_LOGON_FAILURE] workstation [(null)] remote host [unix:] mapped to [(null)]\[(null)]. local host [unix:]
Aug 10 19:59:50 bigchungus winbindd[6261]: {"timestamp": "2025-08-10T19:59:50.861114+1000", "type": "Authentication", "Authentication": {"version": {"major": 1, "minor": 3}, "eventId": 4625, "logonId": "1e3f9b664dc40386", "logonType": 8, "status": "NT_STATUS_LOGON_FAILURE", "localAddress": "unix:", "remoteAddress": "unix:", "serviceDescription": "winbind", "authDescription": "PAM_AUTH, PAM_WINBIND[sudo], 6799", "clientDomain": "FILLINGHAM", "clientAccount": "david", "workstation": null, "becameAccount": "", "becameDomain": "", "becameSid": null, "mappedAccount": null, "mappedDomain": null, "netlogonComputer": null, "netlogonTrustAccount": null, "netlogonNegotiateFlags": "0x00000000", "netlogonSecureChannelType": 0, "netlogonTrustAccountSid": null, "passwordType": "Plaintext", "clientPolicyAccessCheck": null, "serverPolicyAccessCheck": null, "duration": 25903}}
```
Comment 6 Rowland Penny 2025-08-15 09:17:19 UTC
(In reply to David Fillingham from comment #5)

I cannot get the script to fail against Samba AD DCs. I tried on Debian and Rocky Linux 9 and both worked. I then set up an Arch Linux domain member and that works for me as well.

What I did notice after comparing my log file with David's is that his seems to be using 'dual pam auth' and Kerberos when it fails, whereas, while 'dual pam auth' is in my logs, it doesn't seem to use Kerberos.

I wonder if this is a PAM problem?

Failing that, it would seem to be an interaction between Samba and Windows DCs.