Bug 14674 - net ads info shows LDAP Server: 0.0.0.0 depending on contacted server
Summary: net ads info shows LDAP Server: 0.0.0.0 depending on contacted server
Status: ASSIGNED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Tools (show other bugs)
Version: 4.14.0
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Jule Anger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-03-18 18:31 UTC by Peter Eriksson
Modified: 2022-05-24 18:04 UTC (History)
3 users (show)

See Also:


Attachments
Output from "net ads info" (69.26 KB, text/plain)
2021-03-18 18:41 UTC, Peter Eriksson
no flags Details
Debug printf output from a run that is OK (4.25 KB, text/plain)
2021-03-18 22:28 UTC, Peter Eriksson
no flags Details
Debug printf output from a run that fails (4.85 KB, text/plain)
2021-03-18 22:29 UTC, Peter Eriksson
no flags Details
A more detailed printf-debug failed output (8.45 KB, text/plain)
2021-03-19 23:20 UTC, Peter Eriksson
no flags Details
Untested "raw" patch. (516 bytes, patch)
2022-01-28 18:31 UTC, Jeremy Allison
no flags Details
Untested "raw" patch that actually compiles :-). (518 bytes, patch)
2022-01-28 18:38 UTC, Jeremy Allison
no flags Details
git-am fix for 4.16.rcNext, 4.15.next, 4.14.next (3.18 KB, patch)
2022-02-08 22:21 UTC, Jeremy Allison
slow: review+
Details
Patch for 4.14 next, 4.15 next and 4.16 rc (2.97 KB, patch)
2022-05-24 09:21 UTC, Samuel Cabrero
jra: review+
slow: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Peter Eriksson 2021-03-18 18:31:12 UTC
Our (Microsoft AD) has 6 AD DC servers, lkpdc1, lkpdc2, lkpdc3-2019, lkpdc4-2019, nkpdc1-2019 & nkpdc2-2019.

Using "net ads info" and contacting lkpdc1 gives expected results all the time:

# net ads info -S lkpdc1.ad.liu.se
LDAP server: 130.236.30.35
LDAP server name: lkpdc1.ad.liu.se
Realm: AD.LIU.SE
Bind Path: dc=AD,dc=LIU,dc=SE
LDAP port: 389
Server time: tors, 18 mars 2021 19:26:25 CET
KDC server: 130.236.30.35
Server time offset: 0
Last machine account password change: tors, 18 mars 2021 17:01:28 CET


Contacting any of the others give the right result the first time, and then strange output the next times:

# net ads info -S lkpdc4-2019.ad.liu.se
LDAP server: 130.236.30.35
LDAP server name: lkpdc1.ad.liu.se
Realm: AD.LIU.SE
Bind Path: dc=AD,dc=LIU,dc=SE
LDAP port: 389
Server time: tors, 18 mars 2021 19:28:07 CET
KDC server: 130.236.30.34
Server time offset: 0
Last machine account password change: tors, 18 mars 2021 17:01:28 CET

# net ads info -S lkpdc4-2019.ad.liu.se
LDAP server: 0.0.0.0
LDAP server name: (null)
Realm: (null)
Bind Path: (null)
LDAP port: 0
Server time: tors, 18 mars 2021 19:28:10 CET
KDC server: 130.236.30.34
Server time offset: 0
Last machine account password change: tors, 01 jan. 1970 01:00:00 CET


Most things _seem_ to work anyway but this is still a bit annoying...
Comment 1 Jeremy Allison 2021-03-18 18:38:11 UTC
Can you get me a debug level 10 client trace log of this ?

Thanks.
Comment 2 Peter Eriksson 2021-03-18 18:41:03 UTC
Created attachment 16558 [details]
Output from "net ads info"

Here's the output from "net ads info -d10 -S lkpdc4-2019.ad.liu.se".
Comment 3 Peter Eriksson 2021-03-18 18:48:11 UTC
Don't know if it's related or not but while (so far fruitless) trying to trace down this issue, I noticed that:

_tsocket_address_inet_from_string() get's called with an empty "addr" argument that doesn't look quite right. The code looks for addr being NULL and then assigns it to "127.0.0.1" or "::1" so I tried adding an "if (addr && !*addr) addr=NULL;" but it doesn't make any difference.

Below are my added printf/debug outputs (130.236.30.34 = lkpdc4-2019)


With addr=NULL assignment in _tsocket_address_inet_from_string:

# net ads info -S lkpdc4-2019.ad.liu.se
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=, realm=<null>
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=AD.LIU.SE
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
LDAP server: 0.0.0.0
LDAP server name: (null)
Realm: (null)
Bind Path: (null)
LDAP port: 0
Server time: tors, 18 mars 2021 19:42:01 CET
KDC server: 130.236.30.34
Server time offset: 0
Last machine account password change: tors, 01 jan. 1970 01:00:00 CET



Without:

# net ads info -S lkpdc4-2019.ad.liu.se
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=, realm=<null>
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=130.236.30.35, realm=AD.LIU.SE
ads_cldap_netlogon_5: Start: ss=130.236.30.35, realm=<null>
LDAP server: 130.236.30.35
LDAP server name: lkpdc1.ad.liu.se
Realm: AD.LIU.SE
Bind Path: dc=AD,dc=LIU,dc=SE
LDAP port: 389
Server time: tors, 18 mars 2021 19:46:11 CET
KDC server: 130.236.30.34
Server time offset: 0
Last machine account password change: tors, 18 mars 2021 17:01:28 CET


# net ads info -S lkpdc4-2019.ad.liu.se
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=, realm=<null>
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=130.236.30.2, realm=AD.LIU.SE
ads_cldap_netlogon_5: Start: ss=130.236.30.2, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
LDAP server: 0.0.0.0
LDAP server name: (null)
Realm: (null)
Bind Path: (null)
LDAP port: 0
Server time: tors, 18 mars 2021 19:46:13 CET
KDC server: 130.236.30.34
Server time offset: 0
Last machine account password change: tors, 01 jan. 1970 01:00:00 CET
Comment 4 Jeremy Allison 2021-03-18 19:22:09 UTC
OK:

577         d_printf(_("LDAP server name: %s\n"), ads->config.ldap_server_name);

will allways print (null) without -S as "ads->config.ldap_server_name" is never set anywhere by discovery.

That's just a bug. Should be doing:

if (ads->config.ldap_server_name != NULL) {
             d_printf(_("LDAP server name: %s\n"), ads->config.ldap_server_name);
} else {
             d_printf(_("LDAP server name: %s\n"), ads->server.ldap_server);
}
Comment 5 Jeremy Allison 2021-03-18 19:25:40 UTC
The curious issue is the address being printed as 0.0.0.0:

574         print_sockaddr(addr, sizeof(addr), &ads->ldap.ss);
575 
576         d_printf(_("LDAP server: %s\n"), addr);


Can you add me some debugs :

 561         /* Try to set the server's current time since we didn't do a full
 562            TCP LDAP session initially */
 563


-----------> Here add 

             print_sockaddr(addr, sizeof(addr), &ads->ldap.ss);
             d_printf(_("LDAP server: %s\n"), addr);

as I'd like to know if the call to ads_current_time() is screwing over
the ads->ldap.ss value.

 564         if ( !ADS_ERR_OK(ads_current_time( ads )) ) {
 565                 d_fprintf( stderr, _("Failed to get server's current time!\n"));
 566         }
 567 
 568         if (c->opt_json) {
 569                 return net_ads_info_json(ads);
 570         }
 571 
 572         pass_time = secrets_fetch_pass_last_set_time(ads->server.workgroup);
 573 
 574         print_sockaddr(addr, sizeof(addr), &ads->ldap.ss);
 575 
 576         d_printf(_("LDAP server: %s\n"), addr);
 577         d_printf(_("LDAP server name: %s\n"), ads->config.ldap_server_name);
 578         d_printf(_("Realm: %s\n"), ads->config.realm);
 579         d_printf(_("Bind Path: %s\n"), ads->config.bind_path);
 580         d_printf(_("LDAP port: %d\n"), ads->ldap.port);
 581         d_printf(_("Server time: %s\n"),
 582                          http_timestring(talloc_tos(), ads->config.current_time));
Comment 6 Jeremy Allison 2021-03-18 19:27:46 UTC
You could also go into this via gdb and get the same info.

Bottom line, using -S <server> bypasses discovery, so everything works.

Without -S <server> we use discovery, so ads->config.ldap_server_name doesn't get set, only ads->server.ldap_server.

But for success, ads->ldap.ss should be correctly set to the IP addr, so it'd be good to see where that's getting trampled on.
Comment 7 Peter Eriksson 2021-03-18 19:37:04 UTC
"net ads info" sometimes gives the same wrong info, I suppose it depends on which server it chooses.


And yep, it looks like the ads_current_time calls is the culprit.
(Some other debug printouts of mine there too, ignore those :-)

# net ads info -S lkpdc4-2019.ad.liu.se
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
XXX Before ads_current_time: LDAP server: 130.236.30.34
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=, realm=<null>
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=AD.LIU.SE
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
XXX After ads_current_time: LDAP server: 0.0.0.0
LDAP server: 0.0.0.0
LDAP server name: (null)
Realm: (null)
Bind Path: (null)
LDAP port: 0
Server time: tors, 18 mars 2021 20:35:25 CET
KDC server: 130.236.30.34
Server time offset: 0
Last machine account password change: tors, 01 jan. 1970 01:00:00 CET
Comment 8 Jeremy Allison 2021-03-18 19:37:28 UTC
Hmmm. I see you're always using -S name, so what I posted doesn't make sense :-).

We need to investigate this more carefully...

If the ads_current_time() is sabotaging the ads struct (I rue the day people decided to re-use these damn things) then that would explain it.
Comment 9 Peter Eriksson 2021-03-18 19:39:14 UTC
     /* Try to set the server's current time since we didn't do a full                                                                                             
           TCP LDAP session initially */

        print_sockaddr(addr, sizeof(addr), &ads->ldap.ss);
        d_printf("XXX Before ads_current_time: LDAP server: %s\n", addr);

        if ( !ADS_ERR_OK(ads_current_time( ads )) ) {
                d_fprintf( stderr, _("Failed to get server's current time!\n"));
        }

        print_sockaddr(addr, sizeof(addr), &ads->ldap.ss);
        d_printf("XXX After ads_current_time: LDAP server: %s\n", addr);
Comment 10 Jeremy Allison 2021-03-18 19:41:26 UTC
So this is the code that can trample on the existing ads struct inside ads_current_time().

3283         /* establish a new ldap tcp session if necessary */
3284 
3285         if ( !ads->ldap.ld ) {
3286                 /*
3287                  * ADS_STRUCT may be being reused after a
3288                  * DC lookup, so ads->ldap.ss may already have a
3289                  * good address. If not, re-initialize the passed-in
3290                  * ADS_STRUCT with the given server.XXXX parameters.
3291                  *
3292                  * Note that this doesn't depend on
3293                  * ads->server.ldap_server != NULL,
3294                  * as the case where ads->server.ldap_server==NULL and
3295                  * ads->ldap.ss != zero_address is precisely the DC
3296                  * lookup case where ads->ldap.ss was found by going
3297                  * through ads_find_dc() again we want to avoid repeating.
3298                  */
3299                 if (is_zero_addr(&ads->ldap.ss)) {
3300                         ads_s = ads_init(ads->server.realm,
3301                                          ads->server.workgroup,
3302                                          ads->server.ldap_server,
3303                                          ADS_SASL_PLAIN );
3304                         if (ads_s == NULL) {
3305                                 status = ADS_ERROR(LDAP_NO_MEMORY);
3306                                 goto done;
3307                         }
3308                 }
3309                 ads_s->auth.flags = ADS_AUTH_ANON_BIND;
3310                 status = ads_connect( ads_s );
3311                 if ( !ADS_ERR_OK(status))
3312                         goto done;
3313         }

Can you dump out the status of ads->ldap.ld. It's strange, because even if ads->ldap.ld == NULL, line 3299 checks that ads->ldap.ss is a zero addr before trampling on the struct.
Comment 11 Jeremy Allison 2021-03-18 19:46:20 UTC
Oh, I found it.

It's here:

3277         ADS_STRUCT *ads_s = ads;
...
3285         if ( !ads->ldap.ld ) {
3286                 /*
3287                  * ADS_STRUCT may be being reused after a
3288                  * DC lookup, so ads->ldap.ss may already have a
3289                  * good address. If not, re-initialize the passed-in
3290                  * ADS_STRUCT with the given server.XXXX parameters.
3291                  *
3292                  * Note that this doesn't depend on
3293                  * ads->server.ldap_server != NULL,
3294                  * as the case where ads->server.ldap_server==NULL and
3295                  * ads->ldap.ss != zero_address is precisely the DC
3296                  * lookup case where ads->ldap.ss was found by going
3297                  * through ads_find_dc() again we want to avoid repeating.
3298                  */
3299                 if (is_zero_addr(&ads->ldap.ss)) {
3300                         ads_s = ads_init(ads->server.realm,
3301                                          ads->server.workgroup,
3302                                          ads->server.ldap_server,
3303                                          ADS_SASL_PLAIN );
3304                         if (ads_s == NULL) {
3305                                 status = ADS_ERROR(LDAP_NO_MEMORY);
3306                                 goto done;
3307                         }
3308                 }
3309                 ads_s->auth.flags = ADS_AUTH_ANON_BIND;
3310                 status = ads_connect( ads_s );
3311                 if ( !ADS_ERR_OK(status))
3312                         goto done;
3313         }

The ads_connect( ads_s ) call at line 3310 whacks the existing ads struct at line 637 inside ads_connect().

 637         ads_zero_ldap(ads);
Comment 12 Jeremy Allison 2021-03-18 19:47:02 UTC
So that's forcing it to go through discovery.
Comment 13 Jeremy Allison 2021-03-18 19:57:30 UTC
Can you step into:

ads_current_time()
  -> ads_connect()

and let me know what code paths it's taking in there ? This is fiendishly over complicated for what it's doing and without walking through it I just can't see exactly what is happening here :-(.
Comment 14 Jeremy Allison 2021-03-18 20:00:22 UTC
FYI, I'm planning to give a talk on the fixes I did to the DC discovery code for ChromeOS at SambaXP ("Your Server will be with you shortly.." :-), in case you're interested. It's going to be a free conference.

Feel free to broadcast widely in your org :-).
Comment 15 Peter Eriksson 2021-03-18 20:07:41 UTC
# net ads info -S lkpdc4-2019.ad.liu.se
ads_connect: start: ads=99b750
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_connect: start: ads=99d500
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
XXX Before ads_current_time: LDAP server: 130.236.30.34
ads_current_time: Start: ads->ldap.ld = 0
ads_connect: start: ads=99d500
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_cldap_netlogon_5: Start: ss=, realm=<null>
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_connect: start: ads=99d750
ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=AD.LIU.SE
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_current_time: Start: ads->ldap.ld = 0
ads_current_time: is_zero_addr is true
ads_connect: start: ads=9a45a0
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
ads_current_time: Start: ads->ldap.ld = 96a720
XXX After ads_current_time: LDAP server: 0.0.0.0
LDAP server: 0.0.0.0
LDAP server name: (null)
Realm: (null)
Bind Path: (null)
LDAP port: 0
Server time: tors, 18 mars 2021 21:04:23 CET
KDC server: 130.236.30.34
Server time offset: 0
Last machine account password change: tors, 01 jan. 1970 01:00:00 CET


I'll do some gdb debugging/stepping...
Comment 16 Peter Eriksson 2021-03-18 20:41:00 UTC
Thread 2 hit Breakpoint 1, ads_current_time (ads=0x1314500) at ../../source3/libads/ldap.c:3274
3274            const char *attrs[] = {"currentTime", NULL};
...
gdb) print *ads
$1 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, 
    no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", 
    kdc_server = 0x12e4060 "130.236.30.34", flags = 66, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, 
    renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", 
    ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", 
    current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = {
    wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, 
      needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, 
      sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\202\354\036\"", _ss_align = 0, 
      _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16018, port = 389}}

(gdb) n
3279            ADS_STRUCT *ads_s = ads;
(gdb) n
3281            if (!(ctx = talloc_init("ads_current_time"))) {
(gdb) n
3287            fprintf(stderr, "ads_current_time: Start: ads->ldap.ld = %p\n", ads->ldap.ld);
(gdb) list
3282                    return ADS_ERROR(LDAP_NO_MEMORY);
3283            }
3284    
3285            /* establish a new ldap tcp session if necessary */
3286    
3287            fprintf(stderr, "ads_current_time: Start: ads->ldap.ld = %p\n", ads->ldap.ld);
3288            if ( !ads->ldap.ld ) {
3289                    /*
3290                     * ADS_STRUCT may be being reused after a
3291                     * DC lookup, so ads->ldap.ss may already have a
(gdb) n
ads_current_time: Start: ads->ldap.ld = 0
3288            if ( !ads->ldap.ld ) {
(gdb) n
3302                    if (is_zero_addr(&ads->ldap.ss)) {
(gdb) n
3314                    ads_s->auth.flags = ADS_AUTH_ANON_BIND;
(gdb) list
3309                            if (ads_s == NULL) {
3310                                    status = ADS_ERROR(LDAP_NO_MEMORY);
3311                                    goto done;
3312                            }
3313                    }
3314                    ads_s->auth.flags = ADS_AUTH_ANON_BIND;
3315                    status = ads_connect( ads_s );
3316                    if ( !ADS_ERR_OK(status))
3317                            goto done;
3318            }
(gdb) n
3315                    status = ads_connect( ads_s );
(gdb) print ads
$2 = (ADS_STRUCT *) 0x1314500
(gdb) print ads_s
$3 = (ADS_STRUCT *) 0x1314500

(gdb) print *ads
$4 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, 
    no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", 
    kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, 
    renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", 
    ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", 
    current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = {
    wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, 
      needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, 
      sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\202\354\036\"", _ss_align = 0, 
      _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16018, port = 389}}
(gdb) print *ads_s
$5 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, 
    no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", 
    kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, 
    renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", 
    ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", 
    current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = {
    wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, 
      needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, 
      sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\202\354\036\"", _ss_align = 0, 
      _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16018, port = 389}}
(gdb) s
ads_connect (ads=0x1314500) at ../../source3/libads/ldap.c:604
604             int version = LDAP_VERSION3;
(gdb) list
599      * @param ads Pointer to an existing ADS_STRUCT
600      * @return status of connection
601      **/
602     ADS_STATUS ads_connect(ADS_STRUCT *ads)
603     {
604             int version = LDAP_VERSION3;
605             ADS_STATUS status;
606             NTSTATUS ntstatus;
607             char addr[INET6_ADDRSTRLEN];
608             struct samba_sockaddr existing_sa = {0};
(gdb) n
608             struct samba_sockaddr existing_sa = {0};
(gdb) n
610             fprintf(stderr, "ads_connect: start: ads=%p\n", ads);
(gdb) n
ads_connect: start: ads=1314500
630             if (ads->server.ldap_server == NULL && !is_zero_addr(&ads->ldap.ss)) {
(gdb) print *ads
$6 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, 
    no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", 
    kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, 
    renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", 
    ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", 
    current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = {
    wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, 
      needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, 
      sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\202\354\036\"", _ss_align = 0, 
      _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16018, port = 389}}
(gdb) n
639             ads_zero_ldap(ads);
gdb) n
640             ZERO_STRUCT(ads->ldap_wrap_data);
(gdb) print *ads
$7 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, 
    no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", 
    kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, 
    renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", 
    ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", 
    current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = {
    wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, 
      needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, 
      sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\000\000\000", _ss_align = 0, 
      _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 0, port = 0}}

(gdb) n
641             ads->ldap.last_attempt  = time_mono(NULL);
(gdb) n
642             ads->ldap_wrap_data.wrap_type   = ADS_SASLWRAP_TYPE_PLAIN;
(gdb) n
646             if (DEBUGLEVEL >= 11) {
(gdb) n
653             if (ads->server.ldap_server) {
(gdb) n
654                     bool ok = false;
(gdb) n
657                     ok = resolve_name(ads->server.ldap_server, &ss, 0x20, true);
(gdb) n
658                     if (!ok) {
(gdb) n
664                     ok = ads_try_connect(ads, ads->server.gc, &ss);
(gdb) n
ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null>
665                     if (ok) {
(gdb) n
674                     if (ads->server.gc == true) {
(gdb) n
678                     if (ads->server.no_fallback) {
(gdb) n
684             if (!is_zero_addr(&existing_sa.u.ss)) {
(gdb) n
687                                               ads->server.gc,
(gdb) n
686                     bool ok = ads_try_connect(ads,
(gdb) n
ads_cldap_netlogon_5: Start: ss=, realm=<null>
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
689                     if (ok) {
(gdb) n
698             ntstatus = ads_find_dc(ads);
(gdb) n
ads_connect: start: ads=1314750
ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE
_tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377)
ads_cldap_netlogon_5: Start: ss=130.236.30.2, realm=AD.LIU.SE
ads_cldap_netlogon_5: Start: ss=130.236.30.2, realm=<null>
699             if (NT_STATUS_IS_OK(ntstatus)) {
(gdb) n
703             status = ADS_ERROR_NT(ntstatus);
(gdb) n
704             goto out;
(gdb) print status
$8 = {error_type = ENUM_ADS_ERROR_NT, err = {rc = -1073741730, nt_status = {v = 3221225566}}, minor_status = 0}
(gdb) n
784             if (DEBUGLEVEL >= 11) {
(gdb) n
792             return status;
(gdb) n
793     }
(gdb) n
ads_current_time (ads=0x1314500) at ../../source3/libads/ldap.c:3316
3316                    if ( !ADS_ERR_OK(status))
(gdb) n
3317                            goto done;
(gdb) print status
$9 = {error_type = ENUM_ADS_ERROR_NT, err = {rc = -1073741730, nt_status = {v = 3221225566}}, minor_status = 0}

(gdb) n
3347            if ( ads_s != ads ) {
(gdb) n
3350            talloc_destroy(ctx);
(gdb) print ads_s
$11 = (ADS_STRUCT *) 0x1314500
(gdb) print ads
$12 = (ADS_STRUCT *) 0x1314500
(gdb) n
3352            return status;
(gdb) n
3353    }
(gdb) print *ads
$13 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x0, ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, 
    no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", 
    kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, 
    renewable = 0}, config = {flags = 258556, realm = 0x0, bind_path = 0x0, ldap_server_name = 0x0, server_site_name = 0x0, 
    client_site_name = 0x0, current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = {
    wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, 
      needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, 
      sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\000\000\000", _ss_align = 0, 
      _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16458, port = 0}}
Comment 17 Jeremy Allison 2021-03-18 20:55:53 UTC
Why did:

 655                 ok = resolve_name(ads->server.ldap_server, &ss, 0x20, true);
 656                 if (!ok) {
 657                         DEBUG(5,("ads_connect: unable to resolve name %s\n",
 658                                  ads->server.ldap_server));
 659                         status = ADS_ERROR_NT(NT_STATUS_NOT_FOUND);
 660                         goto out;
 661                 }
 662                 ok = ads_try_connect(ads, ads->server.gc, &ss);
 663                 if (ok) {
 664                         goto got_connection;
 665                 }

Not succeed ? At that point, the resolve name for "lkpdc4-2019.ad.liu.se" succeeded and ss=130.236.30.34.

So why did:

 662                 ok = ads_try_connect(ads, ads->server.gc, &ss);
 663                 if (ok) {
 664                         goto got_connection;
 665                 }

not get ok=true and jump to the got_connection label ? That looks like the problem to me.
Comment 18 Peter Eriksson 2021-03-18 22:28:38 UTC
Created attachment 16559 [details]
Debug printf output from a run that is OK

Added an attachment with a debug printf run where things go OK.
Comment 19 Peter Eriksson 2021-03-18 22:29:10 UTC
Created attachment 16560 [details]
Debug printf output from a run that fails

Added an attachment with a debug printf run that fails.
Comment 20 Peter Eriksson 2021-03-18 22:35:06 UTC
Interesting - the difference between an OK run and a failed run is which AD server the ads_find_gc() returns - if it returns lkpdc1 then things work out fine.

If it returns one of the others (lkpdc3-2019 in this case) then it fails: 
  check_cldap_reply_required_flags(0x3f1fc,0x3f1fc) failed:
        ret_flags & NBT_SERVER_PDC

so it seems it tries again, and yet again gets another server (lkpdc4-2019 this time) and then it seems to just give up? Possibly because it got the one I pointed at with "-S"? Or does it just try twice?)

See part of the attached files (net-ads-info-ok.txt & net-ads-info-fail.txt) starting from "xxxxx"
Comment 21 Peter Eriksson 2021-03-19 23:20:40 UTC
Created attachment 16562 [details]
A more detailed printf-debug failed output

Attached a more detault print-debug-run of a failed "net ads info -S srvr" session.

There seems to be a couple of recursive calls happening inside/or caused by ads_current_time (if I'm reading the output right):

ads_current_time
  ads_connect
    ads_find_dc
      get_dc_name
        ads_dc_name
          ads_connect
            ads_find_dc
              
ads_current_time
  ads_connect
    ads_current_time
Comment 22 Louis 2021-07-20 12:30:59 UTC
Hai, i chip in this bugreport, i see the same 


running : net ads info  and it picks a random AD-DC. 
If its the one with FSMO roles, i get the correct info. 
If not, i get wrong output same as Peter (bug reporter) 

Other results
Buster, samba 4.12.5, 100% correct
Jessie, samba 4.8.12, 100% correct

i noticed i also had 1 server running samba 4.12.5 
 ( and fixed it autodate process,oeps, missed that. :-/ )

So i upgraded it to : 
Buster, samba 4.12.15, 100% correct
Upgrade to 4.13.10, again 100% correct 
Upgrade to 4.14.6, and it failty again. 

net ads info (random server, only shows correct info on the DS with FSMO.)
net ads info -S server.fdqn again,  only shows correct info on the DS with FSMO. 

ping me if someone needs more info.
Comment 23 Jeremy Allison 2022-01-28 18:31:50 UTC
Created attachment 17137 [details]
Untested "raw" patch.

Peter, sorry for the long inactivity on this bug. Can you test the attached "raw" patch if you can still reproduce the problem. It's in the right area and the current code is certainly wrong, so I'm hoping this might be the missing fix.
Comment 24 Jeremy Allison 2022-01-28 18:38:32 UTC
Created attachment 17138 [details]
Untested "raw" patch that actually compiles :-).

This one compiles, sorry :-).
Comment 25 Peter Eriksson 2022-01-29 00:15:15 UTC
(In reply to Jeremy Allison from comment #24)

Looking good! (Testing Samba 4.15.4 with that patch):

# /liu/bin/net ads info
LDAP server: 130.236.30.3
LDAP server name: lkpdc2.ad.liu.se
Realm: AD.LIU.SE
Bind Path: dc=AD,dc=LIU,dc=SE
LDAP port: 389
Server time: Sat, 29 Jan 2022 01:05:56 CET
KDC server: 130.236.30.3
Server time offset: 0
Last machine account password change: Tue, 22 May 2018 13:39:32 CEST


# /liu/bin/net ads info -S lkpdc1.ad.liu.se
LDAP server: 130.236.30.3
LDAP server name: lkpdc2.ad.liu.se
Realm: AD.LIU.SE
Bind Path: dc=AD,dc=LIU,dc=SE
LDAP port: 389
Server time: Sat, 29 Jan 2022 01:06:03 CET
KDC server: 130.236.30.35
Server time offset: 0
Last machine account password change: Tue, 22 May 2018 13:39:32 CEST

# /liu/bin/net ads info -S lkpdc3-2019.ad.liu.se
LDAP server: 130.236.30.3
LDAP server name: lkpdc2.ad.liu.se
Realm: AD.LIU.SE
Bind Path: dc=AD,dc=LIU,dc=SE
LDAP port: 389
Server time: Sat, 29 Jan 2022 01:06:14 CET
KDC server: 130.236.30.2
Server time offset: 0
Last machine account password change: Tue, 22 May 2018 13:39:32 CEST



Trying with a server that I've firewall'd away on the server (it's running, but Samba can't reach it) gives this though:

# /liu/bin/net ads info -S lkpdc4-2019.ad.liu.se
Failed to get server's current time!
LDAP server: 0.0.0.0
LDAP server name: lkpdc2.ad.liu.se
Realm: AD.LIU.SE
Bind Path: dc=AD,dc=LIU,dc=SE
LDAP port: 0
Server time: Thu, 01 Jan 1970 01:00:00 CET
KDC server: 130.236.30.3
Server time offset: 0
Last machine account password change: Tue, 22 May 2018 13:39:32 CEST

# /liu/bin/net ads info -S lkpdc4-2019.ad.liu.se
ads_connect: No logon servers are currently available to service the logon request.
ads_connect: No logon servers are currently available to service the logon request.
Didn't find the ldap server!

(and then it gives the same ads_connect error if I try again quickly, until I wait a little and then I get the first error again).

Probably not important but I thought I'd mention it anyway. And I should remove that firewall rule... :-)
Comment 26 Jeremy Allison 2022-01-29 00:17:19 UTC
I think it's OK to give the 0.0.0.0 address if it can't actually reach it. I don't think it will set the address in the AD struct until it's gotten a good connection.
Comment 27 Ralph Böhme 2022-01-29 05:37:49 UTC
(In reply to Peter Eriksson from comment #25)
Can you please rerun the exact same tests in the same configuration but without the patch? Just wamt to make sure the fix is really in that patch and not in the newer Samba version you're using in this test compared to the original report.

Thanks!
Comment 28 Peter Eriksson 2022-01-29 16:43:10 UTC
(In reply to Ralph Böhme from comment #27)

4.15.4 without the patch shows the bug:

# /liu/bin/net ads info 
LDAP server: 0.0.0.0
LDAP server name: (null)
Realm: (null)
Bind Path: (null)
LDAP port: 0
Server time: Sat, 29 Jan 2022 17:41:48 CET
KDC server: 130.236.30.35
Server time offset: 0
Last machine account password change: Thu, 01 Jan 1970 01:00:00 CET

# /liu/bin/net ads info -S lkpdc1.ad.liu.se
LDAP server: 0.0.0.0
LDAP server name: (null)
Realm: (null)
Bind Path: (null)
LDAP port: 0
Server time: Sat, 29 Jan 2022 17:42:03 CET
KDC server: 130.236.30.35
Server time offset: 0
Last machine account password change: Thu, 01 Jan 1970 01:00:00 CET

# /liu/bin/net ads info -S lkpdc2.ad.liu.se
LDAP server: 130.236.30.3
LDAP server name: lkpdc2.ad.liu.se
Realm: AD.LIU.SE
Bind Path: dc=AD,dc=LIU,dc=SE
LDAP port: 389
Server time: Sat, 29 Jan 2022 17:42:06 CET
KDC server: 130.236.30.3
Server time offset: 0
Last machine account password change: Tue, 11 Sep 2018 08:47:49 CEST

# /liu/bin/net ads info -S lkpdc3-2019.ad.liu.se
LDAP server: 0.0.0.0
LDAP server name: (null)
Realm: (null)
Bind Path: (null)
LDAP port: 0
Server time: Sat, 29 Jan 2022 17:42:10 CET
KDC server: 130.236.30.2
Server time offset: 0
Last machine account password change: Thu, 01 Jan 1970 01:00:00 CET
Comment 29 Ralph Böhme 2022-02-08 16:03:18 UTC
It seems the patches being discussed at MR 2354 don't fix all cases, as I'm still seeing this issue at a customer site. I was able to spend some time in the debugger but didn't have enough time to identify the root cause, but I was able to figure out that socket info was already reset to 0 coming out of ads_startup_nobind() in net_ads_info(). This is totally strange as the log had the line

Successfully contacted LDAP server x.x.x.x

in ad_connect() and closest-dc was set and at the end I see

ads_connect: leaving with:
...
            ss                       : 1.2.3.4 (redacted known good IP)
...

However, it seems when ads_current_time() calls into ad_connect() again, suddenly ss is zeroed out which doesn't make sense.
Comment 30 Jeremy Allison 2022-02-08 19:26:27 UTC
(In reply to Ralph Böhme from comment #29)

That code (ads_current_time()) is the devils own work. I'll try and look again at it sometime.
Comment 31 Samba QA Contact 2022-02-08 20:25:05 UTC
This bug was referenced in samba master:

c266ed40aeb1b1f59a1811cd4511e32e44a4a719
3ee690455eb963dedc7955b79316481387d4ac8c
Comment 32 Jeremy Allison 2022-02-08 22:21:40 UTC
Created attachment 17155 [details]
git-am fix for 4.16.rcNext, 4.15.next, 4.14.next

Cherry-picked from master. Applies cleanly to 4.16.rcNext, 4.15.next, 4.14.next.
Comment 33 Ralph Böhme 2022-02-09 09:54:36 UTC
Reassigning to Jule for inclusion in 4.14, 4.15 and 4.16.

Please don't close the bug after the patches have landed and reassign to me. Thanks!
Comment 34 Samba QA Contact 2022-02-09 11:55:13 UTC
This bug was referenced in samba v4-14-test:

1cfb3b85f4929e8aea0641f34b2d2b06a30fc03c
e59bf6bb2c3fc1a2f9f37bdc587b7e0616f95613
Comment 35 Samba QA Contact 2022-02-09 12:25:13 UTC
This bug was referenced in samba v4-16-test:

8cbf38a1b2ba83cde9529d7e9d71bc8fce449293
188b96164c5849e28d0349adac6472e2bc4bd000
Comment 36 Samba QA Contact 2022-02-09 12:36:03 UTC
This bug was referenced in samba v4-15-test:

a2fe8bf5a5d0889dc53697904fd647812dc74c8c
ca94bad07bc1218fa406f1629536b5f3e7295cd8
Comment 37 Jule Anger 2022-02-09 14:09:16 UTC
(In reply to Ralph Böhme from comment #33)
Reassigning to Ralph.
Comment 38 Samba QA Contact 2022-02-15 07:55:45 UTC
This bug was referenced in samba v4-16-stable (Release samba-4.16.0rc3):

8cbf38a1b2ba83cde9529d7e9d71bc8fce449293
188b96164c5849e28d0349adac6472e2bc4bd000
Comment 39 Samba QA Contact 2022-03-15 13:25:25 UTC
This bug was referenced in samba v4-15-stable (Release samba-4.15.6):

a2fe8bf5a5d0889dc53697904fd647812dc74c8c
ca94bad07bc1218fa406f1629536b5f3e7295cd8
Comment 40 Samba QA Contact 2022-04-04 12:49:58 UTC
This bug was referenced in samba v4-14-stable (Release samba-4.14.13):

1cfb3b85f4929e8aea0641f34b2d2b06a30fc03c
e59bf6bb2c3fc1a2f9f37bdc587b7e0616f95613
Comment 41 Samba QA Contact 2022-05-23 19:19:03 UTC
This bug was referenced in samba master:

a26f535dedc651afa2a25dd37113ac71787197ff
Comment 42 Samuel Cabrero 2022-05-24 09:21:49 UTC
Created attachment 17301 [details]
Patch for 4.14 next, 4.15 next and 4.16 rc
Comment 43 Ralph Böhme 2022-05-24 10:25:03 UTC
Reassigning to Jule for inclusion in 4.15 and 4.16.