Our (Microsoft AD) has 6 AD DC servers, lkpdc1, lkpdc2, lkpdc3-2019, lkpdc4-2019, nkpdc1-2019 & nkpdc2-2019. Using "net ads info" and contacting lkpdc1 gives expected results all the time: # net ads info -S lkpdc1.ad.liu.se LDAP server: 130.236.30.35 LDAP server name: lkpdc1.ad.liu.se Realm: AD.LIU.SE Bind Path: dc=AD,dc=LIU,dc=SE LDAP port: 389 Server time: tors, 18 mars 2021 19:26:25 CET KDC server: 130.236.30.35 Server time offset: 0 Last machine account password change: tors, 18 mars 2021 17:01:28 CET Contacting any of the others give the right result the first time, and then strange output the next times: # net ads info -S lkpdc4-2019.ad.liu.se LDAP server: 130.236.30.35 LDAP server name: lkpdc1.ad.liu.se Realm: AD.LIU.SE Bind Path: dc=AD,dc=LIU,dc=SE LDAP port: 389 Server time: tors, 18 mars 2021 19:28:07 CET KDC server: 130.236.30.34 Server time offset: 0 Last machine account password change: tors, 18 mars 2021 17:01:28 CET # net ads info -S lkpdc4-2019.ad.liu.se LDAP server: 0.0.0.0 LDAP server name: (null) Realm: (null) Bind Path: (null) LDAP port: 0 Server time: tors, 18 mars 2021 19:28:10 CET KDC server: 130.236.30.34 Server time offset: 0 Last machine account password change: tors, 01 jan. 1970 01:00:00 CET Most things _seem_ to work anyway but this is still a bit annoying...
Can you get me a debug level 10 client trace log of this ? Thanks.
Created attachment 16558 [details] Output from "net ads info" Here's the output from "net ads info -d10 -S lkpdc4-2019.ad.liu.se".
Don't know if it's related or not but while (so far fruitless) trying to trace down this issue, I noticed that: _tsocket_address_inet_from_string() get's called with an empty "addr" argument that doesn't look quite right. The code looks for addr being NULL and then assigns it to "127.0.0.1" or "::1" so I tried adding an "if (addr && !*addr) addr=NULL;" but it doesn't make any difference. Below are my added printf/debug outputs (130.236.30.34 = lkpdc4-2019) With addr=NULL assignment in _tsocket_address_inet_from_string: # net ads info -S lkpdc4-2019.ad.liu.se ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=, realm=<null> _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=AD.LIU.SE ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> LDAP server: 0.0.0.0 LDAP server name: (null) Realm: (null) Bind Path: (null) LDAP port: 0 Server time: tors, 18 mars 2021 19:42:01 CET KDC server: 130.236.30.34 Server time offset: 0 Last machine account password change: tors, 01 jan. 1970 01:00:00 CET Without: # net ads info -S lkpdc4-2019.ad.liu.se ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=, realm=<null> _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=130.236.30.35, realm=AD.LIU.SE ads_cldap_netlogon_5: Start: ss=130.236.30.35, realm=<null> LDAP server: 130.236.30.35 LDAP server name: lkpdc1.ad.liu.se Realm: AD.LIU.SE Bind Path: dc=AD,dc=LIU,dc=SE LDAP port: 389 Server time: tors, 18 mars 2021 19:46:11 CET KDC server: 130.236.30.34 Server time offset: 0 Last machine account password change: tors, 18 mars 2021 17:01:28 CET # net ads info -S lkpdc4-2019.ad.liu.se ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=, realm=<null> _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=130.236.30.2, realm=AD.LIU.SE ads_cldap_netlogon_5: Start: ss=130.236.30.2, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> LDAP server: 0.0.0.0 LDAP server name: (null) Realm: (null) Bind Path: (null) LDAP port: 0 Server time: tors, 18 mars 2021 19:46:13 CET KDC server: 130.236.30.34 Server time offset: 0 Last machine account password change: tors, 01 jan. 1970 01:00:00 CET
OK: 577 d_printf(_("LDAP server name: %s\n"), ads->config.ldap_server_name); will allways print (null) without -S as "ads->config.ldap_server_name" is never set anywhere by discovery. That's just a bug. Should be doing: if (ads->config.ldap_server_name != NULL) { d_printf(_("LDAP server name: %s\n"), ads->config.ldap_server_name); } else { d_printf(_("LDAP server name: %s\n"), ads->server.ldap_server); }
The curious issue is the address being printed as 0.0.0.0: 574 print_sockaddr(addr, sizeof(addr), &ads->ldap.ss); 575 576 d_printf(_("LDAP server: %s\n"), addr); Can you add me some debugs : 561 /* Try to set the server's current time since we didn't do a full 562 TCP LDAP session initially */ 563 -----------> Here add print_sockaddr(addr, sizeof(addr), &ads->ldap.ss); d_printf(_("LDAP server: %s\n"), addr); as I'd like to know if the call to ads_current_time() is screwing over the ads->ldap.ss value. 564 if ( !ADS_ERR_OK(ads_current_time( ads )) ) { 565 d_fprintf( stderr, _("Failed to get server's current time!\n")); 566 } 567 568 if (c->opt_json) { 569 return net_ads_info_json(ads); 570 } 571 572 pass_time = secrets_fetch_pass_last_set_time(ads->server.workgroup); 573 574 print_sockaddr(addr, sizeof(addr), &ads->ldap.ss); 575 576 d_printf(_("LDAP server: %s\n"), addr); 577 d_printf(_("LDAP server name: %s\n"), ads->config.ldap_server_name); 578 d_printf(_("Realm: %s\n"), ads->config.realm); 579 d_printf(_("Bind Path: %s\n"), ads->config.bind_path); 580 d_printf(_("LDAP port: %d\n"), ads->ldap.port); 581 d_printf(_("Server time: %s\n"), 582 http_timestring(talloc_tos(), ads->config.current_time));
You could also go into this via gdb and get the same info. Bottom line, using -S <server> bypasses discovery, so everything works. Without -S <server> we use discovery, so ads->config.ldap_server_name doesn't get set, only ads->server.ldap_server. But for success, ads->ldap.ss should be correctly set to the IP addr, so it'd be good to see where that's getting trampled on.
"net ads info" sometimes gives the same wrong info, I suppose it depends on which server it chooses. And yep, it looks like the ads_current_time calls is the culprit. (Some other debug printouts of mine there too, ignore those :-) # net ads info -S lkpdc4-2019.ad.liu.se ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> XXX Before ads_current_time: LDAP server: 130.236.30.34 ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=, realm=<null> _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=AD.LIU.SE ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> XXX After ads_current_time: LDAP server: 0.0.0.0 LDAP server: 0.0.0.0 LDAP server name: (null) Realm: (null) Bind Path: (null) LDAP port: 0 Server time: tors, 18 mars 2021 20:35:25 CET KDC server: 130.236.30.34 Server time offset: 0 Last machine account password change: tors, 01 jan. 1970 01:00:00 CET
Hmmm. I see you're always using -S name, so what I posted doesn't make sense :-). We need to investigate this more carefully... If the ads_current_time() is sabotaging the ads struct (I rue the day people decided to re-use these damn things) then that would explain it.
/* Try to set the server's current time since we didn't do a full TCP LDAP session initially */ print_sockaddr(addr, sizeof(addr), &ads->ldap.ss); d_printf("XXX Before ads_current_time: LDAP server: %s\n", addr); if ( !ADS_ERR_OK(ads_current_time( ads )) ) { d_fprintf( stderr, _("Failed to get server's current time!\n")); } print_sockaddr(addr, sizeof(addr), &ads->ldap.ss); d_printf("XXX After ads_current_time: LDAP server: %s\n", addr);
So this is the code that can trample on the existing ads struct inside ads_current_time(). 3283 /* establish a new ldap tcp session if necessary */ 3284 3285 if ( !ads->ldap.ld ) { 3286 /* 3287 * ADS_STRUCT may be being reused after a 3288 * DC lookup, so ads->ldap.ss may already have a 3289 * good address. If not, re-initialize the passed-in 3290 * ADS_STRUCT with the given server.XXXX parameters. 3291 * 3292 * Note that this doesn't depend on 3293 * ads->server.ldap_server != NULL, 3294 * as the case where ads->server.ldap_server==NULL and 3295 * ads->ldap.ss != zero_address is precisely the DC 3296 * lookup case where ads->ldap.ss was found by going 3297 * through ads_find_dc() again we want to avoid repeating. 3298 */ 3299 if (is_zero_addr(&ads->ldap.ss)) { 3300 ads_s = ads_init(ads->server.realm, 3301 ads->server.workgroup, 3302 ads->server.ldap_server, 3303 ADS_SASL_PLAIN ); 3304 if (ads_s == NULL) { 3305 status = ADS_ERROR(LDAP_NO_MEMORY); 3306 goto done; 3307 } 3308 } 3309 ads_s->auth.flags = ADS_AUTH_ANON_BIND; 3310 status = ads_connect( ads_s ); 3311 if ( !ADS_ERR_OK(status)) 3312 goto done; 3313 } Can you dump out the status of ads->ldap.ld. It's strange, because even if ads->ldap.ld == NULL, line 3299 checks that ads->ldap.ss is a zero addr before trampling on the struct.
Oh, I found it. It's here: 3277 ADS_STRUCT *ads_s = ads; ... 3285 if ( !ads->ldap.ld ) { 3286 /* 3287 * ADS_STRUCT may be being reused after a 3288 * DC lookup, so ads->ldap.ss may already have a 3289 * good address. If not, re-initialize the passed-in 3290 * ADS_STRUCT with the given server.XXXX parameters. 3291 * 3292 * Note that this doesn't depend on 3293 * ads->server.ldap_server != NULL, 3294 * as the case where ads->server.ldap_server==NULL and 3295 * ads->ldap.ss != zero_address is precisely the DC 3296 * lookup case where ads->ldap.ss was found by going 3297 * through ads_find_dc() again we want to avoid repeating. 3298 */ 3299 if (is_zero_addr(&ads->ldap.ss)) { 3300 ads_s = ads_init(ads->server.realm, 3301 ads->server.workgroup, 3302 ads->server.ldap_server, 3303 ADS_SASL_PLAIN ); 3304 if (ads_s == NULL) { 3305 status = ADS_ERROR(LDAP_NO_MEMORY); 3306 goto done; 3307 } 3308 } 3309 ads_s->auth.flags = ADS_AUTH_ANON_BIND; 3310 status = ads_connect( ads_s ); 3311 if ( !ADS_ERR_OK(status)) 3312 goto done; 3313 } The ads_connect( ads_s ) call at line 3310 whacks the existing ads struct at line 637 inside ads_connect(). 637 ads_zero_ldap(ads);
So that's forcing it to go through discovery.
Can you step into: ads_current_time() -> ads_connect() and let me know what code paths it's taking in there ? This is fiendishly over complicated for what it's doing and without walking through it I just can't see exactly what is happening here :-(.
FYI, I'm planning to give a talk on the fixes I did to the DC discovery code for ChromeOS at SambaXP ("Your Server will be with you shortly.." :-), in case you're interested. It's going to be a free conference. Feel free to broadcast widely in your org :-).
# net ads info -S lkpdc4-2019.ad.liu.se ads_connect: start: ads=99b750 ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_connect: start: ads=99d500 ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> XXX Before ads_current_time: LDAP server: 130.236.30.34 ads_current_time: Start: ads->ldap.ld = 0 ads_connect: start: ads=99d500 ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_cldap_netlogon_5: Start: ss=, realm=<null> _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_connect: start: ads=99d750 ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=AD.LIU.SE ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_current_time: Start: ads->ldap.ld = 0 ads_current_time: is_zero_addr is true ads_connect: start: ads=9a45a0 ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> ads_current_time: Start: ads->ldap.ld = 96a720 XXX After ads_current_time: LDAP server: 0.0.0.0 LDAP server: 0.0.0.0 LDAP server name: (null) Realm: (null) Bind Path: (null) LDAP port: 0 Server time: tors, 18 mars 2021 21:04:23 CET KDC server: 130.236.30.34 Server time offset: 0 Last machine account password change: tors, 01 jan. 1970 01:00:00 CET I'll do some gdb debugging/stepping...
Thread 2 hit Breakpoint 1, ads_current_time (ads=0x1314500) at ../../source3/libads/ldap.c:3274 3274 const char *attrs[] = {"currentTime", NULL}; ... gdb) print *ads $1 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", kdc_server = 0x12e4060 "130.236.30.34", flags = 66, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = { wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\202\354\036\"", _ss_align = 0, _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16018, port = 389}} (gdb) n 3279 ADS_STRUCT *ads_s = ads; (gdb) n 3281 if (!(ctx = talloc_init("ads_current_time"))) { (gdb) n 3287 fprintf(stderr, "ads_current_time: Start: ads->ldap.ld = %p\n", ads->ldap.ld); (gdb) list 3282 return ADS_ERROR(LDAP_NO_MEMORY); 3283 } 3284 3285 /* establish a new ldap tcp session if necessary */ 3286 3287 fprintf(stderr, "ads_current_time: Start: ads->ldap.ld = %p\n", ads->ldap.ld); 3288 if ( !ads->ldap.ld ) { 3289 /* 3290 * ADS_STRUCT may be being reused after a 3291 * DC lookup, so ads->ldap.ss may already have a (gdb) n ads_current_time: Start: ads->ldap.ld = 0 3288 if ( !ads->ldap.ld ) { (gdb) n 3302 if (is_zero_addr(&ads->ldap.ss)) { (gdb) n 3314 ads_s->auth.flags = ADS_AUTH_ANON_BIND; (gdb) list 3309 if (ads_s == NULL) { 3310 status = ADS_ERROR(LDAP_NO_MEMORY); 3311 goto done; 3312 } 3313 } 3314 ads_s->auth.flags = ADS_AUTH_ANON_BIND; 3315 status = ads_connect( ads_s ); 3316 if ( !ADS_ERR_OK(status)) 3317 goto done; 3318 } (gdb) n 3315 status = ads_connect( ads_s ); (gdb) print ads $2 = (ADS_STRUCT *) 0x1314500 (gdb) print ads_s $3 = (ADS_STRUCT *) 0x1314500 (gdb) print *ads $4 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = { wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\202\354\036\"", _ss_align = 0, _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16018, port = 389}} (gdb) print *ads_s $5 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = { wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\202\354\036\"", _ss_align = 0, _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16018, port = 389}} (gdb) s ads_connect (ads=0x1314500) at ../../source3/libads/ldap.c:604 604 int version = LDAP_VERSION3; (gdb) list 599 * @param ads Pointer to an existing ADS_STRUCT 600 * @return status of connection 601 **/ 602 ADS_STATUS ads_connect(ADS_STRUCT *ads) 603 { 604 int version = LDAP_VERSION3; 605 ADS_STATUS status; 606 NTSTATUS ntstatus; 607 char addr[INET6_ADDRSTRLEN]; 608 struct samba_sockaddr existing_sa = {0}; (gdb) n 608 struct samba_sockaddr existing_sa = {0}; (gdb) n 610 fprintf(stderr, "ads_connect: start: ads=%p\n", ads); (gdb) n ads_connect: start: ads=1314500 630 if (ads->server.ldap_server == NULL && !is_zero_addr(&ads->ldap.ss)) { (gdb) print *ads $6 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = { wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\202\354\036\"", _ss_align = 0, _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16018, port = 389}} (gdb) n 639 ads_zero_ldap(ads); gdb) n 640 ZERO_STRUCT(ads->ldap_wrap_data); (gdb) print *ads $7 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x12e4000 "AD", ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, renewable = 0}, config = {flags = 258556, realm = 0x12e3fa0 "AD.LIU.SE", bind_path = 0x12e2f90 "dc=AD,dc=LIU,dc=SE", ldap_server_name = 0x12e2f30 "lkpdc4-2019.ad.liu.se", server_site_name = 0x12e3fe0 "LiU", client_site_name = 0x12e4020 "LiU", current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = { wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\000\000\000", _ss_align = 0, _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 0, port = 0}} (gdb) n 641 ads->ldap.last_attempt = time_mono(NULL); (gdb) n 642 ads->ldap_wrap_data.wrap_type = ADS_SASLWRAP_TYPE_PLAIN; (gdb) n 646 if (DEBUGLEVEL >= 11) { (gdb) n 653 if (ads->server.ldap_server) { (gdb) n 654 bool ok = false; (gdb) n 657 ok = resolve_name(ads->server.ldap_server, &ss, 0x20, true); (gdb) n 658 if (!ok) { (gdb) n 664 ok = ads_try_connect(ads, ads->server.gc, &ss); (gdb) n ads_cldap_netlogon_5: Start: ss=130.236.30.34, realm=<null> 665 if (ok) { (gdb) n 674 if (ads->server.gc == true) { (gdb) n 678 if (ads->server.no_fallback) { (gdb) n 684 if (!is_zero_addr(&existing_sa.u.ss)) { (gdb) n 687 ads->server.gc, (gdb) n 686 bool ok = ads_try_connect(ads, (gdb) n ads_cldap_netlogon_5: Start: ss=, realm=<null> _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) 689 if (ok) { (gdb) n 698 ntstatus = ads_find_dc(ads); (gdb) n ads_connect: start: ads=1314750 ads_cldap_netlogon_5: Start: ss=, realm=AD.LIU.SE _tsocket_address_inet_from_string: got empty addr (port=389, location=../../source3/libads/cldap.c:377) ads_cldap_netlogon_5: Start: ss=130.236.30.2, realm=AD.LIU.SE ads_cldap_netlogon_5: Start: ss=130.236.30.2, realm=<null> 699 if (NT_STATUS_IS_OK(ntstatus)) { (gdb) n 703 status = ADS_ERROR_NT(ntstatus); (gdb) n 704 goto out; (gdb) print status $8 = {error_type = ENUM_ADS_ERROR_NT, err = {rc = -1073741730, nt_status = {v = 3221225566}}, minor_status = 0} (gdb) n 784 if (DEBUGLEVEL >= 11) { (gdb) n 792 return status; (gdb) n 793 } (gdb) n ads_current_time (ads=0x1314500) at ../../source3/libads/ldap.c:3316 3316 if ( !ADS_ERR_OK(status)) (gdb) n 3317 goto done; (gdb) print status $9 = {error_type = ENUM_ADS_ERROR_NT, err = {rc = -1073741730, nt_status = {v = 3221225566}}, minor_status = 0} (gdb) n 3347 if ( ads_s != ads ) { (gdb) n 3350 talloc_destroy(ctx); (gdb) print ads_s $11 = (ADS_STRUCT *) 0x1314500 (gdb) print ads $12 = (ADS_STRUCT *) 0x1314500 (gdb) n 3352 return status; (gdb) n 3353 } (gdb) print *ads $13 = {is_mine = 1, server = {realm = 0x0, workgroup = 0x0, ldap_server = 0x12e2e40 "lkpdc4-2019.ad.liu.se", gc = false, no_fallback = false}, auth = {realm = 0x12e4040 "AD.LIU.SE", password = 0x0, user_name = 0x12e3fc0 "root", kdc_server = 0x12e4060 "130.236.30.34", flags = 4, time_offset = 0, ccache_name = 0x0, tgt_expire = 0, tgs_expire = 0, renewable = 0}, config = {flags = 258556, realm = 0x0, bind_path = 0x0, ldap_server_name = 0x0, server_site_name = 0x0, client_site_name = 0x0, current_time = 0, schema_path = 0x0, config_path = 0x0, ldap_page_size = 1000}, ldap_wrap_data = { wrap_type = ADS_SASLWRAP_TYPE_PLAIN, wrap_ops = 0x0, sbiod = 0x0, mem_ctx = 0x0, wrap_private_data = 0x0, in = {ofs = 0, needed = 0, left = 0, max_wrapped = 0, min_wrapped = 0, size = 0, buf = 0x0}, out = {ofs = 0, left = 0, max_unwrapped = 0, sig_size = 0, size = 0, buf = 0x0}}, ldap = {ld = 0x0, ss = {ss_family = 2, _ss_pad1 = "\000\000\000\000\000", _ss_align = 0, _ss_pad2 = '\000' <repeats 239 times>}, last_attempt = 16458, port = 0}}
Why did: 655 ok = resolve_name(ads->server.ldap_server, &ss, 0x20, true); 656 if (!ok) { 657 DEBUG(5,("ads_connect: unable to resolve name %s\n", 658 ads->server.ldap_server)); 659 status = ADS_ERROR_NT(NT_STATUS_NOT_FOUND); 660 goto out; 661 } 662 ok = ads_try_connect(ads, ads->server.gc, &ss); 663 if (ok) { 664 goto got_connection; 665 } Not succeed ? At that point, the resolve name for "lkpdc4-2019.ad.liu.se" succeeded and ss=130.236.30.34. So why did: 662 ok = ads_try_connect(ads, ads->server.gc, &ss); 663 if (ok) { 664 goto got_connection; 665 } not get ok=true and jump to the got_connection label ? That looks like the problem to me.
Created attachment 16559 [details] Debug printf output from a run that is OK Added an attachment with a debug printf run where things go OK.
Created attachment 16560 [details] Debug printf output from a run that fails Added an attachment with a debug printf run that fails.
Interesting - the difference between an OK run and a failed run is which AD server the ads_find_gc() returns - if it returns lkpdc1 then things work out fine. If it returns one of the others (lkpdc3-2019 in this case) then it fails: check_cldap_reply_required_flags(0x3f1fc,0x3f1fc) failed: ret_flags & NBT_SERVER_PDC so it seems it tries again, and yet again gets another server (lkpdc4-2019 this time) and then it seems to just give up? Possibly because it got the one I pointed at with "-S"? Or does it just try twice?) See part of the attached files (net-ads-info-ok.txt & net-ads-info-fail.txt) starting from "xxxxx"
Created attachment 16562 [details] A more detailed printf-debug failed output Attached a more detault print-debug-run of a failed "net ads info -S srvr" session. There seems to be a couple of recursive calls happening inside/or caused by ads_current_time (if I'm reading the output right): ads_current_time ads_connect ads_find_dc get_dc_name ads_dc_name ads_connect ads_find_dc ads_current_time ads_connect ads_current_time
Hai, i chip in this bugreport, i see the same running : net ads info and it picks a random AD-DC. If its the one with FSMO roles, i get the correct info. If not, i get wrong output same as Peter (bug reporter) Other results Buster, samba 4.12.5, 100% correct Jessie, samba 4.8.12, 100% correct i noticed i also had 1 server running samba 4.12.5 ( and fixed it autodate process,oeps, missed that. :-/ ) So i upgraded it to : Buster, samba 4.12.15, 100% correct Upgrade to 4.13.10, again 100% correct Upgrade to 4.14.6, and it failty again. net ads info (random server, only shows correct info on the DS with FSMO.) net ads info -S server.fdqn again, only shows correct info on the DS with FSMO. ping me if someone needs more info.
Created attachment 17137 [details] Untested "raw" patch. Peter, sorry for the long inactivity on this bug. Can you test the attached "raw" patch if you can still reproduce the problem. It's in the right area and the current code is certainly wrong, so I'm hoping this might be the missing fix.
Created attachment 17138 [details] Untested "raw" patch that actually compiles :-). This one compiles, sorry :-).
(In reply to Jeremy Allison from comment #24) Looking good! (Testing Samba 4.15.4 with that patch): # /liu/bin/net ads info LDAP server: 130.236.30.3 LDAP server name: lkpdc2.ad.liu.se Realm: AD.LIU.SE Bind Path: dc=AD,dc=LIU,dc=SE LDAP port: 389 Server time: Sat, 29 Jan 2022 01:05:56 CET KDC server: 130.236.30.3 Server time offset: 0 Last machine account password change: Tue, 22 May 2018 13:39:32 CEST # /liu/bin/net ads info -S lkpdc1.ad.liu.se LDAP server: 130.236.30.3 LDAP server name: lkpdc2.ad.liu.se Realm: AD.LIU.SE Bind Path: dc=AD,dc=LIU,dc=SE LDAP port: 389 Server time: Sat, 29 Jan 2022 01:06:03 CET KDC server: 130.236.30.35 Server time offset: 0 Last machine account password change: Tue, 22 May 2018 13:39:32 CEST # /liu/bin/net ads info -S lkpdc3-2019.ad.liu.se LDAP server: 130.236.30.3 LDAP server name: lkpdc2.ad.liu.se Realm: AD.LIU.SE Bind Path: dc=AD,dc=LIU,dc=SE LDAP port: 389 Server time: Sat, 29 Jan 2022 01:06:14 CET KDC server: 130.236.30.2 Server time offset: 0 Last machine account password change: Tue, 22 May 2018 13:39:32 CEST Trying with a server that I've firewall'd away on the server (it's running, but Samba can't reach it) gives this though: # /liu/bin/net ads info -S lkpdc4-2019.ad.liu.se Failed to get server's current time! LDAP server: 0.0.0.0 LDAP server name: lkpdc2.ad.liu.se Realm: AD.LIU.SE Bind Path: dc=AD,dc=LIU,dc=SE LDAP port: 0 Server time: Thu, 01 Jan 1970 01:00:00 CET KDC server: 130.236.30.3 Server time offset: 0 Last machine account password change: Tue, 22 May 2018 13:39:32 CEST # /liu/bin/net ads info -S lkpdc4-2019.ad.liu.se ads_connect: No logon servers are currently available to service the logon request. ads_connect: No logon servers are currently available to service the logon request. Didn't find the ldap server! (and then it gives the same ads_connect error if I try again quickly, until I wait a little and then I get the first error again). Probably not important but I thought I'd mention it anyway. And I should remove that firewall rule... :-)
I think it's OK to give the 0.0.0.0 address if it can't actually reach it. I don't think it will set the address in the AD struct until it's gotten a good connection.
(In reply to Peter Eriksson from comment #25) Can you please rerun the exact same tests in the same configuration but without the patch? Just wamt to make sure the fix is really in that patch and not in the newer Samba version you're using in this test compared to the original report. Thanks!
(In reply to Ralph Böhme from comment #27) 4.15.4 without the patch shows the bug: # /liu/bin/net ads info LDAP server: 0.0.0.0 LDAP server name: (null) Realm: (null) Bind Path: (null) LDAP port: 0 Server time: Sat, 29 Jan 2022 17:41:48 CET KDC server: 130.236.30.35 Server time offset: 0 Last machine account password change: Thu, 01 Jan 1970 01:00:00 CET # /liu/bin/net ads info -S lkpdc1.ad.liu.se LDAP server: 0.0.0.0 LDAP server name: (null) Realm: (null) Bind Path: (null) LDAP port: 0 Server time: Sat, 29 Jan 2022 17:42:03 CET KDC server: 130.236.30.35 Server time offset: 0 Last machine account password change: Thu, 01 Jan 1970 01:00:00 CET # /liu/bin/net ads info -S lkpdc2.ad.liu.se LDAP server: 130.236.30.3 LDAP server name: lkpdc2.ad.liu.se Realm: AD.LIU.SE Bind Path: dc=AD,dc=LIU,dc=SE LDAP port: 389 Server time: Sat, 29 Jan 2022 17:42:06 CET KDC server: 130.236.30.3 Server time offset: 0 Last machine account password change: Tue, 11 Sep 2018 08:47:49 CEST # /liu/bin/net ads info -S lkpdc3-2019.ad.liu.se LDAP server: 0.0.0.0 LDAP server name: (null) Realm: (null) Bind Path: (null) LDAP port: 0 Server time: Sat, 29 Jan 2022 17:42:10 CET KDC server: 130.236.30.2 Server time offset: 0 Last machine account password change: Thu, 01 Jan 1970 01:00:00 CET
It seems the patches being discussed at MR 2354 don't fix all cases, as I'm still seeing this issue at a customer site. I was able to spend some time in the debugger but didn't have enough time to identify the root cause, but I was able to figure out that socket info was already reset to 0 coming out of ads_startup_nobind() in net_ads_info(). This is totally strange as the log had the line Successfully contacted LDAP server x.x.x.x in ad_connect() and closest-dc was set and at the end I see ads_connect: leaving with: ... ss : 1.2.3.4 (redacted known good IP) ... However, it seems when ads_current_time() calls into ad_connect() again, suddenly ss is zeroed out which doesn't make sense.
(In reply to Ralph Böhme from comment #29) That code (ads_current_time()) is the devils own work. I'll try and look again at it sometime.
This bug was referenced in samba master: c266ed40aeb1b1f59a1811cd4511e32e44a4a719 3ee690455eb963dedc7955b79316481387d4ac8c
Created attachment 17155 [details] git-am fix for 4.16.rcNext, 4.15.next, 4.14.next Cherry-picked from master. Applies cleanly to 4.16.rcNext, 4.15.next, 4.14.next.
Reassigning to Jule for inclusion in 4.14, 4.15 and 4.16. Please don't close the bug after the patches have landed and reassign to me. Thanks!
This bug was referenced in samba v4-14-test: 1cfb3b85f4929e8aea0641f34b2d2b06a30fc03c e59bf6bb2c3fc1a2f9f37bdc587b7e0616f95613
This bug was referenced in samba v4-16-test: 8cbf38a1b2ba83cde9529d7e9d71bc8fce449293 188b96164c5849e28d0349adac6472e2bc4bd000
This bug was referenced in samba v4-15-test: a2fe8bf5a5d0889dc53697904fd647812dc74c8c ca94bad07bc1218fa406f1629536b5f3e7295cd8
(In reply to Ralph Böhme from comment #33) Reassigning to Ralph.
This bug was referenced in samba v4-16-stable (Release samba-4.16.0rc3): 8cbf38a1b2ba83cde9529d7e9d71bc8fce449293 188b96164c5849e28d0349adac6472e2bc4bd000
This bug was referenced in samba v4-15-stable (Release samba-4.15.6): a2fe8bf5a5d0889dc53697904fd647812dc74c8c ca94bad07bc1218fa406f1629536b5f3e7295cd8
This bug was referenced in samba v4-14-stable (Release samba-4.14.13): 1cfb3b85f4929e8aea0641f34b2d2b06a30fc03c e59bf6bb2c3fc1a2f9f37bdc587b7e0616f95613
This bug was referenced in samba master: a26f535dedc651afa2a25dd37113ac71787197ff
Created attachment 17301 [details] Patch for 4.14 next, 4.15 next and 4.16 rc
Reassigning to Jule for inclusion in 4.15 and 4.16.
Pushed to autobuild-v4-{16,15}-test. Reviewed-by tag added.
This bug was referenced in samba v4-16-test: 53ac81eef24f1c60d2d9cdc9c5f21ade32275d81
This bug was referenced in samba v4-15-test: c53efe8b48166a9ba7b42f37a7e9e712fba13486
Closing out bug report. Thanks!
This bug was referenced in samba v4-16-stable (Release samba-4.16.2): 53ac81eef24f1c60d2d9cdc9c5f21ade32275d81
This bug was referenced in samba v4-15-stable (Release samba-4.15.8): c53efe8b48166a9ba7b42f37a7e9e712fba13486