Created attachment 17630 [details] Patch for bug attached samba-tool gpo listall and all other samba-tool gpo commands fail with: dns child failed to find name '_ldap._tcp.SMB.ANATHOTH.NET' of type SRV finddcs: Failed to find SRV record for _ldap._tcp.SMB.ANATHOTH.NET ERROR(runtime): uncaught exception - ('Could not find a DC for domain', NTSTATUSError(3221225524, 'The object name is not found.')) File "/usr/lib/python3/dist-packages/samba/netcmd/__init__.py", line 186, in _run return self.run(*args, **kwargs) File "/usr/lib/python3/dist-packages/samba/netcmd/gpo.py", line 469, in run self.url = dc_url(self.lp, self.creds, H) File "/usr/lib/python3/dist-packages/samba/netcmd/gpo.py", line 127, in dc_url raise RuntimeError("Could not find a DC for domain", e) when there is only AAAA records for the DCs in the Samba domain DNS Adding an A record for the DC makes the samba-tool gpo listall command work. Bug found in source4/libcli/resolve/dns_ex.c, all addresses returned should be counted, and not just A records. Correct logic in get_a_aaaa_records() should beto send A query only if initial AAAA query did not return anything. Patch attached.
Patch makes samba-tool gpo listall and other commands function in IPv6 only enviroment, only AAAA record for DCs in Samba DNS
Hi Matthew, this looks good (modulo the variable name change, which I believe Michael Tokarev already commented on. I'll create a gitlab MR for this, which is the way things get into our code base right now. In the meantime, can you send in a Samba developers declaration as specified here: https://www.samba.org/samba/devel/copyright-policy.html so we have a record of you being able to contribute. Thanks ! Jeremy.
OK, Samba DD received. However, this patch causes: [59(410)/67 at 34m28s] samba.blackbox.wbinfo(rodc:local)(rodc:local) 1361UNEXPECTED(failure): samba.blackbox.wbinfo(rodc:local).wbinfo -N against rodc(rodc:local) 1362REASON: Exception: Exception: failed to call wbcResolveWinsByName: WBC_ERR_DOMAIN_NOT_FOUND 1363Could not lookup WINS by name RODC to fail. It might be an interaction with socket_wrapper. I'm investigating further.
Created attachment 17637 [details] git-am fix for master. Matt, this is what works to pass the ci build. I think the underlying problem is that with your fix it puts the IPv6 addresses first, which the ci infrastructure doesn't cope too well with. Can you test this fix instead, it does an IPv4 lookup followed by an IPv6 lookup (if enabled) and so the addresses are in the expected order. If should still work in your IPv6-only setup. Can you test this and confirm ? If you're happy I'll add it as an official MR. Cheers, Jeremy.
Comment on attachment 17637 [details] git-am fix for master. Hang on a minute Matt, the original summary of the problem is incorrect. I think the original logic is good. The original logic does the following: dns_lookup(..., QTYPE_AAAA, &reply); This is then passed to: total = reply_to_addrs(tmp_ctx, &a_num, &addrs, total, reply, port); Now, a_num here is only incremented if the DNS server returns A records as well when asked for AAAA records. total here is always incremented by the number of either A or AAAA records returned. For a name lookup that only has AAAA but no A records (i.e. IPv6-only), then the return from this will be: total = <number of AAAA records> a_num = 0 (there are no A records) We then go into: if (qtype == QTYPE_AAAA && a_num == 0) { /* * DNS server didn't returned A when asked for AAAA records. * Most of the server do it, let's ask for A specificaly. */ err = dns_lookup(tmp_ctx, name, QTYPE_A, &reply); if (!ERR_DNS_IS_OK(err)) { goto done; } total = reply_to_addrs(tmp_ctx, &a_num, &addrs, total, reply, port); } What this is doing is asking again for A records if the original query for AAAA records didn't return any A records. If the return is DNS_OK then it adds any A records to the addrs[] array appending at position addrs[total] and incrementing total by the number of A records added. So for a name with 3 AAAA records and 2 A records we'll end up with: addr[0] = AAAA record1 addr[1] = AAAA record2 addr[2] = AAAA record3 addr[3] = A record1 addr[4] = A record2 total = 5. This is (arguably) correct (though somewhat tortuous) logic. For an IPv6-only name we just have the first 3 entries and total = 3. Is it possible that when the second dns_lookup() request for A records is done it returns a status from the DNS server other than DNS_OK (0) ? That would explain it. Can you add a DEBUG statement here to check ? If that is the case, then the original logic is correct and the fix is simple. The chunk of code above should be changed to: if (qtype == QTYPE_AAAA && a_num == 0) { /* * DNS server didn't returned A when asked for AAAA records. * Most of the server do it, let's ask for A specificaly. */ err = dns_lookup(tmp_ctx, name, QTYPE_A, &reply); if (ERR_DNS_IS_OK(err)) { total = reply_to_addrs(tmp_ctx, &a_num, &addrs, total, reply, port); } } only adding the A records on success, and ignoring any failure. Do you concur ?
(In reply to Jeremy Allison from comment #5) I concur. Got success with the change you suggested above on my set up. samba-tool gpo listall works. As originally as in git source, get the failure to find the DC. Let me repeat, the change you suggested in get_a_aaaa_records(), source4/libcli/resolve/dns_ex.c makes samba-tool gpo listall return GPOs instead of failing with can't find the DC error. Got the DEBUG statement working. That second QTYPE_A query for the IPv4 records where you suggested the change was returning an error code of 10, ERROR_DNS_SOCKET_ERROR according to lib/addns/dnserr.h? Lets see what the MR test hookup does with the vampire_dc set up in that dsrepl test that was failing. Hope thats helpful! Thank you for your reaching out and your help! Its got me past this show stopper bug at work as I have to devops ansible stuff with setting up an AD server with GPO initialisation.
Bugzilla lost this: > Hi Jeremy! > Let me repeat, the change you suggested in get_a_aaaa_records(), > source4/libcli/resolve/dns_ex.c makes samba-tool gpo listall return GPOs > instead of failing with can't find the DC error. > Got the DEBUG statement working. That second QTYPE_A query for the IPv4 > records where you suggested the change was returning an error code of > 10, ERROR_DNS_SOCKET_ERROR according to lib/addns/dnserr.h? > Lets see what the MR test hookup does with the vampire_dc set up in that > dsrepl test that was failing. > Hope thats helpful! > Thank you for your reaching out and your help! Its got me past this show > stopper bug at work as I have to devops ansible stuff with setting up an > AD server with GPO initialisation. > Best Regards, > Matt Grant
Created attachment 17638 [details] Jeremy's tested by Matthew Grant proposed fix From discussion and my testing it looks like this is the fix for the issue. This bug affects a lot more than just the samba-tool gpo code, it also affects much of Samba AD via its usage in the source4 part of the tree.
MR: https://gitlab.com/samba-team/samba/-/merge_requests/2790
This bug was referenced in samba master: 10537a89bb0b461ba31d614b7c9ed56a842422e7
Created attachment 17642 [details] git-am fix for 4.17.next, 4.16.next. Cherry-pick from master. Applies cleanly to 4.17.next, 4.16.next.
One more for the next release...
Reassigning to Jule for inclusion in 4.16 and 4.17.
Pushed to autobuild-v4-{17,16}-test.
This bug was referenced in samba v4-16-test: 7b49569afcb968a5ac4b4fdd96480bb7b8ab01b7
This bug was referenced in samba v4-17-test: 02e63b6d336f3f8ece424cfe0577d4aad5f1069b
Closing out bug report. Thanks!
This bug was referenced in samba v4-17-stable (Release samba-4.17.5): 02e63b6d336f3f8ece424cfe0577d4aad5f1069b
This bug was referenced in samba v4-16-stable (Release samba-4.16.9): 7b49569afcb968a5ac4b4fdd96480bb7b8ab01b7