Version 3.4.0pre2 on Linux 2.6.25.3. When the server is not specified, the error message is: "Failed to join domain: failed to find DC for domain [domain]". When the server is specified by address with "-S", the error message is: "Packet send failed to 0.0.7.209(137) ERRNO=Invalid argument", followed by "Failed to join domain: failed to connect to AD: No logon servers". When the server is specified by name with "-S", the error message is: "Failed to join domain: failed to connect to AD: No logon servers". Debug logs to follow.
Created attachment 4240 [details] Log from net -d 10 (no server specified)
Created attachment 4241 [details] Log from net -d 10 (server specified by address)
Created attachment 4242 [details] Log from net -d 10 (server specified by name)
Chris, I have had this working myself. So a few questions... 1) What was the command line that you used for the join? 2) Have you verified that Kerberos will work over IPv6 to the AD DC? 3) Have you verified that you can do DNS lookups to the AD DC? Thanks, David
1. The commands used were `net ads join -U chaz`, `net ads join -U chaz -S fmusptwdc002.keim.vsix.me` and `net ads join -U chaz -S 2001:470:8a93:2::4`. 2. I was able to successfully obtain a ticket with kinit. 3. The former implies that DNS resolution is working, and DNS lookups also work fine on the host: `dig +short _ldap._tcp.keim.vsix.me. in srv` returns `0 100 389 fmusptwdc002.keim.vsix.me.` Thanks, Chris
Uhm, I think I know what's going on here: we are hitting a code path that is not IPv6 ready: lib/util/util_net.c: interpret_addr(). Still investigating the 2nd and 3rd traces.
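One possible reading of the 0.0.7.209 address in the original report, offered as an assumption rather than a confirmed diagnosis: if an IPv4-only parser consumed only the leading decimal digits of 2001:470:8a93:2::4 and treated 2001 as a 32-bit address, the result would be 0.0.7.209, because 2001 = 7 * 256 + 209. The sketch below only demonstrates that arithmetic. The helper names are hypothetical and this is not the code of interpret_addr().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read only the leading decimal digits of a string.
 * strtoul() stops at the first character that is not a digit, so for an
 * IPv6 literal it stops at the first ':'. */
static uint32_t leading_number(const char *s)
{
	assert(s != NULL);
	return (uint32_t)strtoul(s, NULL, 10);
}

/* Split a 32-bit value into four octets, most significant octet first. */
static void to_octets(uint32_t v, uint8_t o[4])
{
	o[0] = (uint8_t)((v >> 24) & 0xff);
	o[1] = (uint8_t)((v >> 16) & 0xff);
	o[2] = (uint8_t)((v >> 8) & 0xff);
	o[3] = (uint8_t)(v & 0xff);
}

/* Print the dotted quad that results from misreading the string as a number. */
static void print_misparsed(const char *s)
{
	uint8_t o[4];

	to_octets(leading_number(s), o);
	printf("%u.%u.%u.%u\n", (unsigned)o[0], (unsigned)o[1],
	       (unsigned)o[2], (unsigned)o[3]);
}
```

Calling print_misparsed("2001:470:8a93:2::4") prints 0.0.7.209, which matches the address in the "Packet send failed" message.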
There are also other functions used by net that are not ipv6 ready :/ ads_cldap_netlogon() in source3/libads/cldap.c has a revealing comment: /* TODO: support ipv6 */
Looks like open_udp_socket() is not IPv6 clean... Jeremy.
(In reply to comment #8) > Looks like open_udp_socket() is not IPv6 clean... > Jeremy. It's more than that, it looks like all the client cldap code is ipv4 only. Simo.
Created attachment 4263 [details] Patch to make open_udp_socket() IPv6 clean. This patch will make open_udp_socket() IPv6 clean. It might be enough to fix the bug; if someone with a reproducible test setup could check, I'd appreciate it. Thanks ! Jeremy.
(In reply to comment #9) > (In reply to comment #8) > > Looks like open_udp_socket() is not IPv6 clean... > > Jeremy. > > It's more than that, it looks like all the client cldap code is ipv4 only. I don't see any other dependencies on IPv4 in the cldap code other than opening the udp socket (which I should just have fixed). All hostnames are strings in the cldap packet itself, so I'm hoping the patch I posted should be enough. What other IPv4-only code do you see in libads/cldap.c ? Jeremy.
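For readers unfamiliar with what "IPv6 clean" means for a UDP socket helper, here is a minimal sketch of the general technique: resolve the server with getaddrinfo() using AF_UNSPEC, then create the socket with whatever family the resolver returns. This is not the patch in attachment 4263. The function name udp_socket_any_family() is hypothetical and the error handling is deliberately simple.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a UDP socket connected to host:port without
 * assuming an address family. Returns the socket, or -1 on failure. */
static int udp_socket_any_family(const char *host, unsigned port)
{
	struct addrinfo hints;
	struct addrinfo *res = NULL;
	struct addrinfo *ai;
	char service[16];
	int sock = -1;

	assert(host != NULL);

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 results */
	hints.ai_socktype = SOCK_DGRAM;  /* UDP */
	snprintf(service, sizeof(service), "%u", port);

	if (getaddrinfo(host, service, &hints, &res) != 0) {
		return -1;
	}

	/* Try each returned address until one yields a connected socket. */
	for (ai = res; ai != NULL; ai = ai->ai_next) {
		sock = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
		if (sock == -1) {
			continue;
		}
		if (connect(sock, ai->ai_addr, ai->ai_addrlen) == 0) {
			break;
		}
		close(sock);
		sock = -1;
	}

	freeaddrinfo(res);
	return sock;
}
```

The point of the design is that no sockaddr_in or AF_INET appears anywhere: the caller passes a name or a literal of either family, and the resolver decides which socket type to create.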
(In reply to comment #11) > (In reply to comment #9) > > (In reply to comment #8) > > > Looks like open_udp_socket() is not IPv6 clean... > > > Jeremy. > > > > It's more than that, it looks like all the client cldap code is ipv4 only. > > I don't see any other dependencies on IPv4 in the cldap code other than opening > the udp socket (which I should just have fixed). All hostnames are strings in > the cldap packet itself, so I'm hoping the patch I posted should be enough. > > What other IPv4-only code do you see in libads/cldap.c ? I traced the fault to source3/libads/cldap.c:ads_cldap_netlogon(). There interpret_addr2 is used, and it is just a wrapper for lib/util/util_net.c:interpret_addr(), which is an IPv4 only function as well. In ads_cldap_netlogon() the use of is_zero_ip_v4() is also quite revealing, and the following inet_ntop() and tsocket_address_inet_from_strings(mem_ctx, "ipv4", ... are also quite explicit. Simo. This is an ipv4 only function. Other code uses ipv4 only functions to resolve addresses.
Are we looking at the same code here ? I'm looking in the 3.4 ads_cldap_netlogon() function inside libads/cldap.c at line 243. I don't see any of the functions interpret_addr2 or is_zero_ip_v4() in that code at all ? This is what it looks like in 3.4.x (below). The only function that needed fixing to be IPv6 clean that I could see is open_udp_socket(), which is what the patch addresses. Jeremy.
239 /*******************************************************************
240   do a cldap netlogon query.  Always 389/udp
241 *******************************************************************/
242
243 bool ads_cldap_netlogon(TALLOC_CTX *mem_ctx,
244 			const char *server,
245 			const char *realm,
246 			uint32_t nt_version,
247 			struct netlogon_samlogon_response **reply)
248 {
249 	int sock;
250 	int ret;
251
252 	sock = open_udp_socket(server, LDAP_PORT );
253 	if (sock == -1) {
254 		DEBUG(2,("ads_cldap_netlogon: Failed to open udp socket to %s\n",
255 			 server));
256 		return False;
257 	}
258
259 	ret = send_cldap_netlogon(mem_ctx, sock, realm, global_myname(), nt_version);
260 	if (ret != 0) {
261 		close(sock);
262 		return False;
263 	}
264 	ret = recv_cldap_netlogon(mem_ctx, sock, nt_version, reply);
265 	close(sock);
266
267 	if (ret == -1) {
268 		return False;
269 	}
270
271 	return True;
272 }
(In reply to comment #13) > Are we looking at the same code here ? I'm looking in the 3.4 > ads_cldap_netlogon() function inside libads/cldap.c at line 243. I don't see > any of the functions interpret_addr2 or is_zero_ip_v4() in that code at all ? > > This is what it looks like in 3.4.x (below). The only function that needed > fixing to be IPv6 clean that I could see is open_udp_socket(), which is what > the patch addresses. Oh! I am sorry, I was looking at master; I didn't know master was so different here. It seems that master has extensive regressions then. Simo.
Ah indeed. Looks like this got messed up badly in master. I'll fix. Jeremy.
Jeremy, any chance to look at this again ? ipv6 join would be really nice to have working again for 3.4...
The patch already posted here: https://bugzilla.samba.org/attachment.cgi?id=4263 is for 3.4 and should fix it. Just needs someone with a test env to check it, but it's obvious goodness and should be in 3.4 (just fixes open_udp_socket() to be IPv6 clean). Jeremy.
Thanks, I shall try this today.
Created attachment 4310 [details] 2nd try - Log from net -d 10 (no server specified)
Created attachment 4311 [details] 2nd try - Log from net -d 10 (server specified by name)
Created attachment 4312 [details] 2nd try - Log from net -d 10 (server specified by address)
Unfortunately I was still not able to join the domain with the patch.
Created attachment 4316 [details] Patch to get more info. The problem in your traces is that the reply from the recv_cldap_netlogon() isn't being seen - we always get the message "no reply received to cldap netlogon" printed. I need to know why. This patch will tell me if it was the select that failed, or the read that failed. Can you apply this to v3-4-test and repeat the experiment (and add the logs) please. What would also help is a wireshark capture trace from the box run at the same time also. It seems to be able to correctly connect to the DC using IPv6, but not get a reply from the UDP packet. Jeremy.
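The idea behind a diagnostic like this can be sketched as follows: wait for the reply with select(), then read it, and record which of the two steps failed together with the errno text. This is only an illustration of the technique, not the contents of attachment 4316. The function name recv_reply() and the step labels are hypothetical, and recv() stands in for the read call.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_secs for a datagram on sock and
 * read it into buf. On failure return -1 and set *step to "select",
 * "timeout" or "recv" so the log shows which stage went wrong. */
static ssize_t recv_reply(int sock, void *buf, size_t len,
			  int timeout_secs, const char **step)
{
	fd_set rfds;
	struct timeval tv;
	int ready;
	ssize_t n;

	assert(buf != NULL && step != NULL);

	FD_ZERO(&rfds);
	FD_SET(sock, &rfds);
	tv.tv_sec = timeout_secs;
	tv.tv_usec = 0;

	ready = select(sock + 1, &rfds, NULL, NULL, &tv);
	if (ready == -1) {
		*step = "select";
		fprintf(stderr, "select failed: %s\n", strerror(errno));
		return -1;
	}
	if (ready == 0) {
		*step = "timeout";
		fprintf(stderr, "no reply within %d seconds\n", timeout_secs);
		return -1;
	}

	n = recv(sock, buf, len, 0);
	if (n == -1) {
		*step = "recv";
		fprintf(stderr, "recv failed: %s\n", strerror(errno));
		return -1;
	}

	*step = "ok";
	return n;
}
```

Separating the timeout case from a failed read matters here because a silent timeout and an error reported by the kernel point to different causes.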
Created attachment 4322 [details] 3rd try - Log from net -d 10 (no server specified)
Created attachment 4323 [details] 3rd try - Log from net -d 10 (server specified by name)
Created attachment 4324 [details] 3rd try - Log from net -d 10 (server specified by address)
Created attachment 4325 [details] Packet capture from client (no server specified)
Created attachment 4326 [details] Packet capture from client (server specified by name)
Created attachment 4327 [details] Packet capture from client (server specified by address)
Created attachment 4328 [details] Packet capture from server (no server specified)
Created attachment 4329 [details] Packet capture from server (server specified by name)
Created attachment 4330 [details] Packet capture from server (server specified by address)
Ok, looking at the logs and packet capture traces, I'm seeing the same error in each of them: [2009/06/19 14:14:26, 1] libads/cldap.c:166(recv_cldap_netlogon) no reply received to cldap netlogon (ret = -1: Error = Permission denied) The requesting CLDAP packet is being sent out from the client (as seen by the client packet capture), but nothing is seen at the server. Permission denied is a very strange error to be getting on a read call. Do you have some sort of a firewall, SELinux or AppArmor running that might block IPv6 UDP packets on port 389 from getting out or being received ? I can't see any errors in the Samba code that might cause this. Jeremy.
I do have a firewall running on the client, but not selinux, apparmor or anything like that. The firewall allows all outbound connections without restriction, as well as established connections. Just in case, I tried explicitly adding the domain controller to the firewall, but that did not make any difference. If there's anything else I can try please let me know!
Can you try turning all firewalls off for the duration of the test please ? I don't understand what I'm seeing here. The packet doing the IPv6 UDP CLDAP query is seen by the client capture trace, but not seen on the server side, so no wonder it doesn't respond to it. I don't understand why that is. Is there a firewall running on the server ? Jeremy.
(In reply to comment #35) > Can you try turning all firewalls off for the duration of the test please ? I > don't understand what I'm seeing here. The packet doing the IPv6 UDP CLDAP > query is seen by the client capture trace, but not seen on the server side, so > no wonder it doesn't respond to it. I don't understand why that is. Is there a > firewall running on the server ? > Jeremy. > keim.vsix.me is behind a Cisco firewall, and unfortunately I only had Chris's IP space open on TCP. I have opened it for UDP as well and we will retest. Thanks! Pete
Created attachment 4334 [details] 4th try - Log from net -d 10 (server specified) This time the process got further, but unfortunately it still fails, with the message:- Failed to join domain: failed to connect to AD: Server not found in Kerberos database
Jeremy, I've just tested the open_udp_socket() patch, and it fixes IPv6-only join of today's v3-4-test samba to an ipv6-only win2k8 domain controller. I did not see the "server not found in kerberos database" issue here.
Ok, I see a machine account on the DC, but net ads testjoin still fails with "Cannot contact any KDC for requested realm". smbd spits the same error message.
I think it's clear that the patches listed in this bugid are correct for 3.4.0, in that they correctly fix the IPv6 part of the problem. Karolin, once Guenther reviews, can you add this fix for 3.4.0 ? Thanks, Jeremy.
Patch (attachment id 4263) looks good.
Pushed "Patch to make open_udp_socket() IPv6 clean" to v3-4-test. Jeremy, the "Patch to get more info" should be pushed, too? Can we close the bug report after that?
Yes please push the "Patch to get more info" also as it makes errors comprehensible in the log. Thanks ! Jeremy.
Done. Closing out bug report. Thanks!