Bug 6437 - Unable to join IPv6-only ads domain
Summary: Unable to join IPv6-only ads domain
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.4
Classification: Unclassified
Component: Domain Control
Version: unspecified
Hardware: x86 Linux
Importance: P3 critical
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-06-04 19:30 UTC by Chris Hills
Modified: 2009-07-30 01:57 UTC
CC List: 3 users

See Also:
gd: review+
vl: review+


Attachments
Log from net -d 10 (no server specified) (11.55 KB, text/plain), 2009-06-04 19:31 UTC, Chris Hills
Log from net -d 10 (server specified by address) (132.03 KB, text/plain), 2009-06-04 19:31 UTC, Chris Hills
Log from net -d 10 (server specified by name) (130.51 KB, text/plain), 2009-06-04 19:32 UTC, Chris Hills
Patch to make open_udp_socket() IPv6 clean. (1.49 KB, patch), 2009-06-08 14:38 UTC, Jeremy Allison
2nd try - Log from net -d 10 (no server specified) (14.28 KB, text/plain), 2009-06-18 13:00 UTC, Chris Hills
2nd try - Log from net -d 10 (server specified by name) (133.91 KB, text/plain), 2009-06-18 13:01 UTC, Chris Hills
2nd try - Log from net -d 10 (server specified by address) (132.46 KB, text/plain), 2009-06-18 13:01 UTC, Chris Hills
Patch to get more info. (1.10 KB, patch), 2009-06-18 18:47 UTC, Jeremy Allison
3rd try - Log from net -d 10 (no server specified) (14.38 KB, text/plain), 2009-06-19 08:12 UTC, Chris Hills
3rd try - Log from net -d 10 (server specified by name) (133.28 KB, text/plain), 2009-06-19 08:13 UTC, Chris Hills
3rd try - Log from net -d 10 (server specified by address) (132.11 KB, text/plain), 2009-06-19 08:13 UTC, Chris Hills
Packet capture from client (no server specified) (506 bytes, application/octet-stream), 2009-06-19 08:14 UTC, Chris Hills
Packet capture from client (server specified by name) (133.28 KB, application/octet-stream), 2009-06-19 08:14 UTC, Chris Hills
Packet capture from client (server specified by address) (12.08 KB, application/octet-stream), 2009-06-19 08:15 UTC, Chris Hills
Packet capture from server (no server specified) (377 bytes, application/octet-stream), 2009-06-19 08:19 UTC, Peter Grace
Packet capture from server (server specified by name) (12.58 KB, application/octet-stream), 2009-06-19 08:20 UTC, Peter Grace
Packet capture from server (server specified by address) (12.75 KB, application/octet-stream), 2009-06-19 08:20 UTC, Peter Grace
4th try - Log from net -d 10 (server specified) (122.12 KB, text/plain), 2009-06-20 03:42 UTC, Chris Hills

Description Chris Hills 2009-06-04 19:30:13 UTC
Version 3.4.0pre2 on Linux 2.6.25.3

When the server is not specified, the error message is:-
Failed to join domain: failed to find DC for domain [domain]

When the server is specified by address with "-S" the error message is:-
  Packet send failed to 0.0.7.209(137) ERRNO=Invalid argument
Failed to join domain: failed to connect to AD: No logon servers

When the server is specified by name with "-S" the error message is:-
Failed to join domain: failed to connect to AD: No logon servers

Debug logs to follow.
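
An observation on the "Packet send failed" line: 0.0.7.209 is the 32-bit value 2001 written as a dotted quad (7 * 256 + 209 = 2001), and 2001 is the leading group of the IPv6 address given to -S when that group is read as a decimal number. The log does not show how the address string was parsed, so the following is only a sketch of the arithmetic, not of Samba's code; the helper name is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical helper: format a host-order 32-bit value as a dotted quad.
 * Returns buf on success, NULL on failure. */
static const char *u32_to_dotted_quad(uint32_t value, char *buf, size_t len)
{
        struct in_addr a;

        a.s_addr = htonl(value);        /* in_addr holds network byte order */
        return inet_ntop(AF_INET, &a, buf, (socklen_t)len);
}
```

Called with the value 2001 and a buffer of INET_ADDRSTRLEN bytes, this produces the string "0.0.7.209", which matches the address in the error message.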
Comment 1 Chris Hills 2009-06-04 19:31:10 UTC
Created attachment 4240
Log from net -d 10 (no server specified)
Comment 2 Chris Hills 2009-06-04 19:31:42 UTC
Created attachment 4241
Log from net -d 10 (server specified by address)
Comment 3 Chris Hills 2009-06-04 19:32:12 UTC
Created attachment 4242
Log from net -d 10 (server specified by name)
Comment 4 David Holder 2009-06-06 08:06:04 UTC
Chris,

I have had this working myself. So a few questions...

1) What was the command line that you used for the join?
2) Have you verified that Kerberos will work over IPv6 to the AD DC?
3) Have you verified that you can do DNS lookups to the AD DC?

Thanks,
David
Comment 5 Chris Hills 2009-06-06 09:15:05 UTC
1. The commands used were `net ads join -U chaz`, `net ads join -U chaz -S fmusptwdc002.keim.vsix.me` and `net ads join -U chaz -S 2001:470:8a93:2::4`
2. I was able to successfully obtain a ticket with kinit.
3. The former implies that DNS resolution is working, and DNS lookups work fine on the host:-
$ dig +short _ldap._tcp.keim.vsix.me. in srv
0 100 389 fmusptwdc002.keim.vsix.me.

Thanks,
Chris
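
For reference, the dig answer above is an SRV record in the usual priority, weight, port, target order. A small sketch of reading such a line follows; the struct and function names are hypothetical and are not taken from Samba.

```c
#include <stdio.h>

/* Hypothetical container for one line of `dig +short ... in srv` output. */
struct srv_record {
        unsigned priority;
        unsigned weight;
        unsigned port;
        char target[256];
};

/* Parse "priority weight port target"; returns 0 on success, -1 otherwise. */
static int parse_srv_line(const char *line, struct srv_record *out)
{
        int n = sscanf(line, "%u %u %u %255s",
                       &out->priority, &out->weight, &out->port, out->target);

        return (n == 4) ? 0 : -1;
}
```

Applied to the line shown, this gives priority 0, weight 100, port 389 (LDAP) and the DC's host name as the target.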
Comment 6 Simo Sorce 2009-06-06 09:28:19 UTC
Uhm, I think I know what's going on here: we are hitting a code path that is not IPv6 ready: lib/util/util_net.c: interpret_addr()

Still investigating 2nd and 3rd trace
Comment 7 Simo Sorce 2009-06-06 09:34:31 UTC
There are also other functions used by net that are not ipv6 ready :/

ads_cldap_netlogon() in source3/libads/cldap.c has a revealing comment:
/* TODO: support ipv6 */
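
To illustrate what "IPv6 ready" address parsing can look like in general, here is a minimal sketch that accepts either an IPv4 or an IPv6 literal via getaddrinfo(). It is not Samba's interpret_addr() or its replacement; the function name is hypothetical.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: parse a numeric IPv4 or IPv6 literal into *out.
 * Returns AF_INET or AF_INET6 on success, -1 if the string is not an
 * address literal. AI_NUMERICHOST prevents any DNS lookup. */
static int parse_addr_literal(const char *str, struct sockaddr_storage *out)
{
        struct addrinfo hints;
        struct addrinfo *res = NULL;
        int family;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;      /* accept either family */
        hints.ai_socktype = SOCK_DGRAM;
        hints.ai_flags = AI_NUMERICHOST;

        if (getaddrinfo(str, NULL, &hints, &res) != 0 || res == NULL) {
                return -1;
        }
        memset(out, 0, sizeof(*out));
        memcpy(out, res->ai_addr, res->ai_addrlen);
        family = res->ai_family;
        freeaddrinfo(res);
        return family;
}
```

Storing the result in a struct sockaddr_storage rather than a struct in_addr is the design point: the caller does not need to know the family in advance.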
Comment 8 Jeremy Allison 2009-06-08 14:16:27 UTC
Looks like open_udp_socket() is not IPv6 clean...
Jeremy.
Comment 9 Simo Sorce 2009-06-08 14:30:51 UTC
(In reply to comment #8)
> Looks like open_udp_socket() is not IPv6 clean...
> Jeremy.

It's more than that, it looks like all the client cldap code is ipv4 only.

Simo.
Comment 10 Jeremy Allison 2009-06-08 14:38:28 UTC
Created attachment 4263
Patch to make open_udp_socket() IPv6 clean.

This patch will make open_udp_socket() IPv6 clean. It might be enough to fix the bug; if someone with a reproducible test setup could check, I'd appreciate it.
Thanks !
Jeremy.
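
For readers without the attachment to hand, the general shape of a family-agnostic UDP open can be sketched as below. This is an illustration under assumptions, not the attached patch: the function name is hypothetical, and the real open_udp_socket() may differ in its details.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch of a family-agnostic UDP "connect"; not the attached patch.
 * Returns a connected datagram socket, or -1 on failure. */
static int open_udp_socket_any(const char *host, int port)
{
        struct addrinfo hints;
        struct addrinfo *res = NULL;
        struct addrinfo *ai;
        char service[16];
        int sock = -1;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;     /* IPv4 or IPv6, whatever resolves */
        hints.ai_socktype = SOCK_DGRAM;

        snprintf(service, sizeof(service), "%d", port);
        if (getaddrinfo(host, service, &hints, &res) != 0) {
                return -1;
        }
        for (ai = res; ai != NULL; ai = ai->ai_next) {
                sock = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
                if (sock == -1) {
                        continue;
                }
                if (connect(sock, ai->ai_addr, ai->ai_addrlen) == 0) {
                        break;          /* first usable address wins */
                }
                close(sock);
                sock = -1;
        }
        freeaddrinfo(res);
        return sock;
}
```

The socket family comes from the resolved address instead of being fixed to AF_INET, so the same call works for a name, an IPv4 literal or an IPv6 literal.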
Comment 11 Jeremy Allison 2009-06-08 14:45:23 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > Looks like open_udp_socket() is not IPv6 clean...
> > Jeremy.
> 
> It's more than that, it looks like all the client cldap code is ipv4 only.

I don't see any other dependencies on IPv4 in the cldap code other than opening the udp socket (which I should just have fixed). All hostnames are strings in the cldap packet itself, so I'm hoping the patch I posted should be enough.

What other IPv4-only code do you see in libads/cldap.c ?

Jeremy.


Comment 12 Simo Sorce 2009-06-08 15:31:02 UTC
(In reply to comment #11)
> (In reply to comment #9)
> > (In reply to comment #8)
> > > Looks like open_udp_socket() is not IPv6 clean...
> > > Jeremy.
> > 
> > It's more than that, it looks like all the client cldap code is ipv4 only.
> 
> I don't see any other dependencies on IPv4 in the cldap code other than opening
> the udp socket (which I should just have fixed). All hostnames are strings in
> the cldap packet itself, so I'm hoping the patch I posted should be enough.
> 
> What other IPv4-only code do you see in libads/cldap.c ?

I traced the fault to the use of:
source3/libads/cldap.c:ads_cldap_netlogon()

There interpret_addr2 is used and it is just a wrapper for:
lib/util/util_net.c:interpret_addr()

Which is an IPv4 only function as well.

In ads_cldap_netlogon() the function is_zero_ip_v4() also is quite revealing, as well as the following inet_ntop()

tsocket_address_inet_from_strings(mem_ctx, "ipv4", ... is also quite explicit.

Simo.


This is an ipv4 only function.

Other code uses ipv4 only functions to resolve addresses.
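
As a contrast to an IPv4-only check such as is_zero_ip_v4(), a family-aware "is this the zero address" test could look roughly like this. The function name is hypothetical and the code is a sketch, not what master contains.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical family-aware counterpart to an IPv4-only zero-address test. */
static bool is_unspecified_addr(const struct sockaddr_storage *ss)
{
        if (ss->ss_family == AF_INET) {
                const struct sockaddr_in *in4 = (const struct sockaddr_in *)ss;

                return in4->sin_addr.s_addr == htonl(INADDR_ANY);
        }
        if (ss->ss_family == AF_INET6) {
                const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)ss;

                return IN6_IS_ADDR_UNSPECIFIED(&in6->sin6_addr) != 0;
        }
        return false;   /* unknown family: not treated as a zero address */
}
```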
Comment 13 Jeremy Allison 2009-06-08 15:37:16 UTC
Are we looking at the same code here ? I'm looking in the 3.4 ads_cldap_netlogon() function inside libads/cldap.c at line 243. I don't see any of the functions interpret_addr2 or is_zero_ip_v4() in that code at all ?

This is what it looks like in 3.4.x (below). The only function that needed fixing to be IPv6 clean that I could see is open_udp_socket(), which is what the patch addresses.

Jeremy.

239 /*******************************************************************
240   do a cldap netlogon query.  Always 389/udp
241 *******************************************************************/
242 
243 bool ads_cldap_netlogon(TALLOC_CTX *mem_ctx,
244                         const char *server,
245                         const char *realm,
246                         uint32_t nt_version,
247                         struct netlogon_samlogon_response **reply)
248 {
249         int sock;
250         int ret;
251 
252         sock = open_udp_socket(server, LDAP_PORT );
253         if (sock == -1) {
254                 DEBUG(2,("ads_cldap_netlogon: Failed to open udp socket to %s\n", 
255                          server));
256                 return False;
257         }
258 
259         ret = send_cldap_netlogon(mem_ctx, sock, realm, global_myname(), nt_version);
260         if (ret != 0) {
261                 close(sock);
262                 return False;
263         }
264         ret = recv_cldap_netlogon(mem_ctx, sock, nt_version, reply);
265         close(sock);
266 
267         if (ret == -1) {
268                 return False;
269         }
270 
271         return True;
272 }
Comment 14 Simo Sorce 2009-06-08 16:27:47 UTC
(In reply to comment #13)
> Are we looking at the same code here ? I'm looking in the 3.4
> ads_cldap_netlogon() function inside libads/cldap.c at line 243. I don't see
> any of the functions interpret_addr2 or is_zero_ip_v4() in that code at all ?
> 
> This is what it looks like in 3.4.x (below). The only function that needed
> fixing to be IPv6 clean that I could see is open_udp_socket(), which is what
> the patch addresses.

Oh!
I am sorry, I was looking at master; I didn't know master was so different here. It seems that master has extensive regressions, then.

Simo.
Comment 15 Jeremy Allison 2009-06-08 16:34:40 UTC
Ah indeed. Looks like this got messed up badly in master. I'll fix.
Jeremy.
Comment 16 Guenther Deschner 2009-06-18 10:43:54 UTC
Jeremy, any chance to look at this again ? ipv6 join would be really nice to have working again for 3.4...
Comment 17 Jeremy Allison 2009-06-18 11:33:22 UTC
The patch already posted here:

https://bugzilla.samba.org/attachment.cgi?id=4263

is for 3.4 and should fix it. It just needs someone with a test env to check it, but it's obvious goodness and should be in 3.4 (it just fixes open_udp_socket() to be IPv6 clean).

Jeremy.
Comment 18 Chris Hills 2009-06-18 11:39:42 UTC
Thanks, I shall try this today.
Comment 19 Chris Hills 2009-06-18 13:00:30 UTC
Created attachment 4310
2nd try - Log from net -d 10 (no server specified)
Comment 20 Chris Hills 2009-06-18 13:01:03 UTC
Created attachment 4311
2nd try - Log from net -d 10 (server specified by name)
Comment 21 Chris Hills 2009-06-18 13:01:26 UTC
Created attachment 4312
2nd try - Log from net -d 10 (server specified by address)
Comment 22 Chris Hills 2009-06-18 13:02:09 UTC
Unfortunately I was still not able to join the domain with the patch.
Comment 23 Jeremy Allison 2009-06-18 18:47:36 UTC
Created attachment 4316
Patch to get more info.

The problem in your traces is that the reply isn't being seen by recv_cldap_netlogon(): we always get the message "no reply received to cldap netlogon" printed. I need to know why. This patch will tell me whether it was the select that failed or the read that failed. Can you apply it to v3-4-test and repeat the experiment (and add the logs), please? A wireshark capture trace from the box, run at the same time, would also help. It seems to be able to connect to the DC correctly using IPv6, but not to get a reply to the UDP packet.
Jeremy.
Comment 24 Chris Hills 2009-06-19 08:12:52 UTC
Created attachment 4322
3rd try - Log from net -d 10 (no server specified)
Comment 25 Chris Hills 2009-06-19 08:13:21 UTC
Created attachment 4323
3rd try - Log from net -d 10 (server specified by name)
Comment 26 Chris Hills 2009-06-19 08:13:49 UTC
Created attachment 4324
3rd try - Log from net -d 10 (server specified by address)
Comment 27 Chris Hills 2009-06-19 08:14:19 UTC
Created attachment 4325
Packet capture from client (no server specified)
Comment 28 Chris Hills 2009-06-19 08:14:46 UTC
Created attachment 4326
Packet capture from client (server specified by name)
Comment 29 Chris Hills 2009-06-19 08:15:11 UTC
Created attachment 4327
Packet capture from client (server specified by address)
Comment 30 Peter Grace 2009-06-19 08:19:45 UTC
Created attachment 4328
Packet capture from server (no server specified)
Comment 31 Peter Grace 2009-06-19 08:20:09 UTC
Created attachment 4329
Packet capture from server (server specified by name)
Comment 32 Peter Grace 2009-06-19 08:20:29 UTC
Created attachment 4330
Packet capture from server (server specified by address)
Comment 33 Jeremy Allison 2009-06-19 15:30:35 UTC
Ok, looking at the logs and packet capture traces, I'm seeing the same error in each of them:

[2009/06/19 14:14:26,  1] libads/cldap.c:166(recv_cldap_netlogon)
  no reply received to cldap netlogon (ret = -1: Error = Permission denied)

The requesting CLDAP packet is being sent out from the client (as seen by the client packet capture), but nothing is seen at the server.

Permission denied is a very strange error to be getting on a read call.

Do you have some sort of firewall, SELinux or AppArmor running that might block IPv6 UDP packets on port 389 from getting out or being received?

I can't see any errors in the Samba code that might cause this.

Jeremy.
Comment 34 Chris Hills 2009-06-19 17:50:03 UTC
I do have a firewall running on the client, but not selinux, apparmor or anything like that. The firewall allows all outbound connections without restriction, as well as established connections. Just in case, I tried explicitly adding the domain controller to the firewall, but that did not make any difference. If there's anything else I can try please let me know!
Comment 35 Jeremy Allison 2009-06-19 18:02:24 UTC
Can you try turning all firewalls off for the duration of the test please ? I don't understand what I'm seeing here. The packet doing the IPv6 UDP CLDAP query is seen by the client capture trace, but not seen on the server side, so no wonder it doesn't respond to it. I don't understand why that is. Is there a firewall running on the server ?
Jeremy.


Comment 36 Peter Grace 2009-06-19 21:59:22 UTC
(In reply to comment #35)
> Can you try turning all firewalls off for the duration of the test please ? I
> don't understand what I'm seeing here. The packet doing the IPv6 UDP CLDAP
> query is seen by the client capture trace, but not seen on the server side, so
> no wonder it doesn't respond to it. I don't understand why that is. Is there a
> firewall running on the server ?
> Jeremy.
> 

keim.vsix.me is behind a Cisco firewall; unfortunately I had only opened Chris's IP space for TCP. I have now opened it for UDP as well and we will retest.

Thanks!
Pete
Comment 37 Chris Hills 2009-06-20 03:42:43 UTC
Created attachment 4334
4th try - Log from net -d 10 (server specified)

This time the process got further, but unfortunately it still fails, with the message:-
Failed to join domain: failed to connect to AD: Server not found in Kerberos database
Comment 38 Kai Blin 2009-06-27 07:48:52 UTC
Jeremy, I've just tested the open_udp_socket() patch, and it fixes IPv6-only join of today's v3-4-test samba to an ipv6-only win2k8 domain controller. I did not see the "server not found in kerberos database" issue here.
Comment 39 Kai Blin 2009-06-27 08:11:56 UTC
Ok, I see a machine account on the DC, but net ads testjoin still fails with "Cannot contact any KDC for requested realm". smbd spits the same error message.
Comment 40 Jeremy Allison 2009-06-29 19:36:47 UTC
I think it's clear that the patches listed in this bugid are correct for 3.4.0, in that they correctly fix the IPv6 part of the problem.
Karolin, once Guenther reviews, can you add this fix for 3.4.0?
Thanks,
Jeremy.

Comment 41 Guenther Deschner 2009-06-30 07:15:27 UTC
Patch (attachment id 4263) looks good.
Comment 42 Karolin Seeger 2009-07-18 01:27:39 UTC
Pushed "Patch to make open_udp_socket() IPv6 clean" to v3-4-test.

Jeremy, should the "Patch to get more info" be pushed, too?
Can we close the bug report after that?
Comment 43 Jeremy Allison 2009-07-20 11:55:57 UTC
Yes please push the "Patch to get more info" also as it makes errors comprehensible in the log.
Thanks !
Jeremy.
Comment 44 Karolin Seeger 2009-07-30 01:57:34 UTC
Done.
Closing out bug report.

Thanks!