Bug 8880 - Internal DNS didn't reply on A / AAAA requests when the database record is a CNAME
Product: Samba 4.0
Classification: Unclassified
Component: Other
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assigned To: Kai Blin
Reported: 2012-04-19 07:44 UTC by Matthieu Patou
Modified: 2012-06-06 14:10 UTC (History)

Proposed patch to fix (2.98 KB, patch)
2012-04-19 07:44 UTC, Matthieu Patou

Description Matthieu Patou 2012-04-19 07:44:30 UTC
Created attachment 7468
Proposed patch to fix

Windows DNS and BIND do reply, and Samba's replication code relies on this behavior when trying to get the IP address for <ntds_setting_objectguid>._msdcs.domain.tld.

What happens is that Samba sends an A / AAAA request for this DNS name; because the name is stored in the database as a CNAME record, the internal DNS server ignores the request.

It seems that the RFC indicates that both the CNAME and the A (or AAAA) record should be returned.
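
The expected behavior can be illustrated with a minimal Python sketch. It is not Samba's code: the zone dictionary, the `answer` helper, the hop limit, and the example names and addresses are all hypothetical, and name comparison is kept case-sensitive for brevity. It only shows the shape of the answer: the CNAME first, then the record of the requested type found at the alias target.

```python
# Hypothetical in-memory zone: name -> list of (rtype, rdata).
# This is an illustration only, not Samba's data model.
ZONE = {
    "guid._msdcs.example.com": [("CNAME", "dc1.example.com")],
    "dc1.example.com": [("A", "192.0.2.10"), ("AAAA", "2001:db8::10")],
}

MAX_CNAME_HOPS = 8  # arbitrary guard against CNAME loops


def answer(zone, qname, qtype):
    """Return the answer section as a list of (name, rtype, rdata) tuples."""
    answers = []
    name = qname
    for _ in range(MAX_CNAME_HOPS):
        records = zone.get(name, [])
        # Records of the requested type end the lookup.
        matches = [(name, rtype, rdata) for rtype, rdata in records if rtype == qtype]
        if matches:
            return answers + matches
        # No direct match: follow a CNAME if one exists at this name.
        cnames = [rdata for rtype, rdata in records if rtype == "CNAME"]
        if not cnames:
            break
        answers.append((name, "CNAME", cnames[0]))
        name = cnames[0]
    return answers


if __name__ == "__main__":
    for record in answer(ZONE, "guid._msdcs.example.com", "A"):
        print(record)
```

A query of type CNAME matches at the first name and therefore returns only the CNAME, while a query for a name whose alias target has no data returns the CNAME alone.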
Comment 1 Matthieu Patou 2012-04-19 07:45:11 UTC
The patch tends to duplicate code, but it seems to satisfy Samba and Windows DNS clients.
Comment 2 Matthias Dieter Wallnöfer 2012-05-04 08:48:56 UTC
Reassign correctly.
Comment 3 Kai Blin 2012-05-04 11:57:24 UTC
As discussed on IRC, I'm not too happy with the patch.
Comment 4 Michael Adam 2012-05-31 19:10:02 UTC
For reference, here is the relevant part of the IRC discussion:

04/18/12  8:55:04 <kai> morning
04/18/12  8:56:24 <ekacnet> kai: hi 
04/18/12  8:57:07 <ekacnet> kai: http://cpaste.org/1450/
04/18/12  8:57:20 <kai> wasn't me... 0:)
04/18/12  8:57:43 <kai> hm?
04/18/12  8:57:57 <kai> I don't understand what that's trying to do
04/18/12  8:58:36 <kai> that certainly looks invalid
04/18/12  8:58:47 <kai> how are you going to handle SOA or PTR requests with that?
04/18/12  8:59:05 <kai> oh, wait
04/18/12  8:59:11 <kai> it's still invalid
04/18/12  8:59:40 <kai> what you want is to optionally return the AAAA record in the additional records section, possibly
04/18/12  8:59:47 <kai> but that's not the way to do it
04/18/12  9:00:05 <ekacnet> no what I want to do is that if I asked A or AAAA
04/18/12  9:00:16 <ekacnet> but the record is CNAME the server still reply 
04/18/12  9:00:43 <ekacnet> because microsoft do so, bind do so and because it breaks the replication code of samba to samba 
04/18/12  9:10:51 <kai> ekacnet: I'm not sure I understand the problem
04/18/12  9:12:15 <kai> ah, I see
04/18/12  9:12:26 <kai> but that's still not the right fix
04/18/12  9:13:30 <kai> and not part of RFC behaviour..
04/18/12  9:16:47 <ekacnet> kai: yeah wrong fix 
04/18/12  9:16:58 <ekacnet> ok for not RFC behavior 
04/18/12  9:17:35 <ekacnet> but it's bind and MS DNS behavior 
04/18/12  9:17:45 <ekacnet> and we rely on it at least for replication 
04/18/12  9:17:54 <kai> sure, but your resolver should be able to cope
04/18/12  9:18:22 <ekacnet> kai: no it doesn't 
04/18/12  9:19:22 <kai> ekacnet: oh, right, that's actually in rfc1034
04/18/12  9:19:33 <kai> but your patch doesn't fix the problem for all I can see
04/18/12  9:20:03 <ekacnet> http://cpaste.org/1451/
04/18/12  9:21:14 <kai> but if the record you're looking for with an A query is a CNAME there can't be another record with the same name that has another QTYPE
04/18/12  9:21:15 <ekacnet> as the time is limited solution 1) is to fall back to bind dlz 
04/18/12  9:21:20 <obnox> is the paste above the fix? or is that something else?
04/18/12  9:21:36 <obnox> ekacnet: we can quickly build new packages.
04/18/12  9:21:39 <kai> obnox: no, at least not the right fix
04/18/12  9:21:41 <ekacnet> obnox: it's almost this 
04/18/12  9:21:57 <ekacnet> kai: well this is doing what I want now 
04/18/12  9:22:01 <ekacnet> at least it seems 
04/18/12  9:22:08 <obnox> ekacnet: but I guess you can also have the src deb so you can experiment more quickly
04/18/12  9:22:11 <kai> ok, then your database is in a weird state
04/18/12  9:22:21 <ekacnet> kai: fresh install 
04/18/12  9:23:02 <kai> "If a CNAME RR is present at a node, no other data should be
04/18/12  9:23:03 <kai> present"
04/18/12  9:23:08 <kai> says the RFC
04/18/12  9:23:11 <ekacnet> http://cpaste.org/1452/
04/18/12  9:23:21 <ekacnet> there is no other data 
04/18/12  9:23:57 <ekacnet> I just want that if you have a CNAME record but I request a A or AAAA record of the same name you return me either a A or CNAME 
04/18/12  9:24:05 <kai> right
04/18/12  9:24:12 <kai> but that's not valid behavior
04/18/12  9:24:39 <kai> you're supposed to re-run the query for the A record (and presumably the AAAA record) and then return that along with the CNAME
04/18/12  9:24:40 <ekacnet> kai: well that's bind behavior 
04/18/12  9:24:59 <gladiac> abartlet: hi, I'm here
04/18/12  9:25:02 <ekacnet> or MS DNS behavior 
04/18/12  9:25:10 <kai> it's not MS DNS behavior
04/18/12  9:25:22 <kai> look, hack it in if you need it, but it's not the right fix
04/18/12  9:25:45 <kai> I agree there's a bug in the internal DNS, but that's not the way to fix it
04/18/12  9:26:42 <ekacnet> http://cpaste.org/1453/
04/18/12  9:26:46 <ekacnet> microsoft behavior 
04/18/12  9:27:40 <ekacnet> urg they also return the A record 
04/18/12  9:27:49 <kai> network trace or I don't believe it ;)
04/18/12  9:28:14 <kai> you're supposed to return the CNAME and the A record of the node the CNAME points at
04/18/12  9:28:31 <kai> possibly your resolver library does that for you if you run "host"
04/18/12  9:29:02 <kai> but if you run wireshark, I don't see how your patch would make the internal server return the correct query
04/18/12  9:29:24 <kai> er, the A record of the node CNAME points at
04/18/12  9:29:44 <ekacnet> kai: sure I think I just return the CNAME record 
04/18/12  9:29:53 <kai> right, and that's invalid, too
04/18/12  9:30:30 <kai> if it works as a quick fix, hack it locally
04/18/12  9:30:36 <kai> but it's not the right fix
04/18/12  9:30:57 <ekacnet> but if I ask for CNAME the DNS is supposed to return me just the CNAME ? 
04/18/12  9:31:07 <kai> yes
04/18/12  9:31:19 <kai> that's page 14 on RFC1034
04/18/12  9:34:33 <ekacnet> you don't have a more cleaner fix 
04/18/12  9:34:42 <ekacnet> out of your hat ? 
04/18/12  9:35:01 <kai> no, it's not trivial
04/18/12  9:35:50 <kai> you need to check if it's a CNAME, and if it is, look at the node pointed at, and rerun the query for an A (and possibly AAAA) record
04/18/12  9:36:01 <kai> though the RFC only mentions A records
04/18/12  9:36:25 <kai> and I'm at work and don't have a DNS with AAAA records to play with
04/18/12  9:36:29 <ekacnet> well it seems that bind is return A if you asked for A 
04/18/12  9:36:41 <ekacnet> and AAAA if you asked for AAAA 
04/18/12  9:36:55 <kai> ah, that makes life easier
04/18/12  9:37:00 <ekacnet> a CNAME record can return 1 alias no ? 
04/18/12  9:37:05 <kai> and of course that makes sense
04/18/12  9:37:06 <kai> yes
04/18/12  9:38:08 <kai> though I wonder if you're allowed to be authoritative for foo.example.com only and have alias.foo.example.com be a CNAME for host.bar.example.com
04/18/12  9:38:29 <kai> in which case you'd need to fire off a recursion for that name to get the A(AAA) record
04/18/12  9:39:02 <kai> or possibly not, if recursion is disabled
04/18/12  9:39:29 <kai> in that case, I assume you need to return a SERVER_ERROR, which seems to be what bind does when it can't talk to forwarders
04/18/12  9:40:49 <ekacnet> well the simple solution is that if we can't resolve we just return SERVER_ERROR ?
04/18/12  9:41:25 <kai> I don't know, I'd have to test this
04/18/12  9:48:02 <ekacnet> so in AD if you have foo.samba.org that is a CNAME for bar.samba.org 
04/18/12  9:48:14 <ekacnet> but bar.samba.org didn't exists you just get the CNAME
04/18/12  9:55:40 <ekacnet> kai: is there a reason why internal dns didn't bind on lo0 ? 
04/18/12 10:04:58 <kai> ekacnet: probably somebody futzed with the network setup code and I didn't notice ;)
04/18/12 10:05:09 <kai> it was cleanly stolen from the kdc code
04/18/12 10:05:29 <ekacnet> :-)
04/18/12 10:05:42 <ekacnet> well i'll figure out somehow 
04/18/12 10:06:35 <kai> just use your local IP :)
04/18/12 10:09:08 <ekacnet> kai: by implementing this CNAME recursion I really have the impression to duplicate the code of handle_question
04/18/12 10:12:35 <kai> ekacnet: yes, I think to fix this, I think it's time to switch dns_process() to be event-based
04/18/12 10:13:27 <kai> then on a CNAME, you run another query
04/18/12 10:14:10 <kai> makes for a much nicer overall structure of the code, and also will stop a stalling forwarder from stalling the whole dns process
04/18/12 10:14:24 <obnox> kai: good plan
04/18/12 10:16:14 <kai> that's been on the todo list for a while, but this bug seems to call for a bump
04/18/12 10:17:20 <ekacnet> kai: well excuse me for tonight I'll have the partial code duplication solution :-)
04/18/12 10:19:38 <kai> whatever floats your boat
04/18/12 10:19:56 <ekacnet> +1 
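
The event-based approach discussed at the end of the log can be sketched roughly as follows. This is only an illustration using Python's asyncio, not the Samba implementation: `LOCAL`, `handle_question` as written here, `forward`, the timeout, and the depth limit are all hypothetical. The rcode name SERVFAIL stands in for the "SERVER_ERROR" mentioned in the log, and returning it when the alias target cannot be resolved is the untested assumption from the discussion.

```python
import asyncio

# Hypothetical local data; not Samba's database layout.
LOCAL = {
    "alias.foo.example.com": [("CNAME", "host.bar.example.com")],
}

FORWARDER_TIMEOUT = 0.05  # seconds; illustrative value only


async def forward(name, qtype):
    """Stand-in for a forwarder that never answers."""
    await asyncio.sleep(3600)
    return []


async def handle_question(name, qtype, forwarder=forward, depth=0):
    """Return (rcode, answers); on a CNAME, run another query for the target."""
    if depth > 8:  # arbitrary limit against CNAME loops
        return "SERVFAIL", []
    records = LOCAL.get(name)
    if records is None:
        # Not our data: ask the forwarder, but never wait forever.
        try:
            answers = await asyncio.wait_for(forwarder(name, qtype), FORWARDER_TIMEOUT)
        except asyncio.TimeoutError:
            return "SERVFAIL", []
        return "NOERROR", answers
    matches = [(name, rtype, rdata) for rtype, rdata in records if rtype == qtype]
    if matches:
        return "NOERROR", matches
    cnames = [rdata for rtype, rdata in records if rtype == "CNAME"]
    if not cnames:
        return "NOERROR", []
    # Re-run the query against the alias target and prepend the CNAME.
    rcode, rest = await handle_question(cnames[0], qtype, forwarder, depth + 1)
    return rcode, [(name, "CNAME", cnames[0])] + rest


async def demo():
    # The A query waits on the stalled forwarder; the CNAME query is
    # answered from local data and does not have to wait for it.
    return await asyncio.gather(
        handle_question("alias.foo.example.com", "A"),
        handle_question("alias.foo.example.com", "CNAME"),
    )


if __name__ == "__main__":
    for result in asyncio.run(demo()):
        print(result)
```

Because each question runs as its own coroutine, a forwarder that stalls delays only the query that depends on it, which is the structural benefit described in the log.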
Comment 5 Michael Adam 2012-05-31 19:11:16 UTC
Kai: Now that the code has been made async,
will it be easier to do the proper fix?

With a little more input, I would try to work on
some code if you don't have the resources currently.
Maybe we can also discuss this on IRC...

Cheers - Michael
Comment 6 Kai Blin 2012-06-06 14:10:55 UTC
Fixed by commit f3df2988ba6928cde0bd89da321bbe74fd76f53f