Bug 8878 - libdns needs to timeout requests after a while
Summary: libdns needs to timeout requests after a while
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: Other
Version: unspecified
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: samba4-qa@samba.org
URL:
Keywords:
Depends on:
Blocks: 8622
Reported: 2012-04-19 07:19 UTC by Matthieu Patou
Modified: 2012-10-15 09:57 UTC (History)

See Also:


Attachments
tcpdump trace (4.94 KB, application/octet-stream)
2012-04-19 07:19 UTC, Matthieu Patou
no flags
libcli/dns patch (3.46 KB, patch)
2012-10-13 00:11 UTC, Kai Blin
mat: review+
More elegant path as proposed by Volker (1.14 KB, patch)
2012-10-14 10:28 UTC, Kai Blin
mat: review+

Description Matthieu Patou 2012-04-19 07:19:22 UTC
Created attachment 7467 [details]
tcpdump trace

The external DNS server that I'm using seems to drop DNS requests from time to time. When I use the forwarder of the internal server and the response is not received, the DNS server is blocked forever.

Further analysis of the DNS trace indicates that the internal DNS server receives the request, but the trace does not show the internal DNS server emitting a DNS request to the forwarder.
Comment 1 Matthieu Patou 2012-04-19 07:25:41 UTC
At packet 34 we can see that the internal DNS server is receiving the request, but nothing is transmitted.

Logs show the message: Not authorative for '67403d92-fa47-4954-993b-d75735a7e92c._msdcs.contoso.com', forwarding
Comment 2 Kai Blin 2012-04-19 10:23:41 UTC
So which IP is which system?
Comment 3 Matthieu Patou 2012-04-19 16:11:00 UTC
The system had two NICs with the following IPs:
192.168.93.108
172.16.100.1

resolv.conf configured the DNS server to be 172.16.100.1, and calls were coming from the system itself (hence source and destination IP are both 172.16.100.1).
Comment 4 Kai Blin 2012-05-29 23:06:08 UTC
It looks like the packet sent out to the forwarder got dropped, and libdns doesn't cope with that yet. This needs to be fixed.
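
To illustrate the missing behaviour, here is a minimal Python sketch of forwarding a raw DNS packet over UDP with a receive timeout. It is not Samba's libdns code (which is C) and not the patch attached to this bug. The function name `forward_query`, the 5-second default, and the 4096-byte receive buffer are assumptions made for the example.

```python
import socket
from typing import Optional, Tuple


def forward_query(packet: bytes,
                  forwarder: Tuple[str, int],
                  timeout: float = 5.0) -> Optional[bytes]:
    """Send a raw DNS packet to the forwarder.

    Returns the reply bytes, or None if no reply arrives within `timeout`
    seconds (for example because the request or the reply was dropped).
    """
    # The context manager closes the socket on every path, including timeout.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # without this, recvfrom() can block forever
        sock.sendto(packet, forwarder)
        try:
            reply, _addr = sock.recvfrom(4096)
        except socket.timeout:
            return None
        return reply
```

A caller would treat a `None` result as a failed forward and answer the client with an error instead of waiting indefinitely. The sketch does not check that the reply matches the query ID, which a real forwarder should do.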
Comment 5 Matthieu Patou 2012-10-05 08:10:37 UTC
Isn't it fixed by Volker's async stuff?
Comment 6 Matthieu Patou 2012-10-09 06:23:25 UTC
I have the feeling that what is reported in this email
[Samba] Internal DNS stops forwarding

can be related: if the forwarder never replies, the descriptor associated with the UDP socket is never freed, and after some time we can easily exhaust the maximum number of open files.
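
The descriptor leak described above can be sketched with an asynchronous request that is bounded by a deadline and always closes its socket. This is only an illustration in Python's asyncio, not the tevent-based code in Samba. `forward_query_async`, `_ReplyProtocol`, and the 5-second default are hypothetical names and values chosen for the example.

```python
import asyncio
from typing import Optional, Tuple


class _ReplyProtocol(asyncio.DatagramProtocol):
    """Resolves a future with the first datagram received."""

    def __init__(self, reply: asyncio.Future) -> None:
        self.reply = reply

    def datagram_received(self, data: bytes, addr) -> None:
        if not self.reply.done():
            self.reply.set_result(data)

    def error_received(self, exc: Exception) -> None:
        if not self.reply.done():
            self.reply.set_exception(exc)


async def forward_query_async(packet: bytes,
                              forwarder: Tuple[str, int],
                              timeout: float = 5.0) -> Optional[bytes]:
    """Forward a raw DNS packet; return the reply, or None on timeout."""
    loop = asyncio.get_running_loop()
    reply = loop.create_future()
    transport, _protocol = await loop.create_datagram_endpoint(
        lambda: _ReplyProtocol(reply), remote_addr=forwarder)
    try:
        transport.sendto(packet)
        return await asyncio.wait_for(reply, timeout)
    except asyncio.TimeoutError:
        return None
    finally:
        # Runs on success, timeout, and error alike, so an unanswered
        # request cannot keep its UDP socket descriptor open.
        transport.close()
```

Because the close happens in `finally`, repeated unanswered requests each release their socket once the deadline passes, rather than accumulating until the open-file limit is reached.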
Comment 7 Matthieu Patou 2012-10-11 21:25:14 UTC
Following remarks from Bob Cavey and Felix, I marked this bug as a blocker for 4.0.

It would be interesting to see how BIND deals with this case when it is configured to use a forwarder and the forwarder does not reply.
Comment 8 Kai Blin 2012-10-13 00:11:28 UTC
Created attachment 8050 [details]
libcli/dns patch

Patch for the timeout.
Comment 9 Kai Blin 2012-10-13 00:14:55 UTC
Comment on attachment 8050 [details]
libcli/dns patch

Patch also applies to v4-0-test, so please ack and assign to Karolin if you're happy with it
Comment 10 Matthieu Patou 2012-10-13 05:24:02 UTC
The patch seems OK to me for rc3, but for master please make the timeout configurable and maybe use a bigger timeout by default.
Comment 11 Matthieu Patou 2012-10-13 05:26:24 UTC
Karolin, can you pick up Kai's patch for rc3?
Thanks.
Comment 12 Kai Blin 2012-10-14 10:28:42 UTC
Created attachment 8072 [details]
More elegant path as proposed by Volker

Volker just suggested a much better approach for this for master, and I think it makes sense to get the better version for v4-0-test as well.
Comment 13 Matthieu Patou 2012-10-14 23:28:22 UTC
Comment on attachment 8072 [details]
More elegant path as proposed by Volker

Seems OK indeed; I rely on Volker's experience with async.
Comment 14 Karolin Seeger 2012-10-15 09:57:27 UTC
Pushed to autobuild-v4-0-test.
Closing out bug report.

Thanks!