Found here @ Google. Bugs in the resolve_ads() code can cause this function to return null addresses, and also to return a limited list of valid IP addresses given a DC name. Fixes for 3.5.x and 3.6.x to follow, once the patch has gone through autobuild into master. Jeremy.
Created attachment 7510 [details] git-am fix for 3.6.x. Michael, please review for 3.6.next. This is an important one as it can cause invalid IP addresses to be used when looking up DC names. Thanks, Jeremy.
Created attachment 7511 [details] git-am fix for 3.5.x.
Reporter confirms this patch fixes the issue on site. Jeremy.
Comment on attachment 7510 [details] git-am fix for 3.6.x. Trying to get some attention for this fix :-).
Jeremy, I WILL ack it once you gave me the test vectors I asked for offline.
Understood - I just didn't want to forget this for release.
Info from original reporter (names and IP addresses changed to protect internal network): resolve_ads: Attempting to resolve DCs for XX.YY.FOO.BAR.COM using DNS [2012/04/27 16:03:11.805623, 4] libads/dns.c:432(ads_dns_lookup_srv) ads_dns_lookup_srv: 14 records returned in the answer section. [2012/04/27 16:03:16.367389, 5] libsmb/namecache.c:106(namecache_store) namecache_store: storing 17 addresses for XX.YY.FOO.BAR.COM#1c: 1.2.3.1,1.2.3.2,1.2.3.3,1.2.3.4, 1001:1111:1111:1000:0:1111:1111:1111],1.2.3.5,1.2.3.6,1.2.3.7,1.2.3.8,1.2.3.9,1.2.3.10,1.2.3.11,1.2.3.12 [2012/04/27 16:03:16.368613, 4] libsmb/namequery.c:2054(get_dc_list) get_dc_list: returning 17 ip addresses in an ordered list [2012/04/27 16:03:16.368652, 4] libsmb/namequery.c:2055(get_dc_list) get_dc_list: 1.2.3.1:389 1.2.3.2:389 1.2.3.3:389 1.2.3.4:389 1.2.3.5:389 1.2.3.6:389 1.2.3.7:389 1.2.3.8:389 1.2.3.9:389 1.2.3.10:389 1.2.3.11:389 1.2.3.12:389 :0 :0 1001:1111:1111:1000:0:1111:1111:1111:389 :0 :0 This shows the position of the added zeros by the flawed code. I can construct a test vector from this.
*** Bug 8893 has been marked as a duplicate of this bug. ***
FYI. Patch to test remove_duplicate_addrs2() pushed to autobuild in master. It processes a set of test vectors interspaced with duplicates and zero addresses to a single list. I'll post the git ref once it's passed autobuild. Jeremy.
Ok, git commit c531aac27c433e0eb068a8a4f0a6c90cad2e44fa in master adds the torture test LOCAL-remove_duplicate_addrs2. Jeremy.
Comment on attachment 7510 [details] git-am fix for 3.6.x. Ok, now the test is in master - can I get an ack for this fix for 3.6.next please ? Jeremy.
I fully believe you that the new code is right. Sorry to benit-picking, but for me the old code also survives that test. My wish would have been to get a test vector that fails against the old code and that succeeds against the new code. If you are telling me this is impossible to get, I will just ack this code,just to get this off my back.
Hmmm. I don't have a copy of the exact return values from the internal network that show the old code failing but new code succeeding. I'm not saying I can't get it, I'd have to get them to put the old version back with additional debugs and re-write the test case, but at this point (and as the new code is working and has been demonstrated as such on 2 different sites) I think it's good to go. Just my 2 cents. Jeremy.
I could look at the logic of the old and new code and construct a test vector that explicitly will fail against the old code based on the understanding I gained in writing the new. Would that do ? Jeremy.
Karolin please push for 3.5.next and 3.6.next - thanks ! Jeremy.
Pushed to v3-5-test and v3-6-test. Closing out bug report. Thanks!
We've got a problem with this "fix" locally, net join can't find domain controllers. :( 79658018ec73755bbd495963f977af61b4497bfb is the bad commit, for us. I'm looking at why. A reproduction of the other case would help me make sure I don't "break it the other way." At the least, count me as - on the review ;). -Ira
Is it possible that you add DEBUGs to the code to extract the before/after arrays of IPs, just to have another test vector?
Thanks for testing Ira. All other people reporting the fix worked were on Linux. I'll examine as a priority. Jeremy.
The loop unrolling fix in 79658018ec73755bbd495963f977af61b4497bfb is the one most likely to have differences between Linux and Solaris. This isn't the code that modifies the removal of extra zero addresses (thank goodness) so it isn't a problem in the memcpy/array manipulation code - this is new functionality when we only have a name but no IP address. Ira and I are working on this issue on IRC. More details when we have them. Jeremy.
Created attachment 7609 [details] Fix for 3.6. This fixes the loop initialization for 3.6.
Created attachment 7610 [details] Modified fix. I think this is what it should be... Jeremy.
Comment on attachment 7609 [details] Fix for 3.6. Not quite enough. Modified fix below.
Ah - now I know why this passed tests @ Google and @ Dell. They were using the git-am fix for 3.5.x, which correctly uses a for() loop construct here. I screwed this up when forward porting to 3.6.x (and master). Jeremy.
Just goes to show there's no alternative to good old fashioned testing :-). The torture test couldn't have caught that as this problem depends on specific returns from a DNS server, which we don't do in make test. Plus the core array modification code was correct, it was iterating over the array that was in error. Jeremy.
Created attachment 7611 [details] Additional fix for 3.6.next. Additional fix that went into master because I screwed up the forward port of the 3.5.x fix (which was tested on multiple sites) to the v3.6.x fix. Ira has now tested this. Ira please review. This additional fix is needed for v3-6-test but *NOT* for v3-5-test. Thanks, Jeremy.
Re-assigning to Karolin for inclusion in 3.6.next. Karolin, please apply attachment 7611 [details] to v3-6-test, but not to v3-5-test. Thanks, and sorry for the mistake ! Jeremy.
(In reply to comment #27) > Re-assigning to Karolin for inclusion in 3.6.next. > > Karolin, please apply attachment 7611 [details] to v3-6-test, but not to v3-5-test. > > Thanks, and sorry for the mistake ! > > Jeremy. Pushed to v3-6-test. Closing out bug report. Thanks!