Bug 4014 - Samba 3.0.23[ab] on SLES8 unable to parse list of DC returned from DNS
Summary: Samba 3.0.23[ab] on SLES8 unable to parse list of DC returned from DNS
Status: RESOLVED INVALID
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: winbind (show other bugs)
Version: 3.0.23b
Hardware: x86 Linux
: P3 critical
Target Milestone: none
Assignee: Samba Bugzilla Account
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-08-10 21:05 UTC by William Charles
Modified: 2011-03-21 19:36 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description William Charles 2006-08-10 21:05:59 UTC
Jerry, I have recompiled 3.0.23b, with full debugging info, on SuSE Linux Enterprise Server 8. I have stepped through a 'net ads info' command within GDB, and now know where the code is bombing-out. Well, we knew that before as the logfile told us; anyway...

[2006/08/09 15:34:45, 5] libsmb/namequery.c:resolve_ads(1037)
  resolve_hosts: Attempting to resolve DC's for xxx.yyy.zzz.com using DNS
[2006/08/09 15:34:45, 4] libads/dns.c:ads_dns_lookup_srv(286)
  ads_dns_lookup_srv: 21 records returned in the answer section.
[2006/08/09 15:34:45, 1] libads/dns.c:ads_dns_parse_rr_srv(173)
  ads_dns_parse_rr_srv: Failed to parse RR record
[2006/08/09 15:34:45, 1] libads/dns.c:ads_dns_lookup_srv(313)
  ads_dns_lookup_srv: Failed to parse answer record!


It seems that we correctly get 21 domain controller records returned for my global AD domain. When trawling through the returned answers in ads_dns_lookup_srv() we eventually seem to overflow some memory bounds?

Anyway, we start here in dns.c:ads_dns_lookup_srv():

    303         /* now we are at the answer section */
    304
    305         for ( rrnum=0; rrnum<answer_count; rrnum++ ) {
    306                 if ( !ads_dns_parse_rr_srv( ctx, buffer, buffer+resp_len, &p, &dcs[rrnum] ) ) {
    307                         DEBUG(1,("ads_dns_lookup_srv: Failed to parse answer record!\n"));
    308                         return NT_STATUS_UNSUCCESSFUL;
    309                 }
    310         }


We manage to loop over ten or so DNS answers, before function ads_dns_parse_rr_srv() calls ads_dns_parse_rr() and fails. The latter function is returning 'False' here:

    115         /* pull the name from the answer */
    116
    117         namelen = dn_expand( start, end, p, hostname, sizeof(hostname) );
    118         if ( namelen < 0 ) {
    119                 return -1;
    120         }
    121         p += namelen;
    122         rr->hostname = talloc_strdup( ctx, hostname );
    123
    124         /* check that we have space remaining */
    125
    126         if ( PTR_DIFF(p+10, end) > 0 )
    127                 return False;


I will dig deeper to try find the root cause. This function does seem to operate correctly on Solaris 8 with all 21 DC parsed correctly...
Comment 1 William Charles 2006-08-10 23:30:16 UTC
I think that I may have found the problem. On Solaris, the res_query() function can (and does) return an answer length bigger than the buffer supplied; and you're supposed to retry with a bigger buffer. Your code does just that.

But, on my SLES server the res_query() is returning 512 -- the exact same size as the buffer supplied; not bigger as per Solaris. I wonder if, in that case, one should retry with a larger buffer? Trouble is, your code bases the new buffer size on the value previously returned by res_query() and in this Linux case we have no hint as to how much we've potentially overflowed by...
Comment 2 William Charles 2006-08-11 00:07:26 UTC
This fixes the issue for me!


# diff -urN libads/dns.c libads/dns.c.orig
--- libads/dns.c        2006-08-11 14:50:14.000000000 +1000
+++ libads/dns.c.orig   2006-08-11 14:45:53.000000000 +1000
@@ -230,7 +230,7 @@
 NTSTATUS ads_dns_lookup_srv( TALLOC_CTX *ctx, const char *name, struct dns_rr_srv **dclist, int *numdcs )
 {
        uint8 *buffer = NULL;
-       size_t buf_len = 0;
+       size_t buf_len;
        int resp_len = NS_PACKETSZ;
        struct dns_rr_srv *dcs = NULL;
        int query_count, answer_count, auth_count, additional_count;
@@ -249,7 +249,7 @@
                if ( buffer )
                        TALLOC_FREE( buffer );

-               buf_len += ( NS_PACKETSZ * sizeof(uint8) );
+               buf_len = resp_len * sizeof(uint8);

                if ( (buffer = TALLOC_ARRAY(ctx, uint8, buf_len)) == NULL ) {
                        DEBUG(0,("ads_dns_lookup_srv: talloc() failed!\n"));
@@ -261,7 +261,7 @@
                        TALLOC_FREE( buffer );
                        return NT_STATUS_UNSUCCESSFUL;
                }
-       } while ( buf_len <= resp_len && resp_len < MAX_DNS_PACKET_SIZE );
+       } while ( buf_len < resp_len && resp_len < MAX_DNS_PACKET_SIZE );

        p = buffer;
Comment 3 William Charles 2006-08-11 00:22:24 UTC
This is tad cleaner? And I have the files in the correct order this time!...


# diff -urN libads/dns.c.orig libads/dns.c
--- libads/dns.c.orig   2006-08-11 14:45:53.000000000 +1000
+++ libads/dns.c        2006-08-11 15:13:28.000000000 +1000
@@ -230,8 +230,8 @@
 NTSTATUS ads_dns_lookup_srv( TALLOC_CTX *ctx, const char *name, struct dns_rr_srv **dclist, int *numdcs )
 {
        uint8 *buffer = NULL;
-       size_t buf_len;
-       int resp_len = NS_PACKETSZ;
+       size_t buf_len = NS_PACKETSZ;
+       int resp_len;
        struct dns_rr_srv *dcs = NULL;
        int query_count, answer_count, auth_count, additional_count;
        uint8 *p = buffer;
@@ -246,10 +246,10 @@
           of large replies */

        do {
-               if ( buffer )
+               if ( buffer ) {
                        TALLOC_FREE( buffer );
-
-               buf_len = resp_len * sizeof(uint8);
+                       buf_len += ( NS_PACKETSZ * sizeof(uint8) );
+               }

                if ( (buffer = TALLOC_ARRAY(ctx, uint8, buf_len)) == NULL ) {
                        DEBUG(0,("ads_dns_lookup_srv: talloc() failed!\n"));
@@ -261,7 +261,7 @@
                        TALLOC_FREE( buffer );
                        return NT_STATUS_UNSUCCESSFUL;
                }
-       } while ( buf_len < resp_len && resp_len < MAX_DNS_PACKET_SIZE );
+       } while ( buf_len <= resp_len && resp_len < MAX_DNS_PACKET_SIZE );

        p = buffer;
Comment 4 William Charles 2006-08-30 00:48:46 UTC
I just looked at the source code RPM for SLES8's GLIBC... Interesting comment in the 'resolv/res_send.c' file:

        if (truncating) {
                /*
                 * Flush rest of answer so connection stays in synch.
                 */
                anhp->tc = 1;
                len = resplen - anssiz;
                while (len != 0) {
                        char junk[PACKETSZ];

                        n = read(statp->_vcsock, junk,
                                 (len > sizeof junk) ? sizeof junk : len);
                        if (n > 0)
                                len -= n;
                        else
                                break;
                }
                /* Dirty fix - avoid read buffer overruns in applications
                 * that naively assume the length returned by res_*
                 * is always less than or equal the answer buffer size.
                 * Simply truncating the answer here surely beats fixing
                 * all calls of res_* in all applications
                 */
                resplen = anssiz;
        }


This is 'glibc-2.2.5', and I'd suggest that the 'Dirty Fix' above is to blame for the DNS oddities described previously. I guess that someone at SuSE (of old) may have had some part to play?!... Indeed, there's a file called 'glibc-2.2-resolver2.patch' included in the source RPM. Guess the contents?!...

--- resolv/res_send.c   Fri Jan  5 03:35:42 2001
+++ resolv/res_send.c   Thu Sep 26 09:59:59 2002
@@ -805,6 +805,13 @@
                        else
                                break;
                }
+               /* Dirty fix - avoid read buffer overruns in applications
+                * that naively assume the length returned by res_*
+                * is always less than or equal the answer buffer size.
+                * Simply truncating the answer here surely beats fixing
+                * all calls of res_* in all applications
+                */
+               resplen = anssiz;
        }


Anyway, I'll leave it your good self to decide on the proper course of action. I'm loath to suggest a patch to Samba for what is essentially a SuSE specific oddity? Or maybe other's do the same thing? I doubt it? Anyway, it's a fairly ancient very of Linux too...
Comment 5 Spammer 2007-04-28 04:24:42 UTC
Created attachment 2503
Comment 6 Björn Jacke 2011-03-21 19:36:46 UTC
just an "ancient suse specific bug" :-) ... closing out for us.
Thanks for your research anyhow,