Bug 4695 - get_dc_list does not failover when initially tried DC is down
Status: RESOLVED WORKSFORME
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: File Services
Version: 3.0.24
Hardware: x86 Linux
Importance: P3 major
Target Milestone: none
Assignee: Samba Bugzilla Account
QA Contact: Samba QA Contact
Reported: 2007-06-13 16:37 UTC by Robin Battey
Modified: 2007-06-13 16:42 UTC
Description Robin Battey 2007-06-13 16:37:21 UTC
My environment had three Windows 2000 SP4 domain controllers: 10.111.0.50 (GC), 10.111.0.51 (primary), and 10.111.0.52 (no roles).  I am running Samba 3.0.24-2ubuntu1.2 on Ubuntu 6.10 (Edgy Eft), with krb5-user 1.4.4-5ubuntu3 and libnss-ldap 1.4.4-5ubuntu3 for name/UID mapping via SFU 3.5.

Here is my smb.conf global section:

 [global]
   realm = corp.uievolution.com
   workgroup = UIECORP
   server string = %h server (Samba, Ubuntu)
   dns proxy = no
   log file = /var/log/samba/log.%m
   max log size = 1000
   syslog = 0
   panic action = /usr/share/samba/panic-action %d
   security = ads
   password server = corp.uievolution.com
   encrypt passwords = true
   passdb backend = tdbsam
   obey pam restrictions = yes
   invalid users = root
   passwd program = /usr/bin/passwd %u
   passwd chat = *Enter\snew\sUNIX\spassword:* %n\n *Retype\snew\sUNIX\spassword:* %n\n *password\supdated\ssuccessfully* .
   socket options = TCP_NODELAY
   log level = 7

Here is my /etc/krb5.conf:

 [logging]
    default = FILE:/var/log/krb5lib.log
    kdc = FILE:/var/log/krb5kdc.log
    admin_server = FILE:/var/log/kadmind.log
 [libdefaults]
    ticket_lifetime = 24000
    default_realm = CORP.UIEVOLUTION.COM
    dns_lookup_realm = false
    dns_lookup_kdc = false
    #default_tkt_enctypes = des3-hmac-sha1 des-cbc-crc
    #default_tgs_enctypes = des3-hmac-sha1 des-cbc-crc
 [realms]
 CORP.UIEVOLUTION.COM = {
     kdc = 10.111.0.50
     admin_server = 10.111.0.50
 }
 [domain_realm]
    .corp.uievolution.com = CORP.UIEVOLUTION.COM
    corp.uievolution.com = CORP.UIEVOLUTION.COM

This setup has been working flawlessly for a number of months.


The 10.111.0.52 server was turned off for unrelated reasons (hardware issues), and for the next few days Samba worked only sporadically.  I turned the log level up to 7 (I find myself unable to parse log level 10) and noted that Samba was attempting to get the DC list from the offline server:

[2007/06/13 13:21:30, 5] auth/auth_util.c:make_user_info_map(163)
  make_user_info_map: Mapping user [UIECORP]\[rbattey] from workstation [RBATTEYLAPTOP]
[2007/06/13 13:21:30, 4] libsmb/namequery_dc.c:ads_dc_name(43)
  ads_dc_name: domain=UIECORP
[2007/06/13 13:21:30, 6] libads/ldap.c:ads_find_dc(217)
  ads_find_dc: looking for realm 'CORP.UIEVOLUTION.COM'
[2007/06/13 13:21:30, 5] libsmb/namecache.c:namecache_fetch(201)
  name corp.uievolution.com#20 found.
[2007/06/13 13:21:30, 4] libsmb/namequery.c:get_dc_list(1406)
  get_dc_list: returning 1 ip addresses in an ordered list
[2007/06/13 13:21:30, 4] libsmb/namequery.c:get_dc_list(1407)
  get_dc_list: 10.111.0.52:389 
[2007/06/13 13:21:30, 5] libads/ldap.c:ads_try_connect(126)
  ads_try_connect: trying ldap server '10.111.0.52' port 389
[2007/06/13 13:21:45, 4] passdb/secrets.c:secrets_fetch_trust_account_password(281)
  Using cleartext machine password
[2007/06/13 13:21:45, 5] libsmb/namecache.c:namecache_fetch(201)
  name corp.uievolution.com#20 found.
[2007/06/13 13:21:45, 4] libsmb/namequery.c:get_dc_list(1406)
  get_dc_list: returning 1 ip addresses in an ordered list
[2007/06/13 13:21:45, 4] libsmb/namequery.c:get_dc_list(1407)
  get_dc_list: 10.111.0.52:389 
[2007/06/13 13:21:45, 5] libsmb/namecache.c:namecache_status_fetch(308)
  namecache_status_fetch: no entry for NBT/UIECORP#1C.20.10.111.0.52 found.
[2007/06/13 13:21:45, 5] libsmb/nmblib.c:send_udp(777)
  Sending a packet of len 50 to (10.111.0.52) on port 137
[2007/06/13 13:21:47, 5] libsmb/nmblib.c:send_udp(777)
  Sending a packet of len 50 to (10.111.0.52) on port 137
[2007/06/13 13:21:49, 3] libsmb/trusts_util.c:enumerate_domain_trusts(161)
  enumerate_domain_trusts: can't locate a DC for domain UIECORP
[2007/06/13 13:21:49, 5] libsmb/trustdom_cache.c:trustdom_cache_fetch(184)
  no entry for trusted domain UIECORP found.

There are a number of similar entries.  After completely removing references to the offline server from Active Directory and DNS, things worked without problems:

[2007/06/13 13:44:56, 4] libsmb/namequery_dc.c:ads_dc_name(43)
  ads_dc_name: domain=UIECORP
[2007/06/13 13:44:56, 6] libads/ldap.c:ads_find_dc(217)
  ads_find_dc: looking for realm 'CORP.UIEVOLUTION.COM'
[2007/06/13 13:44:56, 5] libsmb/namecache.c:namecache_fetch(201)
  name corp.uievolution.com#20 found.
[2007/06/13 13:44:56, 4] libsmb/namequery.c:get_dc_list(1406)
  get_dc_list: returning 1 ip addresses in an ordered list
[2007/06/13 13:44:56, 4] libsmb/namequery.c:get_dc_list(1407)
  get_dc_list: 10.111.0.50:389 
[2007/06/13 13:44:56, 5] libads/ldap.c:ads_try_connect(126)
  ads_try_connect: trying ldap server '10.111.0.50' port 389
[2007/06/13 13:44:56, 3] libads/ldap.c:ads_connect(288)
  Connected to LDAP server 10.111.0.50
[2007/06/13 13:44:56, 3] libads/ldap.c:ads_server_info(2542)
  got ldap server name cwseadss001@CORP.UIEVOLUTION.COM, using bind path: dc=CORP,dc=UIEVOLUTION,dc=COM
[2007/06/13 13:44:56, 4] libads/ldap.c:ads_server_info(2548)
  time offset is 0 seconds
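
The two excerpts above differ mainly in the single address that get_dc_list reports. For longer logs, a small script can tally those addresses. The following is only a sketch: the function name dc_addresses and the SAMPLE text are illustrative, and the pattern assumes the message layout seen in the level-4 lines quoted in this report.

```python
import re
from collections import Counter

# Matches the indented message get_dc_list logs for each address, e.g.
#   "  get_dc_list: 10.111.0.52:389"
# The "returning N ip addresses" line does not match because it has no IP.
DC_LINE = re.compile(r"^\s*get_dc_list:\s+(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\s*$")


def dc_addresses(log_text):
    """Count how often each (ip, port) pair appears in get_dc_list output."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DC_LINE.match(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Lines copied from the excerpts in this report (hypothetical combined sample).
SAMPLE = """\
[2007/06/13 13:21:30, 4] libsmb/namequery.c:get_dc_list(1406)
  get_dc_list: returning 1 ip addresses in an ordered list
[2007/06/13 13:21:30, 4] libsmb/namequery.c:get_dc_list(1407)
  get_dc_list: 10.111.0.52:389
[2007/06/13 13:21:45, 4] libsmb/namequery.c:get_dc_list(1407)
  get_dc_list: 10.111.0.52:389
[2007/06/13 13:44:56, 4] libsmb/namequery.c:get_dc_list(1407)
  get_dc_list: 10.111.0.50:389
"""

if __name__ == "__main__":
    for (ip, port), seen in dc_addresses(SAMPLE).items():
        print(f"{ip}:{port} listed {seen} time(s)")
```

An address that keeps appearing alone in the list while connections to it time out is a sign that the lookup is returning a stale or offline DC.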

I don't know how samba obtains the initial address to query (DNS, I expect), but I'm pretty certain it should gracefully fail over to another DC when the first DC ends up being unavailable.
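
To make the expected behaviour concrete, here is a minimal sketch of ordered failover over a list of candidate DCs. It is not Samba's implementation; the names tcp_probe and first_reachable are hypothetical, and a plain TCP connect to port 389 stands in for whatever check Samba actually performs.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def tcp_probe(addr: Address, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to addr succeeds within timeout seconds."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:  # covers refused connections, unreachable hosts, timeouts
        return False


def first_reachable(
    candidates: Iterable[Address],
    probe: Callable[[Address], bool] = tcp_probe,
) -> Optional[Address]:
    """Try each candidate in order and return the first one the probe accepts."""
    for addr in candidates:
        if probe(addr):
            return addr
    return None  # every candidate failed


if __name__ == "__main__":
    # Stub probe standing in for the network: 10.111.0.52 is the powered-off DC.
    down = {"10.111.0.52"}
    dcs = [("10.111.0.52", 389), ("10.111.0.50", 389), ("10.111.0.51", 389)]
    print(first_reachable(dcs, probe=lambda addr: addr[0] not in down))
```

The point of the sketch is that failover needs more than one candidate: in the failing log above, get_dc_list returns a single address, so there is nothing to fall back to.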

Normally, I would crank the logging up to 10 and give full log sessions of both the working and failing DC configurations, but as this is a production environment, I don't have that luxury now that it's working once again.  However, here are the steps to reproduce:

* set up two Windows 2000 domain controllers for a single domain
* set up Samba with a config similar to the one above
* check the Samba logs to determine which DC it attempts to obtain the DC list from
* turn that DC off without demoting it first
Comment 1 Jeremy Allison 2007-06-13 16:39:08 UTC
We have put a lot of work into this in 3.0.25. May I suggest you try this release and see if you get the same problems?

Thanks,

Jeremy.
Comment 2 Gerald (Jerry) Carter (dead mail address) 2007-06-13 16:42:09 UTC
And don't use "password server", let alone point it at a domain name.