Bug 13145 - Erroneous "There is already a domain master browser" after IP change
Status: NEW
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Other
Version: 4.5.8
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assigned To: Andrew Bartlett
QA Contact: Samba QA Contact
Reported: 2017-11-15 14:30 UTC by Martin von Wittich
Modified: 2017-11-15 14:30 UTC (History)

Description Martin von Wittich 2017-11-15 14:30:17 UTC
I just had a very confusing issue on a customer site where Windows computers were not able to join the domain of a newly set up Samba server. There were obvious errors in the syslog:

server ~ # grep 'There is already a domain master browser' /var/log/syslog | head
Nov 15 10:55:21 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.
Nov 15 11:00:32 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.
Nov 15 11:05:31 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.
Nov 15 11:10:40 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.
Nov 15 11:15:32 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.
Nov 15 11:20:41 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.
Nov 15 11:25:50 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.
Nov 15 11:30:47 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.
Nov 15 11:36:01 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.
Nov 15 11:40:59 iserv nmbd[9142]: There is already a domain master browser at IP 192.168.100.222 for workgroup SCHULE registered on subnet UNICAST_SUBNET.

Of course I initially suspected that a device with the IP 192.168.100.222 on the network was causing the issue, so I tried to track it down with tshark/ping/arping, but to no avail. Confusingly, the IP 192.168.100.222 didn't show up in the tshark logs at all. I considered that it might come from some Samba cache, but unfortunately I dismissed this possibility too soon because it seemed very unlikely that requests such as these would be cached. In the end, I started to look into the code, and found out that Samba does in fact seem to use a cache:

1. The error message "There is already a domain master browser" is generated in become_domain_master_query_success() (source3/nmbd/nmbd_become_dmb.c), which is apparently a callback function for a query_name() call in become_domain_master_browser_bcast().

2. query_name() is defined in source3/nmbd/nmbd_namequery.c; it calls a function query_local_namelists() with the comment "We need to check our local namelists first", which does sound a bit like it might be reading from a file.

3. query_local_namelists() is defined in source3/nmbd/nmbd_namequery.c; it calls a function find_name_on_subnet().

4. find_name_on_subnet() is defined in source3/nmbd/nmbd_namelistdb.c; it calls a function find_name_on_wins_subnet().

5. find_name_on_wins_subnet() is defined in source3/nmbd/nmbd_winsserver.c; it has this comment:

/****************************************************************************
 Lookup a given name in the wins.tdb and create a temporary malloc'ed data struct
 on the linked list. We will free this later in XXXX().
*****************************************************************************/

I'm not very familiar with the Samba source, so I'm not absolutely certain if my tracing is correct here, but I was finally able to resolve the issue on the customer site by stopping nmbd, deleting /var/lib/samba/wins.*, and then starting nmbd again.

As it turns out, the customer had installed the server in another network, and had temporarily assigned it the IP 192.168.100.222 during installation. When the server was deployed on the customer site, this IP was removed, but it survived in the WINS cache files:

server ~ # grep 222 wins.dat
"SCHULE#1b" 1511010435 10.0.0.1 192.168.100.222 64R
"ISERV#00" 1511011223 10.0.0.1 192.168.100.222 192.168.178.79 192.168.0.1 66R
"ISERV#03" 1511011223 10.0.0.1 192.168.100.222 192.168.178.79 192.168.0.1 66R
"ISERV#20" 1511011223 10.0.0.1 192.168.100.222 192.168.178.79 192.168.0.1 66R
"SCHULE#1c" 1511011223 10.0.0.1 192.168.100.222 192.168.178.79 192.168.0.1 e4R

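For anyone who wants a field-aware check instead of a plain grep, here is a minimal Python sketch that lists the wins.dat records containing a given address. The field layout (quoted NAME#type, a numeric timestamp, one or more IPv4 addresses, a flags token ending in R) is only inferred from the lines shown above, and the names WinsEntry, parse_line and entries_with_address are made up for this example; none of this is Samba API.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Layout inferred from the listing above (an assumption, not a documented format):
#   "NAME#tt" <timestamp> <ip> [<ip> ...] <flags>R
LINE_RE = re.compile(
    r'^"(?P<name>[^"]*)#(?P<type>[0-9A-Fa-f]{2})"\s+(?P<stamp>\d+)\s+(?P<rest>.+)$'
)


class WinsEntry(NamedTuple):
    name: str
    name_type: int       # NetBIOS name type, e.g. 0x1b
    timestamp: int       # second column, kept as a raw integer
    addresses: List[str]
    flags: str


def parse_line(line: str) -> Optional[WinsEntry]:
    """Parse one record line; return None for anything that does not fit."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    *addresses, flags = match.group("rest").split()
    if not addresses:
        return None
    return WinsEntry(
        match.group("name"),
        int(match.group("type"), 16),
        int(match.group("stamp")),
        addresses,
        flags,
    )


def entries_with_address(lines: Iterable[str], address: str) -> List[WinsEntry]:
    """Return every parsed record that lists the given IPv4 address."""
    found = []
    for line in lines:
        entry = parse_line(line)
        if entry is not None and address in entry.addresses:
            found.append(entry)
    return found


# The five records from the grep output above.
SAMPLE = '''
"SCHULE#1b" 1511010435 10.0.0.1 192.168.100.222 64R
"ISERV#00" 1511011223 10.0.0.1 192.168.100.222 192.168.178.79 192.168.0.1 66R
"ISERV#03" 1511011223 10.0.0.1 192.168.100.222 192.168.178.79 192.168.0.1 66R
"ISERV#20" 1511011223 10.0.0.1 192.168.100.222 192.168.178.79 192.168.0.1 66R
"SCHULE#1c" 1511011223 10.0.0.1 192.168.100.222 192.168.178.79 192.168.0.1 e4R
'''.splitlines()

for entry in entries_with_address(SAMPLE, "192.168.100.222"):
    print(f"{entry.name}<{entry.name_type:02x}> {' '.join(entry.addresses)}")
```

Unlike grep 222, this only matches the address columns, so a timestamp that happens to contain the same digits would not produce a false hit.
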
server ~ # tdbdump wins.tdb | grep -B1 '\\DE'
key(65) = "SCHULE\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\1B"
data(39) = "d\00\01\830\10Z\E0E\0CZ\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\00\00\00\0A\00\00\01\C0\A8d\DE"
--
key(65) = "ISERV\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(47) = "f\00\01\B05\10Z\E0E\0CZ\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\04\00\00\00\0A\00\00\01\C0\A8d\DE\C0\A8\B2O\C0\A8\00\01"
--
key(65) = "ISERV\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\03"
data(47) = "f\00\01\B05\10Z\E0E\0CZ\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\04\00\00\00\0A\00\00\01\C0\A8d\DE\C0\A8\B2O\C0\A8\00\01"
--
key(65) = "ISERV\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00 "
data(47) = "f\00\01\B05\10Z\E0E\0CZ\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\04\00\00\00\0A\00\00\01\C0\A8d\DE\C0\A8\B2O\C0\A8\00\01"
--
key(65) = "SCHULE\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\1C"
data(47) = "\E4\00\01\B05\10Z\E0E\0CZ\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\04\00\00\00\0A\00\00\01\C0\A8d\DE\C0\A8\B2O\C0\A8\00\01"
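
The addresses can also be read back out of the tdbdump data values. The following Python sketch assumes that tdbdump prints non-printable bytes as a backslash plus two hex digits, and that each record ends with a little-endian 32-bit address count followed by that many 4-byte IPv4 addresses. Both assumptions are drawn only from the dumps above (the decoded lengths match the data(39) and data(47) sizes), not from the Samba source, and the function names are hypothetical.

```python
import ipaddress
from typing import List


def unescape_tdbdump(text: str) -> bytes:
    """Turn a tdbdump-style string (\\XX hex escapes) back into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\":
            # Backslash followed by exactly two hex digits.
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


def trailing_addresses(raw: bytes) -> List[str]:
    """Heuristic: find n such that the record ends with uint32(n) + n IPv4s."""
    n = 1
    while len(raw) - 4 * n - 4 >= 0:
        start = len(raw) - 4 * n
        if int.from_bytes(raw[start - 4:start], "little") == n:
            return [
                str(ipaddress.IPv4Address(raw[offset:offset + 4]))
                for offset in range(start, len(raw), 4)
            ]
        n += 1
    return []


# data(39) value of the SCHULE<1b> record from the dump above.
DATA = r"d\00\01\830\10Z\E0E\0CZ\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\00\00\00\0A\00\00\01\C0\A8d\DE"

raw = unescape_tdbdump(DATA)
print(len(raw), trailing_addresses(raw))
```

For the SCHULE<1b> record this yields 10.0.0.1 and 192.168.100.222, the same two addresses that wins.dat lists for "SCHULE#1b", so the trailing \C0\A8d\DE bytes are the stale IP.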

I have found several discussions of this issue, but apparently it hasn't been filed as a bug here yet:

https://forums.gentoo.org/viewtopic-t-726371.html
https://lists.samba.org/archive/samba/2005-February/100027.html
https://oioki.ru/2011/03/there-is-already-a-domain-master-browser/

I think this constitutes a bug in the Samba code; IMO the domain master browser lookup shouldn't use any cache at all.