Bug 7802 - cannot join machines to AD anymore, machine account is not created
Status: RESOLVED FIXED
Product: Samba 4.0
Classification: Unclassified
Component: AD: LDB/DSDB/SAMDB
Version: unspecified
Hardware: x86 Linux
Importance: P3 major
Target Milestone: ---
Assigned To: Andrew Bartlett
QA Contact: samba4-qa@samba.org
Reported: 2010-11-19 09:21 UTC by Robert Clauff
Modified: 2010-12-15 18:10 UTC

Attachments
dump of xp machine attempting to join DC (66.13 KB, application/octet-stream)
2010-11-19 09:31 UTC, Robert Clauff
Description Robert Clauff 2010-11-19 09:21:10 UTC
I have had Samba 4 running as our DC for a while now, and joining machines has worked fine up until last night.  We had a new employee start, and when I joined his PC to the domain it gave me an acknowledgment of the join, but after I restarted and tried to log in it said that it couldn't find the username.  The logs told me it couldn't find the computer account, and when I looked on the DC the machine account had not been created.  I have tried this a few different ways.  I also tried creating the account with the UUID included, and it told me that it couldn't because there was already an account with that name, yet it's nowhere to be found.  I will do a capture of the join and post it momentarily.
Comment 1 Robert Clauff 2010-11-19 09:30:36 UTC
I will also tell you that this PC has already been on the domain before with no problems.  I have attached a capture of the initial join as well as the attempted login after the restart.
Comment 2 Robert Clauff 2010-11-19 09:31:47 UTC
Created attachment 6076
dump of xp machine attempting to join DC
Comment 3 Robert Clauff 2010-11-19 11:01:59 UTC
So I have already found a resolution to the problem.  I had an inkling, so I checked the backup DC, which also runs Samba 4.  It turns out that when I removed the machine account on the primary, it was not removed on the backup.  Once I removed the accounts from the backup DC I could rejoin the PCs to the domain.  That being said, I was wondering whether there are still some bugs in the replication process, because if so I will disable the backup DC.  The backup's job isn't really that intense and it should just keep replicating the primary, but I feel that something isn't quite right about it.  Any thoughts?
Comment 4 Matthias Dieter Wallnöfer 2010-11-19 13:02:51 UTC
Well, is your problem then actually about replication? If so, we would appreciate detailed information on where the breakage started to happen.
Comment 5 Robert Clauff 2010-11-23 10:49:02 UTC
After a while it almost seems to time out, as it gives replication errors, but if I restart it, the replication process starts again.  Once restarted, it starts failing to remove objects that have been deleted.  I believe something is not quite right with the replication process on this box.  Let me know if you want the logs.
Comment 6 Matthias Dieter Wallnöfer 2010-11-29 03:50:51 UTC
abartlet,

do you think it's helpful to get the replication logs?
Comment 7 Matthias Dieter Wallnöfer 2010-12-03 04:07:51 UTC
I've talked to tridge (DRS developer) and he says that the outputs from "samba-tool drs showrepl MACHINE" on both DCs would be appreciated.
Comment 8 Robert Clauff 2010-12-03 13:13:49 UTC
I do not have the samba-tool in my version.  Is that just something separate that I can install by itself or should I wait til after I upgrade provision next week?


> I've talked to tridge (DRS developer) and he says that the outputs from
> "samba-tool drs showrepl MACHINE" on both DCs would be appreciated.
> 

Comment 9 Matthias Dieter Wallnöfer 2010-12-03 14:19:07 UTC
It's probably still called "net" in your release (it was renamed later).

So simply try "net drs showrepl <MACHINENAME>". The "net" tool should be located in the source tree under "source4/bin" or under "/usr/local/samba/bin".
Comment 10 Robert Clauff 2010-12-03 14:46:26 UTC
For identification purposes my PDC is "thesun" and my BDC is "themoon"
These are the replies I get. From the PDC I get:

Password for [administrator@CAS-ONLINE.COM]:
Default-First-Site-Name\THEMOON
DSA Options: (none)
Site Options: (none)
DSA object GUID: 7841bea0-82aa-4a83-a397-77c358ce01bc
DSA invocationID: 79c26bb6-f646-431d-84a3-f8d323bcc3c2

==== INBOUND NEIGHBORS ====

CN=Configuration,DC=cas-online,DC=com
	Default-First-Site-Name\THESUN via RPC
		DSA object GUID: 67ff4849-bce6-4702-bc1f-1c6738362630
		Last attempt @ Fri Dec  3 14:33:32 2010 CST was successful.
		0 consecutive failure(s).
		Last success @ Fri Dec  3 14:33:32 2010 CST

DC=cas-online,DC=com
	Default-First-Site-Name\THESUN via RPC
		DSA object GUID: 67ff4849-bce6-4702-bc1f-1c6738362630
		Last attempt @ Fri Dec  3 14:33:34 2010 CST was successful.
		0 consecutive failure(s).
		Last success @ Fri Dec  3 14:33:34 2010 CST

CN=Schema,CN=Configuration,DC=cas-online,DC=com
	Default-First-Site-Name\THESUN via RPC
		DSA object GUID: 67ff4849-bce6-4702-bc1f-1c6738362630
		Last attempt @ Fri Dec  3 14:33:38 2010 CST was successful.
		0 consecutive failure(s).
		Last success @ Fri Dec  3 14:33:38 2010 CST

==== OUTBOUND NEIGHBORS ====
DsReplicaGetInfo failed - DCERPC fault 0x000006d8.
return code = -1
DsReplicaGetInfo() failed for DRSUAPI_DS_REPLICA_INFO_KCC_DSA_CONNECT_FAILURES


From the BDC I get :

Password for [CASINC\root]:
Failed to bind to uuid e3514235-4b06-11d1-ab04-00c04fc2dcd2 - NT_STATUS_NET_WRITE_FAULT
Failed to connect to server: NT_STATUS_NET_WRITE_FAULT
return code = -1

As you can see it has obviously synced successfully.  Why is it prompting me for a different username on the BDC than the PDC?
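
For readers comparing the two reports, here is a minimal Python sketch that pulls the per-partition failure counts out of the INBOUND NEIGHBORS section of a `showrepl` report like the one above. The function name `inbound_failures` and the `SAMPLE` excerpt are illustrative only (they are not part of Samba), and the parser assumes the layout shown in this comment: an unindented naming context, an indented "<site>\<DSA> via RPC" line, then an indented "N consecutive failure(s)." line. Other Samba versions may format the report differently.

```python
import re

# Excerpt of the report posted in this comment; tabs replaced by spaces.
SAMPLE = """\
Default-First-Site-Name\\THEMOON
DSA Options: (none)

==== INBOUND NEIGHBORS ====

CN=Configuration,DC=cas-online,DC=com
    Default-First-Site-Name\\THESUN via RPC
        DSA object GUID: 67ff4849-bce6-4702-bc1f-1c6738362630
        Last attempt @ Fri Dec  3 14:33:32 2010 CST was successful.
        0 consecutive failure(s).
        Last success @ Fri Dec  3 14:33:32 2010 CST

DC=cas-online,DC=com
    Default-First-Site-Name\\THESUN via RPC
        DSA object GUID: 67ff4849-bce6-4702-bc1f-1c6738362630
        Last attempt @ Fri Dec  3 14:33:34 2010 CST was successful.
        0 consecutive failure(s).
        Last success @ Fri Dec  3 14:33:34 2010 CST

==== OUTBOUND NEIGHBORS ====
DsReplicaGetInfo failed - DCERPC fault 0x000006d8.
"""

FAILURES = re.compile(r"(\d+) consecutive failure\(s\)\.")


def inbound_failures(report):
    """Return (naming_context, source_dsa, failure_count) for each inbound neighbor."""
    results = []
    in_inbound = False
    naming_context = source = None
    for line in report.splitlines():
        if line.startswith("===="):
            # Section banner: only the INBOUND section is parsed.
            in_inbound = "INBOUND NEIGHBORS" in line
            continue
        if not in_inbound or not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line inside the section names the partition.
            naming_context = line.strip()
        elif " via " in line:
            # e.g. "Default-First-Site-Name\THESUN via RPC"
            source = line.strip().split(" via ")[0]
        else:
            match = FAILURES.search(line)
            if match:
                results.append((naming_context, source, int(match.group(1))))
    return results


for nc, dsa, count in inbound_failures(SAMPLE):
    print(f"{nc} <- {dsa}: {count} consecutive failure(s)")
```

Any row with a non-zero count would be the place to start looking in the level 10 logs. Note that this only reads the inbound section; the OUTBOUND section in the report above failed with a DCERPC fault and carries no counts to parse.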

(In reply to comment #9)
> It's probably still called "net" in your release (it has been renamed
> afterwards).
> 
> So simply try "net drs showrepl <MACHINENAME>". The "net" tool should be
> located in the source tree under "source4/bin" or under "/usr/local/samba/bin".
> 

Comment 11 Matthias Dieter Wallnöfer 2010-12-03 14:50:08 UTC
Well, I'm no expert on such questions, but I will get tridge to comment on it.
Comment 12 Robert Clauff 2010-12-08 16:52:43 UTC
OK, so we had another predicament today. First off, no one could connect to their Samba shares on our fileserver until I finally restarted Samba on our PDC. The fileserver uses the PDC as a password server, but all the accounts are managed locally. Not sure about that one. Is there a setting that Samba 4 needs in order to accept the fileserver's password queries from Samba 3?

That was the start of the morning, and later on people were unable to connect again. I restarted samba on the fileserver and on samba.  I restarted one of the XP boxes that I was working on and, lo and behold, it couldn't find the DC.  When I examined the AD, the PC wasn't there, which made me think back to the BDC issue and suspect that it had somehow removed it.  So I went back to the BDC, which I thought I had stopped, but I think I left it running while we worked this out; I have now stopped Samba on it and shut it down.  How do I stop the PDC from replicating or wanting to replicate?  I want to fix the issues on my PDC before I deal with these issues.  Plus I need to stop the replication because I am probably going to do an upgrade provision in the next couple of days.

I know it's a lot, but it would help me out greatly.  If you need to look at the Samba logs for the fileserver or the PDC, let me know and I can attach them.
Comment 13 Andrew Bartlett 2010-12-08 18:53:51 UTC
As this is more 'technical support' than a specific identifiable bug, I think you might get more help on the samba-technical list.

In particular, we seem to have a number of issues, but you should start by running the most current version of Samba from Git.  (You don't mention what version you are running.)

Also, when using samba-tool or net, specify the username with -Uadministrator, so you don't get the failure to find a root user (there is none). 
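
As a rough illustration of that advice, the Python sketch below (assuming Python 3.8+) builds the `drs showrepl` command line with an explicit `-U` user so the tool does not fall back to the local login name, as happened with `CASINC\root` on the BDC. The helper names `showrepl_command` and `run_showrepl` are made up for this example; the `tool` parameter covers older releases where the binary is still called `net`, as noted in comment 9, and the host names are the ones from comment 10.

```python
import shlex
import subprocess


def showrepl_command(dc_host, username="administrator", tool="samba-tool"):
    """Build the argv for `<tool> drs showrepl <dc_host> -U<username>`."""
    return [tool, "drs", "showrepl", dc_host, f"-U{username}"]


def run_showrepl(dc_host, username="administrator", tool="samba-tool"):
    """Run the command; the tool itself prompts for the password on the terminal."""
    return subprocess.run(showrepl_command(dc_host, username, tool), check=False)


# Print the command lines for both DCs without executing anything.
for host in ("thesun", "themoon"):
    print(shlex.join(showrepl_command(host)))

# Older releases, where the tool is still named "net":
print(shlex.join(showrepl_command("thesun", tool="net")))
```

Calling `run_showrepl("thesun")` on a machine with the Samba 4 binaries in `PATH` would execute the first command; whether the tool lives under "source4/bin" or "/usr/local/samba/bin" depends on the installation.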

Unless your provision is particularly old, upgradeprovision should not be required to move to a new version of Samba4. 

Then, when debugging what issues still occur, please take network captures and show the level 10 logs of the failure.  Do the things one at a time (fixing replication, or showing the logon failure).   The more detail the better normally, as otherwise we will waste round-trips asking for it. 

Also, stop the BDC entirely.  Debug the logon failure with only one server, which you know contains that machine account. 

Andrew Bartlett
Comment 14 Robert Clauff 2010-12-15 18:10:20 UTC
The original problem was resolved on this bug so I am closing this one out.