In a setup where the domain configured in smb.conf has an outbound trust to another domain, authentication of users belonging to the outbound trusted domain sometimes fails. The problem is that if the winbindd child process attached to the outbound trusted domain goes offline (because the DC was unreachable, for instance), then the function winbindd_can_contact_domain in winbindd_util.c prevents the domain from coming back online.
I was working with Ravindra on this case. Basically the domain in which Samba is a member (let's say it's domain ROOT) has an outbound trust to domain OUTTRUST. At the beginning everything is ok and users can authenticate, but once the domain goes offline, the test we have in winbindd_can_contact_domain in winbindd_util.c makes it impossible for the domain to go back online. My proposal would be to prevent outbound-trust-only domains from going offline. Any remarks?
Ok, can you explain to me exactly what in winbindd_can_contact_domain() is not working right ? I'm missing the details :-). Thanks, Jeremy.
Jeremy, winbindd_can_contact_domain has this test:

    if (!IS_DC && domain->active_directory &&
        ((tdc->trust_flags & NETR_TRUST_FLAG_INBOUND) != NETR_TRUST_FLAG_INBOUND))

This prevents a domain with only an outbound trust from going back online: winbindd_dual_pam_auth_crap calls cm_connect_netlogon, which calls init_dc_connection_rpc, which calls init_dc_connection_network, which calls winbindd_can_contact_domain, which returns false. That forces the status to NT_STATUS_OK because of this code:

    if (!winbindd_can_contact_domain(domain)) {
        invalidate_cm_connection(&domain->conn);
        domain->initialized = True;
        return NT_STATUS_OK;
    }

The domain is not put back online, so pam_auth_crap fails. Maybe we weren't clear: it's not that winbindd_can_contact_domain is not working right. For me the bug is that we shouldn't take offline a domain that has only an outbound trust, since it can't be put back online. Jeremy, does that make more sense?
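The failure path described in this comment can be modelled with a small sketch. This is not the Samba source: the struct, the is_dc parameter, the SKETCH_TRUST_* values and the sketch_ function names are hypothetical stand-ins, chosen only to show how an early "success" return can leave an outbound-only domain offline.

```c
#include <assert.h>   /* used by the usage example in the note below */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not copied from Samba headers. */
#define SKETCH_TRUST_INBOUND  0x1u
#define SKETCH_TRUST_OUTBOUND 0x2u

/* Hypothetical, heavily reduced stand-in for the winbindd domain state. */
struct sketch_domain {
    bool active_directory;
    bool online;
    bool initialized;
    uint32_t trust_flags;
};

/* Models the quoted test: returns false when we are not a DC, the domain
 * is Active Directory, and the inbound trust flag is not set. */
bool sketch_can_contact_domain(bool is_dc, const struct sketch_domain *d)
{
    if (!is_dc && d->active_directory &&
        (d->trust_flags & SKETCH_TRUST_INBOUND) != SKETCH_TRUST_INBOUND) {
        return false;
    }
    return true;
}

/* Models the quoted early return: 0 ("OK") is reported even though no
 * connection was attempted, so the online flag is never set. */
int sketch_init_dc_connection_network(bool is_dc, struct sketch_domain *d)
{
    if (!sketch_can_contact_domain(is_dc, d)) {
        d->initialized = true;
        return 0; /* success reported, d->online left untouched */
    }
    d->online = true; /* stand-in for a successful connection attempt */
    return 0;
}
```

In this model, calling sketch_init_dc_connection_network(false, &d) on an Active Directory domain whose trust_flags hold only SKETCH_TRUST_OUTBOUND returns 0 but leaves d.online false, which is the situation the caller then trips over.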
Ok, I think we still need to put an outgoing-trust only domain offline if we can't contact it - otherwise winbindd will jam up trying to contact a missing set of DCs. I think the bug is that this statement:

    if (!IS_DC && domain->active_directory &&
        ((tdc->trust_flags & NETR_TRUST_FLAG_INBOUND) != NETR_TRUST_FLAG_INBOUND))

isn't a complete set of reality. We need to cope with outgoing-only trusts also.

Jeremy.
Created attachment 7704
Patch for 3.5.x, 3.6.x and master

Ok - I think this is the correct fix. By the time we've gotten to init_dc_connection_network() we shouldn't be second guessing the caller by calling winbindd_can_contact_domain() *AT ALL*.

The fact a valid domain struct came into this function should be enough to force a connection attempt.

Matthieu and Ravindra, can you test this please?

Thanks!

Jeremy.
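The idea behind the patch, as described in this comment, can be sketched as follows. This is an illustration only and not the contents of attachment 7704: fix_domain, dc_reachable and fix_init_dc_connection_network are hypothetical names, and the real connection logic is reduced to a single flag. The point is that the connection routine checks nothing about trust direction and always attempts to connect, so the domain still goes offline when the DC is missing but can recover once it is reachable again.

```c
#include <assert.h>   /* used by the usage example in the note below */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced domain state for illustration. */
struct fix_domain {
    bool online;
    bool dc_reachable; /* stands in for the outcome of a real connect attempt */
};

/* A valid domain struct is enough to force a connection attempt; there is
 * no trust-direction check at this level. Returns 0 on success, -1 on
 * failure (both return values are assumptions of this sketch). */
int fix_init_dc_connection_network(struct fix_domain *d)
{
    if (d == NULL) {
        return -1;
    }
    d->online = d->dc_reachable; /* offline if unreachable, online otherwise */
    return d->online ? 0 : -1;
}
```

With this shape, a domain that went offline while dc_reachable was false comes back online on the next call after dc_reachable becomes true, regardless of whether its trust is inbound, outbound or both.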
Comment on attachment 7704
Patch for 3.5.x, 3.6.x and master

Michael, please comment also.
(In reply to comment #5)
> Created attachment 7704
> Patch for 3.5.x, 3.6.x and master
>
> Ok - I think this is the correct fix. By the time we've gotten to
> init_dc_connection_network() we shouldn't be second guessing the caller by
> calling winbindd_can_contact_domain() *AT ALL*.
>
> The fact a valid domain struct came into this function should be enough to
> force a connection attempt.
>
> Matthieu and Ravindra can you test this please ?
>
> Thanks !
>
> Jeremy.

I'm still on vacation somewhere in France, without much will to look at it :-D. I'll be back around the 20th of July. Ravindra, can you test? Otherwise I'll take a deeper look by the end of next week.
The earlier setup has been disturbed; I will test the patch as soon as I have restored it. But shouldn't winbindd_can_contact_domain() return true for an outbound trust as well? Thanks!
No. The function is badly named. It really should be "winbindd_can_do_samr_operation_on_domain()" instead. It's actually checking if you're allowed to do SAMR query ops, not connection at all. Jeremy.
Comment on attachment 7704
Patch for 3.5.x, 3.6.x and master

ACK.
Re-assigning to Karolin for inclusion in 3.6.next and 3.5.next. Jeremy.
Pushed to v3-6-test and v3-5-test. Closing out bug report. Thanks!