Bug 11769 - net ads join -k kerberos authentication is not site-aware
Summary: net ads join -k kerberos authentication is not site-aware
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Tools
Version: 4.3.5
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks: 11975
Reported: 2016-03-02 23:05 UTC by Uri Simchoni
Modified: 2016-06-22 11:05 UTC

See Also:


Attachments
git-am fix for 4.4.0 and 4.3.next (5.52 KB, patch)
2016-03-08 20:41 UTC, Uri Simchoni
uri: review? (jra)
uri: review? (gd)
asn: review+

Description Uri Simchoni 2016-03-02 23:05:22 UTC
An AD member is joined to a domain using the "net ads join" command, and adding the "-k" switch causes all authentication to use Kerberos. The documented procedure is to run the join before starting winbindd, so the winbindd Kerberos locator is not yet operational at this stage. As a result, the process of finding a KDC is not site-aware, and an off-site KDC may be contacted.

The process of finding a DC for creating the machine account (via SMB/ldap) *is* site-aware, so once there's a service ticket to that DC, everything continues in a site-aware manner.

At first glance this does not appear to be a significant issue, since joining the domain is a one-time operation. However, the site-unaware lookup can prolong ticket acquisition to the point where the whole operation fails.

It appears to be customary in some enterprises to block (drop) communication between sites, so while off-site DCs appear in DNS records, they are not reachable. A UDP Kerberos exchange fails after a few seconds (depending on the Kerberos library), and a TCP handshake takes longer to fail because the typical OS TCP connect timeout when SYN packets are dropped is ~15 seconds.

In one enterprise with 70-80 DCs across multiple sites, it has taken more than two minutes to obtain the service ticket. However, since smbd starts obtaining the service ticket only after it has contacted the (on-site) DC and done SMB2 negotiation, the DC drops the connection after 60 seconds (an established TCP connection past the negotiate phase with no session-setup attempted). This fails the join even if the user is willing to wait the 2 minutes (which they might not be, since this could all be wrapped in a shiny REST API and a GUI).
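
For illustration only, the arithmetic behind the failure can be sketched as follows. This is a back-of-the-envelope model, not Samba code: it assumes the Kerberos library tries KDCs strictly one after another and that every unreachable KDC costs a full TCP connect timeout. The ~15 second timeout and the 60 second DC limit are the figures from this report; the function names and the count of unreachable KDCs tried are hypothetical.

```python
def worst_case_ticket_delay(unreachable_kdcs: int,
                            per_kdc_timeout_s: float = 15.0) -> float:
    """Seconds lost before a reachable KDC is tried.

    Assumption: attempts are sequential and each unreachable KDC
    consumes the whole timeout (dropped SYN packets, no RST).
    """
    return unreachable_kdcs * per_kdc_timeout_s


def join_survives(unreachable_kdcs: int,
                  per_kdc_timeout_s: float = 15.0,
                  dc_idle_limit_s: float = 60.0) -> bool:
    """True if the ticket arrives before the DC drops the idle SMB connection."""
    delay = worst_case_ticket_delay(unreachable_kdcs, per_kdc_timeout_s)
    return delay < dc_idle_limit_s


if __name__ == "__main__":
    for n in (1, 3, 4, 8):
        print(n, worst_case_ticket_delay(n), join_survives(n))
```

Under these assumptions, four unreachable KDCs in a row are already enough to hit the 60 second limit, and eight account for the two minutes observed.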

On the other hand, if we make the process site-aware, we first find an on-site DC using CLDAP. This could take a few seconds because of the firewall, but no SMB connection is open at this stage.
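
To make the difference concrete, here is a minimal sketch of the DNS SRV names that a site-aware KDC lookup would prefer over a site-unaware one. It is not taken from the attached patch; it only follows the Active Directory naming convention for DC locator records. The helper name kdc_srv_names and the example realm and site are hypothetical.

```python
from typing import List, Optional


def kdc_srv_names(realm: str, site: Optional[str] = None) -> List[str]:
    """Return SRV names to query for KDCs, most specific first.

    With a site name, the site-scoped record comes first so that
    on-site DCs are tried before the domain-wide list.
    """
    domain = realm.lower().rstrip(".")
    names = []
    if site:
        # Site-scoped record: only DCs registered for this site.
        names.append(f"_kerberos._tcp.{site}._sites.dc._msdcs.{domain}")
    # Domain-wide record: every DC, on-site or not.
    names.append(f"_kerberos._tcp.dc._msdcs.{domain}")
    return names


if __name__ == "__main__":
    print(kdc_srv_names("EXAMPLE.COM"))
    print(kdc_srv_names("EXAMPLE.COM", site="Branch1"))
```

Without a site name only the domain-wide record is returned, which is the situation described above: every DC in the domain is a candidate, reachable or not.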
Comment 1 Uri Simchoni 2016-03-08 20:41:59 UTC
Created attachment 11905
git-am fix for 4.4.0 and 4.3.next
Comment 2 Andreas Schneider 2016-03-10 15:37:24 UTC
Comment on attachment 11905
git-am fix for 4.4.0 and 4.3.next

LGTM
Comment 3 Uri Simchoni 2016-03-10 17:51:22 UTC
Assigning to Karolin for inclusion in 4.4.0 and 4.3.next
Comment 4 Karolin Seeger 2016-03-14 08:51:31 UTC
(In reply to Uri Simchoni from comment #3)
Pushed to autobuild-v4-[4|3]-test.
Comment 5 Karolin Seeger 2016-03-21 11:36:35 UTC
(In reply to Karolin Seeger from comment #4)
Pushed to both branches.
Closing out bug report.

Thanks!
Comment 6 Uri Simchoni 2016-06-18 17:15:51 UTC
(In reply to Karolin Seeger from comment #5)
It seems the fix only made it to the 4.4.x branch. This is consistent with the release notes.

4.3.x is now in maintenance so I'm not going to push for a fix there. FWIW the patch still applies cleanly to v4-3-stable at the time of this writing.
Comment 7 Karolin Seeger 2016-06-20 07:59:25 UTC
(In reply to Uri Simchoni from comment #6)
That's strange... Sorry!
Pushed to autobuild-v4-3-test.
Comment 8 Karolin Seeger 2016-06-22 11:05:06 UTC
(In reply to Karolin Seeger from comment #7)
Finally ended up in v4-3-test.
Closing out bug report.

Thanks!