Bug 8955 - NetrServerPasswordSet2 timeout is too short
Product: Samba 3.6
Classification: Unclassified
Component: Winbind
Hardware: All All
Importance: P5 normal
Assigned To: Karolin Seeger
QA Contact: Samba QA Contact
Reported: 2012-05-24 12:22 UTC by Alan Ford
Modified: 2013-09-16 07:12 UTC (History)

Patches for v3-6-test (4.20 KB, patch)
2013-09-12 07:08 UTC, Stefan Metzmacher
metze: review? (vl)
ambi: review+

Description Alan Ford 2012-05-24 12:22:51 UTC
Scenario: using winbindd to manage a machine account against a Windows AD server.

In busy AD deployments, it can take over 20 seconds for an AD server to respond to the NetrServerPasswordSet2 netlogon message for changing the machine account password.

In winbindd, NT_STATUS_IO_TIMEOUT fires after exactly 10 seconds.

This leaves the two sides in an inconsistent state: winbindd does not update its local password, but the AD server has already updated its copy, so future preauthentication attempts fail.

The simplest solution to this issue is to have a longer timeout (30 seconds?).
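To make the failure mode concrete, here is a toy simulation (not Samba code). The function name and the state dictionary are made up for illustration, and the delays are scaled down by a factor of 100, so 0.2 stands for a 20 second server response and 0.1 for the 10 second client timeout. It only assumes what is described above: the server applies the change even if the client has given up waiting.

```python
import threading
import time


def run_password_change(server_delay, client_timeout):
    """Simulate one password change; return (client_password, server_password).

    Hypothetical model: the server commits the new password after
    server_delay seconds, whether or not the client is still waiting.
    """
    state = {"client": "old", "server": "old"}
    done = threading.Event()

    def server():
        time.sleep(server_delay)
        state["server"] = "new"  # server commits regardless of the client
        done.set()

    worker = threading.Thread(target=server)
    worker.start()

    # The client only stores the new password if the reply arrives in time.
    if done.wait(client_timeout):
        state["client"] = "new"

    worker.join()
    return state["client"], state["server"]
```

With a client timeout shorter than the server delay, the two sides end up with different passwords; with a longer timeout they agree.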

However, there is also a more complex and potentially more robust solution. Windows clients do not have this issue against the same AD server, despite the long NetrServerPasswordSet2 response time. One reason for this is (I believe) that they update the local machine account password immediately, rather than waiting for the AD server's response. If the new password fails on the next preauthentication attempt, the client reverts to using the previous password.

Given that winbindd stores both the current and the previous password, could it behave in this way instead? That is, update its local password when sending the request, and try both passwords when one is next required (falling back to the previous password if necessary).
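A rough sketch of that proposal, again as a toy model rather than winbindd code. `MachineSecrets`, `change_password`, `authenticate`, and the `send_request` / `try_auth` callables are all hypothetical names introduced here; the real secrets storage and netlogon calls look different.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MachineSecrets:
    """Hypothetical stand-in for the locally stored machine account secrets."""
    current: str
    previous: Optional[str] = None


def change_password(secrets: MachineSecrets, new_password: str,
                    send_request: Callable[[str], None]) -> None:
    """Store the new password locally before sending the change request."""
    secrets.previous = secrets.current
    secrets.current = new_password
    try:
        send_request(new_password)
    except TimeoutError:
        # Outcome unknown: keep both passwords and let the next
        # authentication attempt find out which one the server accepts.
        pass


def authenticate(secrets: MachineSecrets,
                 try_auth: Callable[[str], bool]) -> Optional[str]:
    """Try the current password first, then fall back to the previous one."""
    if try_auth(secrets.current):
        return secrets.current
    if secrets.previous is not None and try_auth(secrets.previous):
        # The server never applied the change, so revert locally.
        secrets.current, secrets.previous = secrets.previous, None
        return secrets.current
    return None
```

In this model a timed-out request no longer matters: whichever password the server ended up with is found on the next authentication attempt.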
Comment 1 Stefan Metzmacher 2012-05-24 13:24:27 UTC
Windows also uses the old password for an hour or so;
I think winbindd should do the same. This way it works
without having to care about replication latency.

In other places we use a timeout of 35 seconds;
I think we should use that here as well.
Comment 2 Stefan Metzmacher 2013-05-10 20:22:42 UTC
This is fixed with commit b9a15f1bfad30a824f9ec87bc9f7c65adf50dae0
in 4.0 and master.
Comment 3 Christian Ambach 2013-05-10 21:08:41 UTC
A similar problem exists during net ads join. The timeout there also needs to be increased. I already have a patch for this, but I am not sure whether it has landed in master yet.
Comment 4 Christian Ambach 2013-05-10 21:11:05 UTC
See 9755541ed156d71df98607375ee3b925266c3c74 for the net ads join piece.
Comment 5 Stefan Metzmacher 2013-09-12 07:08:19 UTC
Created attachment 9205
Patches for v3-6-test
Comment 6 Karolin Seeger 2013-09-16 07:12:01 UTC
Pushed to v3-6-test.
Closing out bug report.