Bug 13158 - The "10 hour problem"
Status: NEW
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.7.2
Hardware: x64 FreeBSD
Importance: P5 normal
Target Milestone: ---
Assigned To: Samba QA Contact
QA Contact: Samba QA Contact
Depends on:
Blocks:
Reported: 2017-11-21 17:34 UTC by Peter Eriksson
Modified: 2017-11-22 10:29 UTC
CC List: 3 users

See Also:


Attachments
smb.conf file (2.86 KB, text/plain)
2017-11-21 17:34 UTC, Peter Eriksson

Description Peter Eriksson 2017-11-21 17:34:57 UTC
Created attachment 13800
smb.conf file

We are seeing refused or very slow logins (more than 10 seconds, sometimes minutes or outright failures) on our fileservers for a minute or two exactly every 10 hours after the smbd/winbindd daemons were last restarted.

(We used to restart our smbd/winbindd daemons at 4am every night, which caused problems and bug reports from our users at 2pm every day. We have since moved the restart to 7am so that we can at least avoid having this problem during the daytime.)

My guess is that this is related to the 10-hour default lifetime of Kerberos service tickets, and that something goes wrong or takes a long time when winbindd(?) tries to renew the ticket.
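
One way to test this guess against data is to check whether slow logins cluster just after whole multiples of 10 hours since the last daemon restart. The following is only a minimal Python sketch: the 10-hour period comes from the guess above, while the function name, the 2-minute tolerance and any timestamps fed into it are illustrative assumptions, not something taken from Samba or from our logs.

```python
from datetime import datetime, timedelta

# Assumed period: the default Kerberos ticket lifetime suspected above.
TICKET_LIFETIME = timedelta(hours=10)
# Hypothetical window: the problem is reported to last "a minute or two".
TOLERANCE = timedelta(minutes=2)


def near_lifetime_multiple(restart, event,
                           period=TICKET_LIFETIME, tolerance=TOLERANCE):
    """Return True if `event` falls within `tolerance` after a whole
    multiple of `period` counted from `restart`."""
    elapsed = event - restart
    if elapsed < period:
        # The first suspect window only opens one full period after restart.
        return False
    # timedelta % timedelta gives the time since the most recent multiple.
    return (elapsed % period) <= tolerance


def suspicious_events(restart, events):
    """Filter a list of slow-login timestamps down to those that line up
    with a multiple of the ticket lifetime."""
    return [e for e in events if near_lifetime_multiple(restart, e)]
```

With a restart at 04:00, a slow login at 14:01 the same day is flagged, which matches the 2pm reports described above; a slow login at 12:00 is not. If most slow logins are flagged, the ticket-lifetime explanation becomes more plausible; if they are spread evenly, it probably is something else.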

Our setup: Samba 4.7.2 on FreeBSD 11.1 servers. Joined to a Windows 2012 AD domain (six AD servers).

What makes the problem even more fun to pinpoint is that we don't _always_ see it (or perhaps we just miss it). We have a login latency testing system that measures, every minute, how long an smbclient login to each of our four Samba servers takes.
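
For anyone who wants to reproduce the measurement, a probe of this kind can be sketched roughly as below. This is not our actual test system, just a minimal Python illustration: `time_command`, `is_slow` and the 60-second timeout are hypothetical names and values, and the exact smbclient command line is deliberately left to the caller. The 10-second threshold is the "slow login" figure from the description.

```python
import subprocess
import time

# Logins slower than this are counted as a problem (figure from this report).
SLOW_THRESHOLD = 10.0


def time_command(argv, timeout=60.0):
    """Run `argv` once and return (elapsed_seconds, ok).

    `ok` is False if the command exits non-zero, cannot be started,
    or does not finish within `timeout` seconds.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(argv,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL,
                                timeout=timeout)
        ok = result.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        ok = False
    return time.monotonic() - start, ok


def is_slow(elapsed, ok, threshold=SLOW_THRESHOLD):
    """Treat both failed logins and logins above the threshold as slow."""
    return (not ok) or elapsed > threshold
```

Calling `time_command` once a minute (from cron, for example) with the smbclient invocation for each server, and logging the timestamp together with `is_slow(...)`, gives the kind of data needed to see whether the bad minutes line up with the 10-hour mark.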

Ideas? Has anyone else seen something like this?

One wild idea I have is that it is due to slow Kerberos ticket propagation between the AD servers, and that when we see the problem the client and the Samba server happen to be bound to different AD servers.

Attaching our smb.conf file.