Bug 13158 - The "10 hour problem"
Summary: The "10 hour problem"
Status: RESOLVED WORKSFORME
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.7.2
Hardware: x64 FreeBSD
Importance: P5 normal
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-21 17:34 UTC by Peter Eriksson
Modified: 2021-03-02 23:02 UTC
CC List: 3 users

See Also:


Attachments
smb.conf file (2.86 KB, text/plain)
2017-11-21 17:34 UTC, Peter Eriksson

Description Peter Eriksson 2017-11-21 17:34:57 UTC
Created attachment 13800
smb.conf file

We are seeing refused or very slow logins (more than 10 seconds, sometimes minutes or outright failures) on our file servers for a minute or two, exactly every 10 hours after the smbd/winbindd daemons were last restarted.

(We used to restart our smbd/winbindd daemons at 4am every night, which caused problems and bug reports from our users at 2pm every day. We have since moved the restart to 7am so that we can at least avoid having this problem during the daytime.)
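
To make the timing concrete, here is a minimal sketch of the arithmetic, assuming the problem recurs at every multiple of 10 hours after a restart and that the daemons are restarted once every 24 hours. The function name problem_windows and its parameters are made up for this illustration; the date is simply the day this bug was reported.

```python
from datetime import datetime, timedelta


def problem_windows(restart, period_hours=10, horizon_hours=24):
    """Return the times at multiples of period_hours after restart,
    up to (but not including) the next restart horizon_hours later."""
    windows = []
    t = restart + timedelta(hours=period_hours)
    end = restart + timedelta(hours=horizon_hours)
    while t < end:
        windows.append(t)
        t += timedelta(hours=period_hours)
    return windows


# Compare the old 04:00 restart with the new 07:00 restart.
for hour in (4, 7):
    restart = datetime(2017, 11, 21, hour, 0)
    times = [t.strftime("%H:%M") for t in problem_windows(restart)]
    print(f"restart {restart:%H:%M} -> slow logins expected around {', '.join(times)}")
```

Under these assumptions a 04:00 restart gives windows at 14:00 and 00:00, and a 07:00 restart gives 17:00 and 03:00.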

My guess is that this is related to the 10-hour default lifetime of Kerberos service tickets, and that something goes wrong or takes a long time when winbindd(?) tries to renew the ticket.

Our setup: Samba 4.7.2 on FreeBSD 11.1 servers. Joined to a Windows 2012 AD domain (six AD servers).

What makes the problem even more fun to pinpoint is that we don't _always_ see it (or perhaps we just miss it). We have a login latency testing system running that measures, every minute, how long an smbclient takes to log in to each of our four Samba servers.
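
A probe of this kind could look roughly like the sketch below. This is not our actual test system, only an illustration. It assumes smbclient is on the PATH, that a valid Kerberos ticket is available for the -k option, and that each server exports a share called "testshare" (a hypothetical name). The helper names time_command and probe_login are likewise made up, and the 10-second threshold is the login time mentioned above.

```python
import subprocess
import time


def time_command(argv, timeout=60.0):
    """Run argv and return (elapsed_seconds, outcome).

    outcome is "ok", "failed" (non-zero exit status), "timeout",
    or "error" (the command could not be started at all).
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        outcome = "ok" if result.returncode == 0 else "failed"
    except subprocess.TimeoutExpired:
        outcome = "timeout"
    except OSError:
        outcome = "error"
    return time.monotonic() - start, outcome


def probe_login(server, share="testshare", slow_after=10.0):
    """Time one smbclient login; report "slow" above slow_after seconds."""
    # Assumed invocation: Kerberos login (-k), run "exit", then disconnect.
    argv = ["smbclient", f"//{server}/{share}", "-k", "-c", "exit"]
    elapsed, outcome = time_command(argv)
    if outcome == "ok" and elapsed > slow_after:
        outcome = "slow"
    return elapsed, outcome
```

Calling probe_login once a minute for each server (from cron, for example) and logging the result with a timestamp makes it possible to line up slow or failed logins against the time of the last daemon restart.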

Ideas? Anyone else see something like this?

One wild idea I have is that it is due to slow Kerberos ticket propagation between the AD servers, and that when we see the problem the client and the Samba server happen to be bound to different AD servers.

Attaching our smb.conf file.
Comment 1 Björn Jacke 2020-03-15 21:32:30 UTC
The "kerberos method = system keytab" setting might be related to your problem. In case you still have this problem, I think it is too complex to analyze here, which is why nothing has happened on this bug. I recommend looking for alternative support, for example from one of the companies offering Samba support: https://www.samba.org/samba/support/globalsupport.html
Comment 2 Peter Eriksson 2021-03-02 23:02:38 UTC
Just a quick update on this old problem in case someone else does a search and finds this:

For some unknown reason, this issue, which we've had since 2017, suddenly stopped occurring on Nov 6 2020 and hasn't reappeared.

We _think_ it might coincide with the AD guys upgrading their AD servers to Windows Server 2019 (and moving to new hardware), but a lot of stuff happened in those days: servers moving to new locations, new network switches and routers... We didn't upgrade the Samba servers though (except for pointing them to the new AD servers).

Anyway, it's gone now and I hope it stays away... :-)