Bug 13921 - Win10 clients will not stay connected to samba shares for more than a day unless smb.conf contains 'smb encrypt = off'
Status: RESOLVED DUPLICATE of bug 13624
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.8.3
Hardware: x64 Linux
Importance: P5 major
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-30 17:10 UTC by Mason Schmitt
Modified: 2019-07-22 09:39 UTC
CC List: 3 users

See Also:


Attachments
Packet capture - win10 client session timeout with samba 4.8.3 file server - smb encrypt = desired (10.14 KB, application/vnd.tcpdump.pcap)
2019-05-03 06:43 UTC, Mason Schmitt

Description Mason Schmitt 2019-04-30 17:10:20 UTC
Problem:
Each morning, Windows 10 users are unable to access their mapped drives. After rebooting their computers, they are fine for another day. Windows 7 users do not experience this issue.

Configuration:
Samba AD DC, running on Ubuntu 18.04, using the stock samba package (4.7.6)
Samba file server, running on CentOS 7.6, using the stock samba package (4.8.3)


smb.conf on AD DC
--------------------------------

# Global parameters
[global]
        dns forwarder = 10.0.38.1
        netbios name = AD1
        realm = REALM.EXAMPLE.COM
        server role = active directory domain controller
        workgroup = REALM
        idmap_ldb:use rfc2307 = yes

[netlogon]
        path = /var/lib/samba/sysvol/realm.example.com/scripts
        read only = No

[sysvol]
        path = /var/lib/samba/sysvol
        read only = No


krb5.conf on AD DC
------------------------------
[libdefaults]
        default_realm =  REALM.EXAMPLE.COM
        dns_lookup_realm = false
        dns_lookup_kdc = true



smb.conf on file server
----------------------------------

[global]
kerberos method = system keytab
workgroup = REALM
security = ads
realm = REALM.EXAMPLE.COM

# Logging
log file = /var/log/samba/%m.log
log level = 3

idmap config REALM : range = 2000000-2999999
idmap config REALM : backend = rid
idmap config * : range = 10000-999999
idmap config * : backend = tdb

winbind use default domain = no
winbind refresh tickets = yes
winbind offline logon = yes
winbind enum groups = no
winbind enum users = no

username map = /etc/samba/user.map
bind interfaces only = yes
interfaces = lo eth0

vfs objects = acl_xattr
acl_xattr:default acl style = windows
map acl inherit = yes
store dos attributes = yes
template shell = /bin/false
disable netbios = yes
# Problem exists if this is removed too.
smb encrypt = desired
access based share enum = yes
template homedir = /srv/samba/Users/%U
obey pam restrictions = yes

[Users]
        path = /srv/samba/Users
        comment = Share for user home dirs
        guest ok = no
        read only = no

[Shared]
       path = /srv/samba/Shared
       guest ok = no
       read only = no


krb5.conf on file server
------------------------------
[libdefaults]
        default_realm =  REALM.EXAMPLE.COM
        dns_lookup_realm = false
        dns_lookup_kdc = true


--------------------------------------------------------------

Supporting details


For a Windows 10 client that is currently unable to access its mapped drives, a packet capture shows:

PC -> FS - encrypted and signed SMB3 packet with SMB2 TRANSFORM_HEADER showing a session ID of 0x000000005bb17760
FS -> PC - plain text SMB2 packet with the same session ID as above, and an NT Status header that says STATUS_NETWORK_SESSION_EXPIRED (0xc000035c) 
During the 17 seconds of the packet capture, this fruitless exchange happened 18 times across 4 bursts.  I don’t know if the bursts corresponded to me clicking on the share in the Windows UI, but I expect they might.
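
The two message types in this exchange can be told apart from their first bytes. The following is a minimal Python sketch, not something used to produce the capture: it assumes the 4-byte transport length prefix has already been stripped, and the function name `classify_smb2_message` is made up for illustration. The field offsets follow the SMB2 packet header and SMB2 TRANSFORM_HEADER layouts in Microsoft's protocol documentation.

```python
import struct

SMB2_MAGIC = b"\xfeSMB"        # ProtocolId of a plain SMB2 header
TRANSFORM_MAGIC = b"\xfdSMB"   # ProtocolId of an SMB2 TRANSFORM_HEADER
STATUS_NETWORK_SESSION_EXPIRED = 0xC000035C


def classify_smb2_message(data: bytes) -> dict:
    """Return the kind, session ID and (for plain headers) NT status.

    Hypothetical helper: `data` is one SMB2 message without the
    4-byte transport length prefix.
    """
    magic = data[:4]
    if magic == TRANSFORM_MAGIC:
        if len(data) < 52:
            raise ValueError("truncated TRANSFORM_HEADER")
        # Layout: ProtocolId(4) Signature(16) Nonce(16)
        # OriginalMessageSize(4) Reserved(2) Flags(2) SessionId(8)
        (session_id,) = struct.unpack_from("<Q", data, 44)
        return {"kind": "encrypted", "session_id": session_id, "status": None}
    if magic == SMB2_MAGIC:
        if len(data) < 64:
            raise ValueError("truncated SMB2 header")
        # Status is at offset 8, SessionId at offset 40 of the 64-byte header.
        (status,) = struct.unpack_from("<I", data, 8)
        (session_id,) = struct.unpack_from("<Q", data, 40)
        return {"kind": "plain", "session_id": session_id, "status": status}
    raise ValueError("not an SMB2 message")


def is_session_expired(info: dict) -> bool:
    """True for a plain reply carrying STATUS_NETWORK_SESSION_EXPIRED."""
    return info["status"] == STATUS_NETWORK_SESSION_EXPIRED
```

Applied to the two messages described above, a helper like this would be expected to report an encrypted request followed by a plain reply with status 0xc000035c, both carrying the same session ID.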


What the protocol docs say

According to Microsoft, NT Status STATUS_NETWORK_SESSION_EXPIRED (0xc000035c) means: “The client session has expired; so the client must re-authenticate to continue accessing the remote resources.”

According to Microsoft’s SMB2 protocol documentation “If the Status field in the SMB2 header is STATUS_NETWORK_SESSION_EXPIRED, the client MUST attempt to reauthenticate the session that is identified by the SessionId in the SMB2 header, as specified in section 3.2.4.2.3. If the reauthentication attempt succeeds, the client MUST retry the request that failed with STATUS_NETWORK_SESSION_EXPIRED. If the reauthentication attempt fails, the client MUST fail the operation and terminate the session, as specified in section 3.2.4.23.”

Therefore, according to Microsoft’s own protocol documentation, if the re-auth attempt fails, the client MUST fail the operation and terminate the session. So why doesn’t the client give up?
From the quoted section above, we’re referred to section 3.2.4.23. Essentially, I think that section says that the client should send a logoff message to the server with specific fields populated in the SMB2 header.

Even with SMB3 and an encrypted payload, the SMB2 header still appears to be in plain text, so it doesn’t appear to me that the client is following the spec, because I don’t see any of the required fields in the SMB2 header.
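
For reference, the client behaviour required by the quoted passage can be sketched roughly as below. This is only an illustration of the quoted rule, not the Windows client's actual code: `send_request`, `reauthenticate` and `terminate_session` are hypothetical callables standing in for the real protocol operations.

```python
STATUS_SUCCESS = 0x00000000
STATUS_NETWORK_SESSION_EXPIRED = 0xC000035C


def send_with_reauth(send_request, reauthenticate, terminate_session):
    """Send a request, handling an expired session as the quoted text describes.

    All three arguments are hypothetical callables:
      send_request()      -> NT status of the response
      reauthenticate()    -> True if re-authentication succeeded
      terminate_session() -> tears the session down (e.g. sends a logoff)
    """
    status = send_request()
    if status != STATUS_NETWORK_SESSION_EXPIRED:
        return status

    # The session expired: the client must try to re-authenticate it.
    if reauthenticate():
        # On success, retry the request that failed.
        return send_request()

    # On failure, fail the operation and terminate the session.
    terminate_session()
    raise ConnectionError("reauthentication failed; session terminated")
```

Under this reading, a client that keeps resending the same request without re-authenticating or terminating the session takes neither of the two branches.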
Comment 1 Jeremy Allison 2019-04-30 17:51:05 UTC
Can you upload a Wireshark trace showing the problematic timeframe? If you do so, please remember to use a 'burner' account with a throwaway password just in case there's any NTLM in there (you don't want to compromise any real users or passwords).

If we can find an issue with the Windows 10 client behavior we can raise it on your behalf with Microsoft.

Thanks!
Comment 2 Mason Schmitt 2019-05-03 06:43:11 UTC
Created attachment 15118
Packet capture - win10 client session timeout with samba 4.8.3 file server - smb encrypt = desired
Comment 3 Mason Schmitt 2019-05-06 17:20:41 UTC
(In reply to Jeremy Allison from comment #1)
Hi Jeremy,

I'm not sure if you saw the uploaded packet capture file, so I thought I would send you a note.
Comment 4 Rik Theys 2019-05-29 09:13:30 UTC
Hi,

We've enabled 'smb encrypt = desired' this weekend and are now experiencing the same issue on Windows 10 clients and Server 2016 (so far).

Besides rebooting the client, another workaround is to kill the session on the Samba server.

When the debug level is increased for a session, we see the STATUS_NETWORK_SESSION_EXPIRED messages going to the client.

Is it the encryption session that has expired (the key used for AES encryption), or the SMB session on the server itself? Does the AES key lifetime have anything to do with Kerberos ticket renewals?

Our Samba server is running samba-4.8.3-4.el7.x86_64.

Regards,

Rik
Comment 5 Mason Schmitt 2019-05-29 17:03:29 UTC
(In reply to Rik Theys from comment #4)
Hi Rik,

I just noticed that the workaround didn't make its way into this bug report.  As discussed on the Samba mailing list, setting 'smb encrypt = off' works around the problem.

https://lists.samba.org/archive/samba/2019-April/222748.html
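
To check which value a given smb.conf text sets for this parameter, something like the following Python sketch could be used. It is an approximation only: Python's `configparser` is not Samba's parser (for example, it matches the section name case-sensitively), the function name `configured_smb_encrypt` is made up for illustration, and Samba's own `testparm` remains the authoritative check.

```python
import configparser


def configured_smb_encrypt(conf_text: str) -> str:
    """Return the 'smb encrypt' value from [global], or '(not set)'.

    Hypothetical helper; approximates smb.conf parsing with configparser.
    """
    parser = configparser.ConfigParser(
        delimiters=("=",),            # names such as 'idmap config X : range' contain ':'
        comment_prefixes=("#", ";"),  # full-line comments only
        interpolation=None,           # values such as '%m' must not be interpolated
        strict=False,                 # tolerate repeated sections or options
    )
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        return "(not set)"
    return parser.get("global", "smb encrypt", fallback="(not set)")


sample = """\
[global]
workgroup = REALM
log file = /var/log/samba/%m.log
smb encrypt = off
"""
print(configured_smb_encrypt(sample))
```

When the parameter is absent, the helper reports "(not set)", in which case Samba's built-in default applies.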
Comment 6 Rik Theys 2019-06-03 08:36:35 UTC
(In reply to Mason Schmitt from comment #5)

Hi Mason,

Instead of configuring 'smb encrypt = off', we reverted to the default value.

For clients that were already connected and using encryption, the encryption column now seems to list:

partial(AES-128-CCM)

For systems that have connected recently, the column indicates that no encryption is in use.

I assume that the clients currently using partial encryption may revert to no encryption once they have been rebooted.

Regards,
Rik
Comment 7 Stefan Metzmacher 2019-07-22 09:31:58 UTC

*** This bug has been marked as a duplicate of bug 13624 ***
Comment 8 Stefan Metzmacher 2019-07-22 09:39:20 UTC
(In reply to Stefan Metzmacher from comment #7)

Bugs #9175, #13661 are also related.