Bug 15553 - The client can get stuck in a state where it constantly reconnects about 5 times per second
Status: NEW
Alias: None
Product: CifsVFS
Classification: Unclassified
Component: kernel fs
Version: 5.x
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Steve French
QA Contact: cifs QA contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-01-09 14:00 UTC by Ondrej Hubsch
Modified: 2024-01-09 14:30 UTC

See Also:


Attachments
Wireshark screenshot (1.22 MB, image/png)
2024-01-09 14:00 UTC, Ondrej Hubsch
no flags

Description Ondrej Hubsch 2024-01-09 14:00:11 UTC
Created attachment 18224 [details]
Wireshark screenshot

We were doing some stress testing where the SMB server was configured as follows:
- artificial latency of 150 ms while processing SMB session setup, negotiate, or tree connect requests
- artificial latency of 300 ms while processing each SMB request
- a 0.1% probability of disconnecting the connection after responding to an SMB request, i.e. on average 1 out of 1000 SMB packets leads to a disconnect after the response is sent
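For illustration, the fault-injection behaviour described above amounts to something like the following (hypothetical Python sketch; the actual knobs are internal to our test server, and the names here are made up):

```python
import random
import time

SETUP_DELAY_S = 0.150           # session setup / negotiate / tree connect
REQUEST_DELAY_S = 0.300         # every other SMB request
DISCONNECT_PROBABILITY = 0.001  # 0.1% -> on average 1 in 1000 responses

SETUP_COMMANDS = {"NEGOTIATE", "SESSION_SETUP", "TREE_CONNECT"}

def handle_request(command, send_response):
    """Apply the artificial latency, send the response, then decide
    whether to drop the connection. Returns True when the server
    should disconnect after responding."""
    if command in SETUP_COMMANDS:
        time.sleep(SETUP_DELAY_S)
    else:
        time.sleep(REQUEST_DELAY_S)
    send_response()
    return random.random() < DISCONNECT_PROBABILITY
```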

Then we spawn roughly 128 workers, where each one reads a random 20 MB file stored on the server, then picks a different one, and so on.
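The load tool is roughly equivalent to this sketch (illustrative only; the mount point and file layout are assumptions, not the real tool):

```python
import os
import random
from concurrent.futures import ThreadPoolExecutor

def worker(mount_point, iterations):
    # Repeatedly pick a random file on the share and read it end to end.
    files = [os.path.join(mount_point, name) for name in os.listdir(mount_point)]
    for _ in range(iterations):
        path = random.choice(files)
        with open(path, "rb") as fh:
            while fh.read(1024 * 1024):  # read in 1 MiB chunks
                pass

def run_load(mount_point, workers=128, iterations=1000):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker, mount_point, iterations)
                   for _ in range(workers)]
        for future in futures:
            future.result()  # propagate any worker error
```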

We mainly used the 5.14.0-330 and 5.14.0-391 kernels from CentOS Stream 9, and we suspect a similar issue exists in both of them, though it manifests slightly differently. The same issue as in 391 also occurs in 6.6.9 from ELRepo (6.6.9-1.el9.elrepo.x86_64).

On 5.14.0-330, which does not contain https://github.com/torvalds/linux/commit/326a8d04f147e2bf393f6f9cdb74126ee6900607, credits sometimes go negative in DebugData, and eventually the client reaches a state where only keepalives can be observed in Wireshark. The negotiate command seems stuck waiting for credits.

On both 5.14.0-391 and 6.6.9 we can see the client **constantly** reconnecting. Every second or so, the "Instance" field increases by roughly 5. Both the load tool and the cifsiod processes seem stuck in D state. Wireshark shows the following pattern repeating:
- SMB2 negotiate request (10 credits requested)
- SMB2 negotiate response (contains 10 granted credits)
- FIN sent to the server


Some fields in DebugData oscillate slightly; I have highlighted the oscillating fields below. Credits seem permanently stuck at 0,0,0.

Display Internal CIFS Data Structures for Debugging
---------------------------------------------------
CIFS Version 2.45
Features: DFS,SMB_DIRECT,STATS,DEBUG,ALLOW_INSECURE_LEGACY,CIFS_POSIX,UPCALL(SPNEGO),XATTR,ACL,WITNESS
CIFSMaxBufSize: 16384
Active VFS Requests: 0

Servers:
1) ConnectionId: 0x1 Hostname: gs63-10.gs63-10-ct1.dev.purestorage.com
ClientGUID: 012EDED1-49D5-D347-B4B4-92563CA1FB2E
Number of credits: 0,0,0 Dialect 0x311
TCP status: 5 (occasionally blinks to 4) Instance: 2740429 (keeps increasing)
Local Users To Server: 2 SecMode: 0x1 Req On Wire: 0 (blinks to -1 occasionally, immediately goes back to 0) Net namespace: 4026531840
In Send: 0 In MaxReq Wait: 69

        Sessions:
        1) Address: 10.14.63.102 Uses: 2 Capability: 0x300067   Session Status: 3 Guest
        Security type: RawNTLMSSP  SessionId: 0x2d2a9b180000086c
        User: 0 Cred User: 0    Primary channel: DISCONNECTED

        Shares:
        0) IPC: \\10.14.63.102\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
        PathComponentMax: 0 Status: 3 type: 0 Serial Number: 0x0
        Share Capabilities: None        Share Flags: 0x30
        tid: 0x3        Maximal Access: 0x1f01ff        DISCONNECTED

        1) \\10.14.63.102\fs1 Mounts: 2 (oscillates between 1 and 2) DevInfo: 0x50 Attributes: 0xc0008f
        PathComponentMax: 1040 Status: 3 type: DISK Serial Number: 0xccb9f39e
        Share Capabilities: None Aligned, Partition Aligned,    Share Flags: 0x30
        tid: 0x2        Optimal sector size: 0x1000     Maximal Access: 0x1f01ff        DISCONNECTED


        Server interfaces: 3    Last updated: 4628375 seconds ago
        1)	Speed: 10Gbps
                Capabilities: None
                IPv4: 192.168.8.24
                Weight (cur,total): (0,10)
                Allocated channels: 0

        2)	Speed: 10Gbps
                Capabilities: None
                IPv4: 10.14.63.117
                Weight (cur,total): (0,10)
                Allocated channels: 0

        3)	Speed: 1Gbps
                Capabilities: None
                IPv4: 10.14.63.102
                Weight (cur,total): (0,1)
                Allocated channels: 0


        MIDs: -- this seems interesting/important. There is this "long-standing" one, 00000000aec8f49d, that is there most of the time when doing watch -n 0.2 cat /proc/fs/cifs/DebugData; it sometimes oscillates to a different one (a different random "cbdata" each time it changes), and then switches back to this same one (00000000aec8f49d).
	Server ConnectionId: 0x1
                State: 2 com: 0 pid: 110433 cbdata: 00000000aec8f49d mid 0

Example oscillation (the aec cbdata disappears at that time):
	Server ConnectionId: 0x1
                State: 2 com: 0 pid: 6040 cbdata: 000000009d66f66f mid 0
--
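The "roughly 5 reconnects per second" figure can be reproduced by sampling DebugData and diffing the Instance counter; a small helper we used for that looks like this (illustrative sketch; the regex just matches the field layout in the dump above):

```python
import re
import time

def parse_instance(debug_data):
    """Extract the Instance counter from /proc/fs/cifs/DebugData output."""
    match = re.search(r"Instance:\s*(\d+)", debug_data)
    if match is None:
        raise ValueError("Instance field not found")
    return int(match.group(1))

def reconnects_per_second(path="/proc/fs/cifs/DebugData", interval=5.0):
    # Read the counter twice, `interval` seconds apart, and diff.
    with open(path) as fh:
        before = parse_instance(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        after = parse_instance(fh.read())
    return (after - before) / interval
```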

Witness registrations:

Stacktrace of 6040 (not at the time of blink), other processes are in similar paths:

[Tue Jan  9 05:56:04 2024] task:kworker/10:138  state:D stack:0     pid:6040  ppid:2      flags:0x00004000
[Tue Jan  9 05:56:04 2024] Workqueue: cifsiod cifs_uncached_readv_complete [cifs]
[Tue Jan  9 05:56:04 2024] Call Trace:
[Tue Jan  9 05:56:04 2024]  <TASK>
[Tue Jan  9 05:56:04 2024]  __schedule+0x21c/0x680
[Tue Jan  9 05:56:04 2024]  schedule+0x5d/0xd0
[Tue Jan  9 05:56:04 2024]  smb2_wait_mtu_credits+0xde/0x2d0 [cifs]
[Tue Jan  9 05:56:04 2024]  ? __pfx_autoremove_wake_function+0x10/0x10
[Tue Jan  9 05:56:04 2024]  cifs_resend_rdata.constprop.0+0x81/0x270 [cifs]
[Tue Jan  9 05:56:04 2024]  collect_uncached_read_data+0x2d5/0x300 [cifs]
[Tue Jan  9 05:56:04 2024]  cifs_uncached_readv_complete+0x20/0x60 [cifs]
[Tue Jan  9 05:56:04 2024]  process_one_work+0x174/0x340
[Tue Jan  9 05:56:04 2024]  worker_thread+0x27e/0x390
[Tue Jan  9 05:56:04 2024]  ? __pfx_worker_thread+0x10/0x10
[Tue Jan  9 05:56:04 2024]  kthread+0xee/0x120
[Tue Jan  9 05:56:04 2024]  ? __pfx_kthread+0x10/0x10
[Tue Jan  9 05:56:04 2024]  ret_from_fork+0x2d/0x50
[Tue Jan  9 05:56:04 2024]  ? __pfx_kthread+0x10/0x10
[Tue Jan  9 05:56:04 2024]  ret_from_fork_asm+0x1b/0x30
[Tue Jan  9 05:56:04 2024]  </TASK>


Some other process:
[Tue Jan  9 05:56:04 2024] task:kworker/8:68    state:D stack:0     pid:8673  ppid:2      flags:0x00004000
[Tue Jan  9 05:56:04 2024] Workqueue: cifsiod cifs_uncached_readv_complete [cifs]
[Tue Jan  9 05:56:04 2024] Call Trace:
[Tue Jan  9 05:56:04 2024]  <TASK>
[Tue Jan  9 05:56:04 2024]  __schedule+0x21c/0x680
[Tue Jan  9 05:56:04 2024]  schedule+0x5d/0xd0
[Tue Jan  9 05:56:04 2024]  schedule_timeout+0x88/0x160
[Tue Jan  9 05:56:04 2024]  ? __pfx_process_timeout+0x10/0x10
[Tue Jan  9 05:56:04 2024]  wait_for_free_credits+0x2f1/0x840 [cifs]
[Tue Jan  9 05:56:04 2024]  ? __pfx_autoremove_wake_function+0x10/0x10
[Tue Jan  9 05:56:04 2024]  ? __pfx_smb2_readv_callback+0x10/0x10 [cifs]
[Tue Jan  9 05:56:04 2024]  cifs_call_async+0x260/0x320 [cifs]
[Tue Jan  9 05:56:04 2024]  ? __pfx_cifs_readv_receive+0x10/0x10 [cifs]
[Tue Jan  9 05:56:04 2024]  ? __pfx_smb3_handle_read_data+0x10/0x10 [cifs]
[Tue Jan  9 05:56:04 2024]  smb2_async_readv+0x1c5/0x360 [cifs]
[Tue Jan  9 05:56:04 2024]  cifs_resend_rdata.constprop.0+0x17b/0x270 [cifs]
[Tue Jan  9 05:56:04 2024]  collect_uncached_read_data+0x2d5/0x300 [cifs]
[Tue Jan  9 05:56:04 2024]  cifs_uncached_readv_complete+0x20/0x60 [cifs]
[Tue Jan  9 05:56:04 2024]  process_one_work+0x174/0x340
[Tue Jan  9 05:56:04 2024]  worker_thread+0x27e/0x390
[Tue Jan  9 05:56:04 2024]  ? __pfx_worker_thread+0x10/0x10
[Tue Jan  9 05:56:04 2024]  kthread+0xee/0x120
[Tue Jan  9 05:56:04 2024]  ? __pfx_kthread+0x10/0x10
[Tue Jan  9 05:56:04 2024]  ret_from_fork+0x2d/0x50
[Tue Jan  9 05:56:04 2024]  ? __pfx_kthread+0x10/0x10
[Tue Jan  9 05:56:04 2024]  ret_from_fork_asm+0x1b/0x30
[Tue Jan  9 05:56:04 2024]  </TASK>

Stacktrace of 110433:
[Tue Jan  9 05:56:04 2024] task:kworker/9:1     state:D stack:0     pid:110433 ppid:2      flags:0x00004000
[Tue Jan  9 05:56:04 2024] Workqueue: cifsiod smb2_reconnect_server [cifs]
[Tue Jan  9 05:56:04 2024] Call Trace:
[Tue Jan  9 05:56:04 2024]  <TASK>
[Tue Jan  9 05:56:04 2024]  __schedule+0x21c/0x680
[Tue Jan  9 05:56:04 2024]  schedule+0x5d/0xd0
[Tue Jan  9 05:56:04 2024]  wait_for_response+0x6b/0xb0 [cifs]
[Tue Jan  9 05:56:04 2024]  ? __pfx_autoremove_wake_function+0x10/0x10
[Tue Jan  9 05:56:04 2024]  compound_send_recv+0x415/0xa80 [cifs]
[Tue Jan  9 05:56:04 2024]  ? cifs_strtoUTF16+0xd2/0x130 [cifs]
[Tue Jan  9 05:56:04 2024]  ? preempt_count_add+0x70/0xa0
[Tue Jan  9 05:56:04 2024]  cifs_send_recv+0x1f/0x30 [cifs]
[Tue Jan  9 05:56:04 2024]  SMB2_negotiate+0x253/0xab0 [cifs]
[Tue Jan  9 05:56:04 2024]  ? __try_to_del_timer_sync+0x62/0xa0
[Tue Jan  9 05:56:04 2024]  ? smb2_negotiate+0x44/0x60 [cifs]
[Tue Jan  9 05:56:04 2024]  ? _raw_spin_unlock+0x15/0x30
[Tue Jan  9 05:56:04 2024]  smb2_negotiate+0x44/0x60 [cifs]
[Tue Jan  9 05:56:04 2024]  cifs_negotiate_protocol+0x93/0x110 [cifs]
[Tue Jan  9 05:56:04 2024]  smb2_reconnect.part.0+0x237/0x4a0 [cifs]
[Tue Jan  9 05:56:04 2024]  smb2_reconnect_server+0x22a/0x650 [cifs]
[Tue Jan  9 05:56:04 2024]  process_one_work+0x174/0x340
[Tue Jan  9 05:56:04 2024]  worker_thread+0x27e/0x390
[Tue Jan  9 05:56:04 2024]  ? __pfx_worker_thread+0x10/0x10
[Tue Jan  9 05:56:04 2024]  kthread+0xee/0x120
[Tue Jan  9 05:56:04 2024]  ? __pfx_kthread+0x10/0x10
[Tue Jan  9 05:56:04 2024]  ret_from_fork+0x2d/0x50
[Tue Jan  9 05:56:04 2024]  ? __pfx_kthread+0x10/0x10
[Tue Jan  9 05:56:04 2024]  ret_from_fork_asm+0x1b/0x30
Comment 1 Ondrej Hubsch 2024-01-09 14:30:13 UTC
The "long-standing" cbdata also seems to change. I don't know how often, but now I see this one there most of the time:

	Server ConnectionId: 0x1
                State: 2 com: 0 pid: 109094 cbdata: 000000008ea7071c mid 0

instead of 00000000aec8f49d.