Bug 15123 - getxattr() on cifs sometimes hangs since kernel 5.14
Status: NEW
Alias: None
Product: CifsVFS
Classification: Unclassified
Component: kernel fs
Version: 5.x
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Steve French
QA Contact: cifs QA contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-07-15 21:17 UTC by Forest
Modified: 2022-07-15 21:23 UTC

See Also:


Attachments
possible reproducer (4.89 KB, text/x-csrc)
2022-07-15 21:17 UTC, Forest

Description Forest 2022-07-15 21:17:10 UTC
Created attachment 17423
possible reproducer

When running on recent kernel versions, this system call on a cifs-mounted
file sometimes takes an unusually long time:

getxattr("/cifsmount/dir/image.jpg", "user.baloo.rating", NULL, 0)

The call normally returns in under 10 milliseconds, but on kernel 5.14+, it
sometimes takes over 30 seconds with no significant client or server load.
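
For illustration, a single call like the one above could be timed with a small
C helper such as the following. This is a minimal sketch, not the attached
reproducer: the function name timed_getxattr() and its out-parameters are
hypothetical, chosen only for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/xattr.h>
#include <time.h>

/* Hypothetical helper: time one getxattr() size query.
 * Returns the elapsed wall-clock time in seconds; stores the call's
 * return value in *ret and the errno it left behind in *err. */
static double timed_getxattr(const char *path, const char *name,
                             ssize_t *ret, int *err)
{
    struct timespec t0, t1;

    assert(path && name && ret && err);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    errno = 0;
    *ret = getxattr(path, name, NULL, 0);  /* NULL, 0: ask only for the size */
    *err = errno;                          /* save before any other call */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling timed_getxattr("/cifsmount/dir/image.jpg", "user.baloo.rating", &ret,
&err) repeatedly and logging any result far above the usual 10 milliseconds
would be one way to spot the stalls described here.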

I discovered this while using Gwenview to browse 100+ images (1.5 MiB each) on
a Samba share mounted via /etc/fstab. When flipping quickly through the images,
the problem often occurs within 20 seconds. Gwenview freezes until the call
completes.

Client:
  kernel versions 5.14 and later
  mount.cifs 6.11
  Gwenview 20.12.3
  Debian Bullseye
  4-core amd64
Server:
  Samba 4.13.13-Debian
  Debian Bullseye
  6-core arm64 

A git bisect identified kernel commit 9e992755be8f as the problematic change.
The problem does not occur when any of the following are true:
- Client is running a kernel from before that commit.
- The nouser_xattr mount option is used on the cifs share.
- Gwenview accesses the files via smb:// URL instead of a cifs mount.

I don't know Gwenview's internals, but using its strace output as a guide, I
have written a potential reproducer. It does trigger slow getxattr() calls,
though not nearly as slow as those triggered by Gwenview. It is attached to
this report.
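
The general idea of issuing the same query from several threads at once can be
sketched in C as follows. This is an assumption-laden illustration and not the
attached program: the names struct probe, probe_thread(),
slowest_concurrent_getxattr() and MAX_THREADS are hypothetical, and the
one-thread-per-path layout is a guess rather than what the attachment does.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/xattr.h>
#include <time.h>

#define MAX_THREADS 16  /* arbitrary cap for this sketch */

/* Per-thread work item: input path, measured duration, resulting errno. */
struct probe {
    const char *path;
    double seconds;
    int err;
};

/* Thread body: time one getxattr() size query on the probe's path. */
static void *probe_thread(void *arg)
{
    struct probe *p = arg;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    errno = 0;
    (void)getxattr(p->path, "user.baloo.rating", NULL, 0);
    p->err = errno;  /* errno is per-thread */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    p->seconds = (double)(t1.tv_sec - t0.tv_sec)
               + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return NULL;
}

/* Query all n paths concurrently, one thread per path.
 * Returns the slowest call's duration in seconds, or -1.0 if n is out of
 * range or a thread could not be started. */
static double slowest_concurrent_getxattr(char *const *paths, int n)
{
    pthread_t tid[MAX_THREADS];
    struct probe probes[MAX_THREADS];
    double worst = 0.0;
    int started = 0;

    assert(paths != NULL);
    if (n < 1 || n > MAX_THREADS)
        return -1.0;

    for (int i = 0; i < n; i++) {
        probes[i].path = paths[i];
        probes[i].seconds = 0.0;
        probes[i].err = 0;
        if (pthread_create(&tid[i], NULL, probe_thread, &probes[i]) != 0)
            break;
        started++;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        if (probes[i].seconds > worst)
            worst = probes[i].seconds;
    }
    return started == n ? worst : -1.0;
}
```

A main() could pass argv + 1 and argc - 1 to slowest_concurrent_getxattr() and
print the result; on older glibc versions the program may need to be linked
with -pthread.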

Originally reported in May 2022 on the linux-cifs mailing list:
https://www.spinics.net/lists/linux-cifs/msg25316.html
Comment 1 Forest 2022-07-15 21:23:35 UTC
I can reliably reproduce the problem when booting from this live USB image:
kubuntu-22.04-desktop-amd64.iso

My setup steps when using that live USB environment:

sudo mount.cifs //myserver/dir ~/mnt/dir -o username=myuser,uid=999,gid=999
touch ~/mnt/dir/test{1..5}

Tests and results:
(My reproducer program is called xattrtest in these command lines.)

# 1 thread is fast
# (note the system call duration reported by strace in <> brackets)
$ time ./xattrtest ~/mnt/dir/test{1..5}
real    0m0.079s
user    0m0.003s
sys     0m0.000s
$ strace -Te getxattr ./xattrtest ~/mnt/dir/test1
getxattr("/home/myuser/mnt/dir/test1", "user.baloo.rating", NULL, 0) = -1 ENODATA (No data available) <0.008482>

# 2+ threads are slow
$ time ./xattrtest -t 2 ~/mnt/dir/test{1..5}
real    0m5.118s
user    0m0.005s
sys     0m0.000s
$ strace -Te getxattr ./xattrtest -t 2 ~/mnt/dir/test1
getxattr("/home/myuser/mnt/dir/test1", "user.baloo.rating", NULL, 0) = -1 ENODATA (No data available) <1.018507>

# 2+ threads are fast if I remount with the nouser_xattr option
$ time ./xattrtest -t 2 ~/mnt/dir/test{1..5}
real    0m0.061s
user    0m0.002s
sys     0m0.000s
$ strace -Te getxattr ./xattrtest -t 2 ~/mnt/dir/test1
getxattr("/home/myuser/mnt/dir/test1", "user.baloo.rating", NULL, 0) = -1 EOPNOTSUPP (Operation not supported) <0.000048>
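
The two error codes in the strace output above mean different things, which a
test program may want to tell apart. The following C fragment is a minimal
sketch with a hypothetical function name, explain_getxattr_errno(). The
wording of the messages is mine, not taken from the kernel or from the
attachment.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: describe the errno left by a failed getxattr().
 * Pass 0 for a call that succeeded. */
static const char *explain_getxattr_errno(int err)
{
    assert(err >= 0);  /* errno values are never negative */

    switch (err) {
    case 0:
        return "attribute present";
    case ENODATA:
        /* The file was reached, but the named attribute is not set. */
        return "attribute not set on this file";
    case EOPNOTSUPP:
        /* Seen above after remounting with nouser_xattr. */
        return "extended attributes unsupported or disabled";
    default:
        return strerror(err);
    }
}
```

In the runs above, the ENODATA results took between 8 milliseconds and 1
second, while the EOPNOTSUPP result came back in 48 microseconds. That would be
consistent with the request being rejected without a round trip to the server,
though the output alone does not confirm this.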