Bug 5989 - cifsd hangs in uninterruptible sleep
Status: RESOLVED FIXED
Alias: None
Product: CifsVFS
Classification: Unclassified
Component: kernel fs
Version: 2.6
Hardware: x86 Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Steve French
Reported: 2008-12-23 15:59 UTC by bugzilla.samba.org
Modified: 2009-02-18 23:28 UTC

Description bugzilla.samba.org 2008-12-23 15:59:53 UTC
After mounting and unmounting a remote cifs share on a Windows XP host, the Linux host keeps three cifs kernel threads in the process table:

$ ps aux
[...]
root     11351  0.0  0.0      0     0 ?        S<   22:30   0:00 [cifsoplockd]
root     11352  0.0  0.0      0     0 ?        S<   22:30   0:00 [cifsdnotifyd]
root     11357  0.7  0.0      0     0 ?        D<   22:30   0:00 [cifsd]
[...]
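
For anyone trying to confirm the same symptom, the check above can be scripted. The following is a minimal sketch, assuming the standard `ps aux` column layout shown above; the function name `uninterruptible` is made up for illustration and the sample text is copied from the listing in this report.

```python
def uninterruptible(ps_aux_output):
    """Return (pid, command) pairs for processes whose STAT starts with 'D'."""
    hits = []
    for line in ps_aux_output.splitlines():
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header, "[...]" markers, and blank lines
        if fields[7].startswith("D"):
            hits.append((int(fields[1]), fields[10].strip()))
    return hits


# Sample input: the three cifs lines from the ps listing in this report.
sample = """\
root     11351  0.0  0.0      0     0 ?        S<   22:30   0:00 [cifsoplockd]
root     11352  0.0  0.0      0     0 ?        S<   22:30   0:00 [cifsdnotifyd]
root     11357  0.7  0.0      0     0 ?        D<   22:30   0:00 [cifsd]
"""
print(uninterruptible(sample))
```

On a live system the same function could be fed the output of `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`. A thread that shows up in state D long after the last unmount would match what is described here.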

cifsd is in an uninterruptible sleep waiting for I/O (but what I/O?), without any open files:

$ lsof -p 11357
COMMAND   PID USER   FD      TYPE DEVICE SIZE/OFF NODE NAME
cifsd   11357 root  cwd       DIR   22,3      536    2 /
cifsd   11357 root  rtd       DIR   22,3      536    2 /
cifsd   11357 root  txt   unknown                      /proc/11357/exe

cifsd is not interruptible by any signals, of course. An attempt to kill it by

$ modprobe -r cifs

results in removing the kernel module (!) as well as the sleeping processes cifsoplockd and cifsdnotifyd, but puts this error message into the syslog:

Dec 23 22:36:04 lex kernel: slab error in kmem_cache_destroy(): cache `cifs_request': Can't free all
Dec 23 22:36:04 lex kernel: Pid: 11436, comm: modprobe Tainted: G          2.6.27.7-9-default #1
Dec 23 22:36:04 lex kernel:  [<c0106570>] dump_trace+0x6b/0x249
Dec 23 22:36:04 lex kernel:  [<c01070a5>] show_trace+0x20/0x39
Dec 23 22:36:04 lex kernel:  [<c0343c02>] dump_stack+0x71/0x76
Dec 23 22:36:04 lex kernel:  [<c018c376>] kmem_cache_destroy+0x88/0xd6
Dec 23 22:36:04 lex kernel:  [<d10dc84e>] cifs_destroy_request_bufs+0x14/0x28 [cifs]
Dec 23 22:36:04 lex kernel:  [<d10fc79c>] exit_cifs+0x3c/0xc8 [cifs]
Dec 23 22:36:04 lex kernel:  [<c014cbf9>] sys_delete_module+0x1ce/0x228
Dec 23 22:36:04 lex kernel:  [<c0104d92>] syscall_call+0x7/0xb
Dec 23 22:36:04 lex kernel:  [<b7ec7d24>] 0xb7ec7d24
Dec 23 22:36:04 lex kernel:  =======================
Dec 23 22:36:04 lex kernel: slab error in kmem_cache_destroy(): cache `cifs_small_rq': Can't free all
Dec 23 22:36:04 lex kernel: Pid: 11436, comm: modprobe Tainted: G          2.6.27.7-9-default #1
Dec 23 22:36:04 lex kernel:  [<c0106570>] dump_trace+0x6b/0x249
Dec 23 22:36:04 lex kernel:  [<c01070a5>] show_trace+0x20/0x39
Dec 23 22:36:04 lex kernel:  [<c0343c02>] dump_stack+0x71/0x76
Dec 23 22:36:04 lex kernel:  [<c018c376>] kmem_cache_destroy+0x88/0xd6
Dec 23 22:36:04 lex kernel:  [<d10fc79c>] exit_cifs+0x3c/0xc8 [cifs]
Dec 23 22:36:04 lex kernel:  [<c014cbf9>] sys_delete_module+0x1ce/0x228
Dec 23 22:36:04 lex kernel:  [<c0104d92>] syscall_call+0x7/0xb
Dec 23 22:36:04 lex kernel:  [<b7ec7d24>] 0xb7ec7d24
Dec 23 22:36:04 lex kernel:  =======================

Re-inserting the module fails:

Dec 23 22:43:42 lex kernel: kmem_cache_create: duplicate cache cifs_request
Dec 23 22:43:42 lex kernel: Pid: 11472, comm: modprobe Tainted: G          2.6.27.7-9-default #1
Dec 23 22:43:42 lex kernel:  [<c0106570>] dump_trace+0x6b/0x249
Dec 23 22:43:42 lex kernel:  [<c01070a5>] show_trace+0x20/0x39
Dec 23 22:43:42 lex kernel:  [<c0343c02>] dump_stack+0x71/0x76
Dec 23 22:43:42 lex kernel:  [<c018c4a0>] kmem_cache_create+0xdc/0x3c4
Dec 23 22:43:42 lex kernel:  [<d10dc8d0>] cifs_init_request_bufs+0x5c/0x191 [cifs]
Dec 23 22:43:42 lex kernel:  [<d107c35d>] init_cifs+0x35d/0x368 [cifs]
Dec 23 22:43:42 lex kernel:  [<c010112b>] _stext+0x3b/0x127
Dec 23 22:43:42 lex kernel:  [<c014c8bf>] sys_init_module+0x8a/0x19e
Dec 23 22:43:42 lex kernel:  [<c0104d92>] syscall_call+0x7/0xb
Dec 23 22:43:42 lex kernel:  [<b7e6ff0e>] 0xb7e6ff0e
Dec 23 22:43:42 lex kernel:  =======================
Dec 23 22:43:42 lex modprobe: FATAL: Error inserting cifs (/lib/modules/2.6.27.7-9-default/kernel/fs/
Dec 23 22:43:50 lex kernel: kmem_cache_create: duplicate cache cifs_request
Dec 23 22:43:50 lex kernel: Pid: 11478, comm: modprobe Tainted: G          2.6.27.7-9-default #1
Dec 23 22:43:50 lex kernel:  [<c0106570>] dump_trace+0x6b/0x249
Dec 23 22:43:50 lex kernel:  [<c01070a5>] show_trace+0x20/0x39
Dec 23 22:43:50 lex kernel:  [<c0343c02>] dump_stack+0x71/0x76
Dec 23 22:43:50 lex kernel:  [<c018c4a0>] kmem_cache_create+0xdc/0x3c4
Dec 23 22:43:50 lex kernel:  [<d10dc8d0>] cifs_init_request_bufs+0x5c/0x191 [cifs]
Dec 23 22:43:50 lex kernel:  [<d107c35d>] init_cifs+0x35d/0x368 [cifs]
Dec 23 22:43:50 lex kernel:  [<c010112b>] _stext+0x3b/0x127
Dec 23 22:43:50 lex kernel:  [<c014c8bf>] sys_init_module+0x8a/0x19e
Dec 23 22:43:50 lex kernel:  [<c0104d92>] syscall_call+0x7/0xb
Dec 23 22:43:50 lex kernel:  [<b7f54f0e>] 0xb7f54f0e
Dec 23 22:43:50 lex kernel:  =======================
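
The "duplicate cache" messages suggest the cifs slab caches were still registered after the module was removed. A rough way to check for that is to look for leftover `cifs_` entries in `/proc/slabinfo`. The sketch below only parses text in the slabinfo 2.x layout; the function name `leftover_cifs_caches` is hypothetical, the sample numbers are invented for illustration and are not from the affected machine, and whether a stuck cache is still listed may depend on the slab allocator the kernel was built with.

```python
def leftover_cifs_caches(slabinfo_text, prefix="cifs_"):
    """Return {cache_name: active_objs} for caches whose name starts with prefix."""
    found = {}
    for line in slabinfo_text.splitlines():
        # Skip the version line and the column-header comment line.
        if line.startswith("#") or line.startswith("slabinfo"):
            continue
        fields = line.split()
        # First column is the cache name, second is the active object count.
        if len(fields) >= 2 and fields[0].startswith(prefix) and fields[1].isdigit():
            found[fields[0]] = int(fields[1])
    return found


# Hypothetical excerpt; the numbers are illustrative only.
sample = """\
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
cifs_request           1      4  16512    1    8
cifs_small_rq          2     30    448    9    1
ext3_inode_cache    5120   5120    512    8    1
"""
print(leftover_cifs_caches(sample))
```

On a real host the input would come from reading `/proc/slabinfo`, which usually requires root. Any `cifs_` cache that is still listed while the module is unloaded would be consistent with the `kmem_cache_destroy` errors above.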
Comment 1 bugzilla.samba.org 2008-12-23 16:01:11 UTC
Sorry, the first line should read: "After mounting _and_ unmounting ..."
Comment 2 bugzilla.samba.org 2009-01-03 06:51:20 UTC
I can also reproduce the bug with a remote cifs share on a Linux host (openSUSE 11.1, samba-3.2.6), so it apparently does not depend on the server software.

Perhaps introduced by this patch: <http://lists.samba.org/archive/linux-cifs-client/2008-May/002919.html> ? I did not notice any uninterruptibly sleeping cifs processes before.
Comment 3 bugzilla.samba.org 2009-01-04 14:30:46 UTC
One more test today (sorry for dropping information bit by bit)...

The cifsd kernel thread is removed from the process list as expected after unmounting all shares when I run a different kernel.

Buggy: openSUSE 11.1 / i386 / kernel-default-2.6.27.7-9.1 RPM
Works: plain vanilla 2.6.28 kernel

I have reported this bug on openSUSE bugzilla <https://bugzilla.novell.com/show_bug.cgi?id=463465> now.
Comment 4 Steve French 2009-02-18 23:28:44 UTC
A large set of changes was made soon after 2.6.27 (now in the 2.6.27 stable kernels and later kernels). These were changes to the cifs mount code to fix some refcounting issues, and they may have resolved this.