5720 – Concurrent mount/umount processes to same windows machine, different shares hangs umount processes or crashes kernel

Status:	RESOLVED FIXED

Alias:	None

Product:	CifsVFS
Classification:	Unclassified
Component:	kernel fs
Version:	2.6
Hardware:	x86 Linux

Importance:	P3 normal
Target Milestone:	---
Assignee:	Steve French
QA Contact:

URL:
Keywords:

Depends on:
Blocks:

Reported:	2008-08-26 05:05 UTC by Esben Jannik Bjerrum
Modified:	2008-11-23 07:37 UTC
CC List:	3 users

See Also:

Attachments
Kernel Panic messages (14.74 KB, image/png) 2008-08-27 01:59 UTC, Esben Jannik Bjerrum	no flags
sysrq-t info from mount/umount hang (113.23 KB, text/plain) 2008-08-27 05:41 UTC, Jeff Layton	no flags
patch -- eliminate usage of kthread_stop to bring down cifsd (6.14 KB, patch) 2008-09-06 11:15 UTC, Jeff Layton	no flags
patch -- eliminate races between simultaneous mount and unmount (30.74 KB, patch) 2008-10-14 15:17 UTC, Jeff Layton	no flags

Description Esben Jannik Bjerrum 2008-08-26 05:05:06 UTC

Running scripts that mimic the behaviour of our server (but sped up) results in either hanging umount processes or a kernel crash.

Setup:
1 windows machine, 2 shares (SAMBARAPE3 and SAMBARAPE4).
1 Linux machine mounting cifs shares.

Running the scripts:

while true ; do
mount //win2000machine/SAMBARAPE3 /mnt/SAMBA3/ -o username=XXX,password=YYY && echo 2
ls /mnt/SAMBA3
umount -f /mnt/SAMBA3 && echo -2
ls /mnt/SAMBA3
done

and 

while true ; do
mount //win2000machine/SAMBARAPE4 /mnt/SAMBA4/ -o username=XXX,password=YYY && echo 2
ls /mnt/SAMBA4
umount -f /mnt/SAMBA4 && echo -2
ls /mnt/SAMBA4
done

and

while true ; do
mount //win2000machine/SAMBARAPE3 /mnt/SAMBA3/ -o username=XXX,password=YYY && echo 1
ls /mnt/SAMBA3
umount -f /mnt/SAMBA3 && echo -1
ls /mnt/SAMBA3
done

Running these concurrently either crashes the kernel or results in hanging umount processes.

In the latter case, unloading the kernel module also becomes unreliable. The echo statements and ls commands can be omitted.
Playing around with options, -i and -f work better (it takes longer to crash), but they do not prevent the problem.

This is extreme usage of mount/umount processes, and we have minimised the problem by reducing the risk of these happening at the same time. It may be of interest for the SAMBA team (and maybe Kernel devs) to investigate anyway.

It happens on RHEL 5 and Ubuntu Hardy Heron, and probably on other distros.

Best Regards
Esben Bjerrum

Comment 1 Jeff Layton 2008-08-26 12:16:28 UTC

Exactly what RHEL5 kernel were you using when you saw this message?

Do you happen to have the oops message from the crash?

Comment 2 Steve French 2008-08-26 12:33:00 UTC

Slightly different fix - but close

diff --git a/source/client/mount.cifs.c b/source/client/mount.cifs.c
index dd878aa..9d2b449 100644
--- a/source/client/mount.cifs.c
+++ b/source/client/mount.cifs.c
@@ -196,7 +196,7 @@ static int open_cred_file(char * file_name)
        line_buf = (char *)malloc(4096);
        if(line_buf == NULL) {
                fclose(fs);
-               return -ENOMEM;
+               return ENOMEM;
        }
 
        while(fgets(line_buf,4096,fs)) {
@@ -537,7 +537,8 @@ static int parse_options(char ** optionsp, int * filesys_flags)
                        if (value && *value) {
                                rc = open_cred_file(value);
                                if(rc) {
-                                       printf("error %d opening credential file %s\n",rc, value);
+                                       printf("error %d (%s) opening credential file %s\n",
+                                               rc, strerror(rc), value);
                                        return 1;
                                }
                        } else {

Comment 3 Steve French 2008-08-26 12:34:02 UTC

Ignore previous post - made to wrong bug

Comment 4 Esben Jannik Bjerrum 2008-08-27 01:48:00 UTC

(In reply to comment #1)
> Exactly what RHEL5 kernel were you using when you saw this message?
> Do you happen to have the oops message from the crash?

2.6.18-92.1.1.el5
I also tried
2.6.22.14-72 from fc6
and
2.6.24-19-generic on Ubuntu Hardy

All show the problem

No oops that I'm aware of. Does it get printed to a specific virtual console?

Comment 5 Esben Jannik Bjerrum 2008-08-27 01:59:51 UTC

Created attachment 3508 [details]
Kernel Panic messages

Kernel panic message (no mount options)

Comment 6 Jeff Layton 2008-08-27 05:41:50 UTC

Created attachment 3510 [details]
sysrq-t info from mount/umount hang

I've been able to reproduce the hang (at least). This dmesg output contains some sysrq-t info. Looks like the mount and umount processes are stuck in a kthread_stop, but it's not clear to me why cifsd hasn't come down.

Comment 7 Jeff Layton 2008-09-06 11:15:45 UTC

Created attachment 3531 [details]
patch -- eliminate usage of kthread_stop to bring down cifsd

Possible patch -- only lightly tested. Just posted to linux-cifs-client mailing list and awaiting comment.

Comment 8 Jeff Layton 2008-10-06 11:53:18 UTC

The patch that fixes the deadlock here exposes some very nasty races in how data structures are shared between mounts from the same server, as well as some races in the cifs_reconnect logic. There are many problems:

1) in cifs_mount, the code that shares server, session and tcon generally walks a list to get pointers to structures that match the needs of the mount. The problem is that the code that does this doesn't take a reference to these structures soon enough, so it's easily possible for them to be freed before we can take a reference to them. These oopses are pretty easy to reproduce by running the mount/unmount shell scripts in the problem description on a kernel with the patch in comment #7.

2) The cifs_reconnect logic and the handling of server->tcpStatus are racy. I have a patch that tries to reintroduce the sharing of sockets/cifsd's/server structs. It mostly worked, but I still see some failed mounts and got the oops below:

BUG: unable to handle kernel <3> CIFS VFS: No response for cmd 114 mid 2
 CIFS VFS: cifs_mount failed w/return code = -88
 CIFS VFS: Send error in SessSetup = -88 (tcpStatus = 1)
 CIFS VFS: cifs_mount failed w/return code = -88
NULL pointer dereference at 00000000000004a8
IP: [<ffffffff811579c9>] socket_has_perm+0xd/0x60
PGD 1786d067 PUD 17946067 PMD 1610c067 PTE 0
Oops: 0000 [1] SMP 
CPU 1 
Modules linked in: cifs nls_utf8 sco bridge stp bnep l2cap bluetooth autofs4 sunrpc ipv6 xfs dm_multipath floppy pcspkr 8139cp 8139too mii i2c_piix4 i2c_core pata_acpi ata_generic [last unloaded: cifs]
Pid: 15266, comm: cifsd Not tainted 2.6.27-0.391.rc8.git7.fc10.x86_64.debug #1
RIP: 0010:[<ffffffff811579c9>]  [<ffffffff811579c9>] socket_has_perm+0xd/0x60
RSP: 0018:ffff88000b421bd0  EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff880013d422d0
RBP: ffff88000b421c30 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000004 R11: ffff88000dd11198 R12: 0000000000000000
R13: 0000000000000004 R14: ffff88000b421ea0 R15: ffff88000b421ca0
FS:  0000000000000000(0000) GS:ffff88001f8047d0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000000004a8 CR3: 0000000016116000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process cifsd (pid: 15266, threadinfo ffff88000b420000, task ffff880013d422d0)
Stack:  ffff88000b421c10 ffffffff810674fe ffff88000b421c00 ffff880013d422d0
 ffff88001f950c68 0000000000000002 ffff880013d422d0 ffff88000b421c20
 ffffffff810c8fac ffffe2000043c8c0 5aff88000b421c70 0000000000000000
Call Trace:
 [<ffffffff810674fe>] ? mark_lock+0x22/0x3a2
 [<ffffffff810c8fac>] ? compound_order+0x1a/0x2b
 [<ffffffff81157ace>] selinux_socket_recvmsg+0x22/0x24
 [<ffffffff811536c5>] security_socket_recvmsg+0x16/0x18
 [<ffffffff812c8530>] __sock_recvmsg+0x54/0x7f
 [<ffffffff812c8c2d>] sock_recvmsg+0xcf/0xe8
 [<ffffffff810674fe>] ? mark_lock+0x22/0x3a2
 [<ffffffff8105a241>] ? autoremove_wake_function+0x0/0x3d
 [<ffffffff81066b21>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffff81017e89>] ? native_sched_clock+0x8e/0xa8
 [<ffffffff81017d26>] ? sched_clock+0x9/0xc
 [<ffffffff81066009>] ? lock_release_holdtime+0x2c/0x111
 [<ffffffff81067ad4>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff81036295>] ? __wake_up+0x43/0x4f
 [<ffffffff812c8c8a>] kernel_recvmsg+0x44/0x59
 [<ffffffffa02aa73a>] cifs_demultiplex_thread+0x1d0/0xc25 [cifs]
 [<ffffffffa02aa56a>] ? cifs_demultiplex_thread+0x0/0xc25 [cifs]
 [<ffffffff81059edb>] kthread+0x4e/0x7b
 [<ffffffff810128f9>] child_rip+0xa/0x11
 [<ffffffff81011c0e>] ? restore_args+0x0/0x30
 [<ffffffff81059e68>] ? kthreadd+0x17b/0x1a0
 [<ffffffff81059e8d>] ? kthread+0x0/0x7b
 [<ffffffff810128ef>] ? child_rip+0x0/0x11


Code: 45 e0 e8 7f 4c f7 ff eb 06 41 bc a4 ff ff ff 5b 44 89 e0 41 5c 5b 41 5c 41 5d 41 5e c9 c3 55 48 89 e5 48 83 ec 60 0f 1f 44 00 00 <4c> 8b 96 a8 04 00 00 31 c0 4c 8b 87 18 06 00 00 41 89 d1 41 83 
RIP  [<ffffffff811579c9>] socket_has_perm+0xd/0x60
 RSP <ffff88000b421bd0>
CR2: 00000000000004a8
---[ end trace 5a038d2cae9f6247 ]---


...the problem here, I think, is that we're trying to do a kernel_sendmsg while ipv4_connect is being rerun. We obviously need some sort of locking for that as well.

I plan to post a set of patches soon that disables the sharing of the data structures between mounts for now. It'll mean a bit more memory usage but should be a lot safer until we can reintroduce the concept in a race-free way.
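
To illustrate the first race above (item 1): below is a minimal user-space sketch, not the actual CIFS code, of looking up a shared structure and taking the reference while the list lock is still held, so that a concurrent unmount cannot free it in between. All names here (struct server_entry, server_get, server_put, list_lock) are hypothetical, and a pthread mutex stands in for whatever locking the kernel code would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a structure shared between mounts. */
struct server_entry {
    struct server_entry *next;
    char name[64];
    int refcount;               /* protected by list_lock */
};

static struct server_entry *server_list;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Find an entry matching 'name' and take a reference BEFORE the lock is
 * dropped. If none exists, create one with a refcount of 1.
 * Names longer than 63 bytes are truncated in this sketch.
 * Returns NULL only if allocation fails.
 */
struct server_entry *server_get(const char *name)
{
    struct server_entry *e;

    pthread_mutex_lock(&list_lock);
    for (e = server_list; e != NULL; e = e->next) {
        if (strcmp(e->name, name) == 0) {
            e->refcount++;      /* reference taken under the lock */
            pthread_mutex_unlock(&list_lock);
            return e;
        }
    }
    e = calloc(1, sizeof(*e));
    if (e != NULL) {
        strncpy(e->name, name, sizeof(e->name) - 1);
        e->refcount = 1;
        e->next = server_list;
        server_list = e;
    }
    pthread_mutex_unlock(&list_lock);
    return e;
}

/*
 * Drop a reference. The entry is unlinked under the same lock when the
 * count reaches zero, so no lookup can find it after that point.
 * Returns 1 if the entry was freed, 0 otherwise.
 */
int server_put(struct server_entry *e)
{
    struct server_entry **pp;
    int freed = 0;

    pthread_mutex_lock(&list_lock);
    assert(e->refcount > 0);
    if (--e->refcount == 0) {
        for (pp = &server_list; *pp != NULL; pp = &(*pp)->next) {
            if (*pp == e) {
                *pp = e->next;
                break;
            }
        }
        freed = 1;
    }
    pthread_mutex_unlock(&list_lock);
    if (freed)
        free(e);
    return freed;
}
```

The point of the sketch is only the ordering: the lookup and the refcount increment happen in one critical section, and the final put unlinks the entry in the same critical section that drops the count to zero.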

Comment 9 Jeff Layton 2008-10-08 14:10:16 UTC

I've been unable to reproduce the oops in comment #8. cifs_reconnect is only called from cifs_demultiplex_thread, so it shouldn't be possible to race in the way that I mentioned. I have to wonder if maybe the bug was something else. I'm planning to let the reproducer run overnight on my latest patchset to be sure.

If it survives until tomorrow, I'll chalk it up to something else.

Comment 10 Jeff Layton 2008-10-14 15:17:27 UTC

Created attachment 3676 [details]
patch -- eliminate races between simultaneous mount and unmount

A more comprehensive patchset that seems to fix all of the problems I've seen. With this, tcp sockets are still shared between mounts, but smb sessions and tree connects are not. This shouldn't be a problem as best I can tell.

It would be nice to reenable sharing of those structures, but I think we need to proceed deliberately and make sure that these races don't reoccur when we do so.

I've sent this to the linux-cifs-client mailing list with this subject:

[PATCH 0/4] cifs: fix deadlocks, oopses and mem corruption with concurrent mount/umount (try #2)

Comment 11 Jeff Layton 2008-11-02 18:52:10 UTC

I've posted another patchset to the mailing list that fixes this, but it's pretty large. Steve is attempting to roll a new patchset that fixes this while minimizing the changes.

Comment 12 Jeff Layton 2008-11-23 07:37:49 UTC

This should now be resolved in 2.6.28. I believe it's also being backported to stable series. Please reopen bug if that isn't the case...