I'm using a 2.6.18 (Fedora 5) kernel, modified to use the 1.50c CIFS code available on http://us1.samba.org/samba/Linux_CIFS_client.html (since FC5 included 1.45 of CIFS). While unmounting a Windows share, I got the following:

May 28 14:01:03 asteroids kernel: ----------- [cut here ] --------- [please bite here ] ---------
May 28 14:01:03 asteroids kernel: Kernel BUG at include/linux/mm.h:304
May 28 14:01:03 asteroids kernel: invalid opcode: 0000 [1] SMP
May 28 14:01:03 asteroids kernel: last sysfs file: /class/net/eth0/address
May 28 14:01:03 asteroids kernel: CPU 1
May 28 14:01:03 asteroids kernel: Modules linked in: cifs(U) nls_utf8 ipv6 video sbs i2c_ec button battery asus_acpi ac lp parport_pc parport uhci_hcd ehci_hcd sg tg3 serio_raw i2c_i801 i2c_core ide_cd cdrom shpchp pcspkr dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd ata_piix libata sd_mod scsi_mod
May 28 14:01:03 asteroids kernel: Pid: 17599, comm: umount.cifs Tainted: PF 2.6.18-1.2258.fc5 #1
May 28 14:01:03 asteroids kernel: RIP: 0010:[<ffffffff8022de2e>] [<ffffffff8022de2e>] __free_pages+0x7/0x2b
May 28 14:01:03 asteroids kernel: RSP: 0018:ffff81001cef5df0 EFLAGS: 00010246
May 28 14:01:03 asteroids kernel: RAX: 0000000000000000 RBX: ffff810021bef7d0 RCX: 0000000000000003
May 28 14:01:03 asteroids kernel: RDX: 0000000000711d80 RSI: 0000000000000001 RDI: ffff810001712d80
May 28 14:01:03 asteroids kernel: RBP: ffff81006ac89dc0 R08: ffff81001cef4000 R09: ffff81007ffbf080
May 28 14:01:03 asteroids kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
May 28 14:01:03 asteroids kernel: R13: ffff810021bef7d0 R14: ffff81007d5bc8c0 R15: 0000000000000000
May 28 14:01:03 asteroids kernel: FS: 00002aaaaaab1200(0000) GS:ffff81007ffba9c0(0000) knlGS:0000000000000000
May 28 14:01:03 asteroids kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 28 14:01:03 asteroids kernel: CR2: 00002aaaae1b2000 CR3: 0000000019483000 CR4: 00000000000006e0
May 28 14:01:03 asteroids kernel: Process umount.cifs (pid: 17599, threadinfo ffff81001cef4000, task ffff81001e0337d0)
May 28 14:01:03 asteroids kernel: Stack: ffffffff8025a1ae ffff810021bef7d0 ffffffff802997fe 0000000000000000
May 28 14:01:03 asteroids kernel: ffffffff882a449d ffff81006ac89dc0 ffff810041351800 ffff8100413518a0
May 28 14:01:03 asteroids kernel: 0000000000000000 00007fffe0356b25 ffffffff882983be 0000000000000000
May 28 14:01:03 asteroids kernel: Call Trace:
May 28 14:01:03 asteroids kernel: [<ffffffff8025a1ae>] free_task+0x12/0x22
May 28 14:01:03 asteroids kernel: [<ffffffff802997fe>] kthread_stop+0x4c/0x79
May 28 14:01:03 asteroids kernel: [<ffffffff882a449d>] :cifs:cifs_umount+0x141/0x213
May 28 14:01:03 asteroids kernel: [<ffffffff882983be>] :cifs:cifs_put_super+0x51/0x86
May 28 14:01:03 asteroids kernel: [<ffffffff802d0481>] generic_shutdown_super+0x79/0xfd
May 28 14:01:03 asteroids kernel: [<ffffffff802d0548>] kill_anon_super+0x9/0x36
May 28 14:01:03 asteroids kernel: [<ffffffff802d05fc>] deactivate_super+0x6c/0x84
May 28 14:01:03 asteroids kernel: [<ffffffff802d9059>] sys_umount+0x246/0x28a
May 28 14:01:03 asteroids kernel: [<ffffffff8025c74e>] system_call+0x7e/0x83
May 28 14:01:03 asteroids kernel: DWARF2 unwinder stuck at system_call+0x7e/0x83
May 28 14:01:03 asteroids kernel: Leftover inexact backtrace:
May 28 14:01:03 asteroids kernel:
May 28 14:01:03 asteroids kernel:
May 28 14:01:03 asteroids kernel: Code: 0f 0b 68 bb c2 47 80 c2 30 01 f0 ff 4f 08 0f 94 c0 84 c0 74
May 28 14:01:03 asteroids kernel: RIP [<ffffffff8022de2e>] __free_pages+0x7/0x2b
May 28 14:01:03 asteroids kernel: RSP <ffff81001cef5df0>
May 28 14:01:03 asteroids kernel: BUG: warning at kernel/exit.c:852/do_exit() (Tainted: PF )
May 28 14:01:03 asteroids kernel:
May 28 14:01:03 asteroids kernel: Call Trace:
May 28 14:01:03 asteroids kernel: [<ffffffff802698ed>] show_trace+0x34/0x47
May 28 14:01:03 asteroids kernel: [<ffffffff80269912>] dump_stack+0x12/0x17
May 28 14:01:03 asteroids kernel: [<ffffffff80214e8a>] do_exit+0x58/0x927
May 28 14:01:03 asteroids kernel: [<ffffffff80269c05>] kernel_math_error+0x0/0x90
May 28 14:01:03 asteroids kernel: ----------------------------------------------------------------------

I noticed a similar problem in cifs_mount that was fixed in commit 28356a1679006b110215596e057f304ef3083922 (Fix oops on failed cifs mount, in kthread_stop). So I updated cifs_umount, and the problem seems to have gone away. I noticed that this is not in the latest version on git.kernel.org. Is there a better fix for this issue?

--- a/connect.c 2007-09-20 15:46:02.000000000 -0400
+++ b/connect.c 2008-05-28 17:16:33.000000000 -0400
@@ -3588,7 +3588,8 @@ cifs_umount(struct super_block *sb, stru
             cFYI(1, ("Waking up socket by sending signal"));
             if (cifsd_task) {
                 force_sig(SIGKILL, cifsd_task);
-                kthread_stop(cifsd_task);
+                if (ses->server->tsk)
+                    kthread_stop(ses->server->tsk);
             }
             rc = 0;
         } /* else - we have an smb session
I suspect that this is now fixed in current kernels since cifsd now waits for kthread_stop before exiting. Please let us know if you can reproduce this on something more recent.
Ahh, as to your question -- yes, there is a better fix in place. The patch you have there helps, but it's still a bit racy: it's possible to check the tsk var and then have the thread exit on another CPU before we call kthread_stop on it. With the fix that is in place now, cifsd goes to sleep until kthread_stop is called, so we don't need to check that the tsk var is non-NULL. I think we can probably close this case.
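For anyone reading this later, here is a rough userspace sketch of the "sleep until asked to stop" handshake described above. It is not the CIFS code: it uses pthreads as a stand-in for kthreads, and every name in it (struct worker, worker_main, worker_start, worker_stop, run_demo) is hypothetical. The analogy is also imperfect, since a joinable pthread stays valid until it is joined, whereas the concern in this bug is a kernel thread that exits on its own before kthread_stop reaches it. The sketch only shows the shape of the fix: the worker never exits by itself, so the stopper needs no "is it still there?" check.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel thread such as cifsd. */
struct worker {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool should_stop;   /* rough analogue of kthread_should_stop() */
    bool work_done;     /* set once the worker has finished its real work */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    pthread_mutex_lock(&w->lock);
    w->work_done = true;    /* the real work would happen here */

    /* Do not exit on our own: sleep until the stopper asks us to stop. */
    while (!w->should_stop)
        pthread_cond_wait(&w->cond, &w->lock);
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

static int worker_start(struct worker *w)
{
    assert(w != NULL);
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->should_stop = false;
    w->work_done = false;
    return pthread_create(&w->thread, NULL, worker_main, w);
}

/* Rough analogue of kthread_stop(): request the stop, wake the worker, wait. */
static int worker_stop(struct worker *w)
{
    int rc;

    assert(w != NULL);
    pthread_mutex_lock(&w->lock);
    w->should_stop = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);

    rc = pthread_join(w->thread, NULL);
    pthread_mutex_destroy(&w->lock);
    pthread_cond_destroy(&w->cond);
    return rc;
}

/* Starts one worker and stops it; returns 0 if the handshake completed. */
int run_demo(void)
{
    struct worker w;

    if (worker_start(&w) != 0)
        return -1;
    if (worker_stop(&w) != 0)
        return -1;
    return w.work_done ? 0 : 1;
}
```

Calling run_demo() from a main() and building with -pthread should be enough to try it. The point of the design is that the stopper no longer inspects a pointer that the worker might invalidate concurrently, which is the race in the "if (ses->server->tsk)" version of the patch.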
Closing as FIXED, please reopen if it isn't...