Bug 4819 - 2.6.22.1: OOps during umount.
Status: RESOLVED FIXED
Product: CifsVFS
Classification: Unclassified
Component: kernel fs
Version: 2.6
Hardware: x86 Linux
Importance: P3 normal
Target Milestone: ---
Assigned To: Steve French
 
Reported: 2007-07-26 02:19 UTC by Christian Volkmann
Modified: 2009-05-14 17:37 UTC (History)



Attachments
config active during the OOps. (17.62 KB, application/octet-stream)
2007-07-26 02:22 UTC, Christian Volkmann
All my collected OOps, some of them with a tainted kernel. (101.24 KB, text/plain)
2007-07-26 04:24 UTC, Christian Volkmann

Description Christian Volkmann 2007-07-26 02:19:49 UTC
I got this oops when running "umount -a -t cifs". The umount is called by
/etc/init.d/smbfs (openSUSE 10.2 with kernel 2.6.22.1) during shutdown.

More details will follow below.

Jul 24 09:48:57 ocv kernel: BUG: unable to handle kernel NULL pointer dereference at virtual address 000000c8
Jul 24 09:48:57 ocv kernel:  printing eip:
Jul 24 09:48:57 ocv kernel: f94a3a45
Jul 24 09:48:57 ocv kernel: *pde = 00000000
Jul 24 09:48:57 ocv kernel: Oops: 0000 [#1]
Jul 24 09:48:57 ocv kernel: SMP
Jul 24 09:48:57 ocv kernel: Modules linked in: joydev st i915 drm nfsd exportfs lockd ipv6 nfs_acl cifs sunrpc button battery ac cbc blkcipher twofish twofish_common cryptoloop usbhid ff_memless ohci_hcd sr_mod nls_utf8 ext2 loop dm_mod fuse usb_storage tg3 ac97_bus ide_cd soundcore snd_page_alloc cdrom ehci_hcd uhci_hcd intel_agp usbcore agpgart shpchp pci_hotplug ext3 mbcache jbd edd fan sg ata_piix libata piix thermal processor sd_mod scsi_mod ide_disk ide_core
Jul 24 09:48:57 ocv kernel: CPU:    0
Jul 24 09:48:57 ocv kernel: EIP:    0060:[<f94a3a45>]    Not tainted VLI
Jul 24 09:48:57 ocv kernel: EFLAGS: 00010202   (2.6.22.1-cv #3)
Jul 24 09:48:57 ocv kernel: EIP is at smb_init+0x145/0x2a2 [cifs]
Jul 24 09:48:57 ocv kernel: eax: 00000034   ebx: dfdcdc00   ecx: dfe6ee00   edx: 0000000f
Jul 24 09:48:57 ocv kernel: esi: dfe6ee00   edi: f3043e54   ebp: 00000032   esp: f3043dfc
Jul 24 09:48:58 ocv kernel: ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Jul 24 09:48:58 ocv kernel: Process umount.cifs (pid: 1521, ti=f3042000 task=e8c2bab0 task.ti=f3042000)
Jul 24 09:48:58 ocv kernel: Stack: f70cac08 dfcc38c0 0000000f f70cac08 f75af68c c017526a f71c701b f7ce154c
Jul 24 09:48:58 ocv kernel:        dfdcdc00 f3043e58 f3043e54 dfe6ee00 f94a4258 f3043e58 f3043e54 00000000
Jul 24 09:48:58 ocv kernel:        00000246 51e38bf5 0000000b f3043ea4 00007d84 00000000 00000000 00000000
Jul 24 09:48:58 ocv kernel: Call Trace:
Jul 24 09:48:58 ocv kernel:  [<c017526a>] __link_path_walk+0xaf2/0xc1a
Jul 24 09:48:58 ocv kernel:  [<f94a4258>] SMBOldQFSInfo+0x5e/0x26c [cifs]
Jul 24 09:48:58 ocv kernel:  [<f94a3407>] cifs_statfs+0xc5/0xfd [cifs]
Jul 24 09:48:58 ocv kernel:  [<c02b452b>] do_page_fault+0x273/0x516
Jul 24 09:48:58 ocv kernel:  [<c016b75f>] vfs_statfs+0x47/0x5f
Jul 24 09:48:58 ocv kernel:  [<c016b851>] vfs_statfs64+0x10/0x21
Jul 24 09:48:58 ocv kernel:  [<c016c695>] sys_statfs64+0x49/0x80
Jul 24 09:48:58 ocv kernel:  [<c02b452b>] do_page_fault+0x273/0x516
Jul 24 09:48:58 ocv kernel:  [<c015f804>] do_munmap+0x193/0x1ac
Jul 24 09:48:58 ocv kernel:  [<c0104cb2>] sysenter_past_esp+0x5f/0x85
Jul 24 09:48:58 ocv kernel:  =======================
Jul 24 09:48:58 ocv kernel: Code: 75 23 bb 90 ff ff ff f6 05 00 74 4d f9 01 0f 84 6a 01 00 00 c7 04 24 e1 49 4c f9 e8 a0 05 c8 c6 e9 59 01 00 00 8b 46 24 8b 40 1c <83> b8 94 00 00 00 03 0f 84 28 ff ff ff e8 86 af d0 c6 89 c7 8b
Jul 24 09:48:58 ocv kernel: EIP: [<f94a3a45>] smb_init+0x145/0x2a2 [cifs] SS:ESP 0068:f3043dfc
Jul 24 09:48:58 ocv kernel: BUG: unable to handle kernel NULL pointer dereference at virtual address 000000c8
Jul 24 09:48:58 ocv kernel:  printing eip:
Jul 24 09:48:58 ocv kernel: f94a3a45
Jul 24 09:48:58 ocv kernel: *pde = 00000000
Jul 24 09:48:58 ocv kernel: Oops: 0000 [#2]
Jul 24 09:48:58 ocv kernel: SMP
Jul 24 09:48:58 ocv kernel: Modules linked in: joydev st i915 drm nfsd exportfs lockd ipv6 nfs_acl cifs sunrpc button battery ac cbc blkcipher twofish twofish_common cryptoloop usbhid ff_memless ohci_hcd sr_mod nls_utf8 ext2 loop dm_mod fuse usb_storage tg3 ac97_bus ide_cd soundcore snd_page_alloc cdrom ehci_hcd uhci_hcd intel_agp usbcore agpgart shpchp pci_hotplug ext3 mbcache jbd edd fan sg ata_piix libata piix thermal processor sd_mod scsi_mod ide_disk ide_core
Jul 24 09:48:58 ocv kernel: CPU:    0
Jul 24 09:48:58 ocv kernel: EIP:    0060:[<f94a3a45>]    Not tainted VLI
Jul 24 09:48:58 ocv kernel: EFLAGS: 00010202   (2.6.22.1-cv #3)
Jul 24 09:48:58 ocv kernel: EIP is at smb_init+0x145/0x2a2 [cifs]
Jul 24 09:48:58 ocv kernel: eax: 00000034   ebx: dfdcdc00   ecx: dfe6ee00   edx: 0000000f
Jul 24 09:48:58 ocv kernel: esi: dfe6ee00   edi: ea413e54   ebp: 00000032   esp: ea413dfc
Jul 24 09:48:58 ocv kernel: ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Jul 24 09:48:58 ocv kernel: Process umount.cifs (pid: 1523, ti=ea412000 task=c60ff570 task.ti=ea412000)
Jul 24 09:48:58 ocv kernel: Stack: f70cac08 dfcc38c0 0000000f f70cac08 f75af68c c017526a dc25801b c6108a28
Jul 24 09:48:58 ocv kernel:        dfdcdc00 ea413e58 ea413e54 dfe6ee00 f94a4258 ea413e58 ea413e54 00000000
Jul 24 09:48:58 ocv kernel:        c6108a28 51e38bf5 0000000b ea413ea4 00007d85 00000000 00000000 00000000
Jul 24 09:48:58 ocv kernel: Call Trace:
Jul 24 09:48:58 ocv kernel:  [<c017526a>] __link_path_walk+0xaf2/0xc1a
Jul 24 09:48:58 ocv kernel:  [<f94a4258>] SMBOldQFSInfo+0x5e/0x26c [cifs]
Jul 24 09:48:58 ocv kernel:  [<f94a3407>] cifs_statfs+0xc5/0xfd [cifs]
Jul 24 09:48:58 ocv kernel:  [<c02b452b>] do_page_fault+0x273/0x516
Jul 24 09:48:58 ocv kernel:  [<c016b75f>] vfs_statfs+0x47/0x5f
Jul 24 09:48:58 ocv kernel:  [<c016b851>] vfs_statfs64+0x10/0x21
Jul 24 09:48:58 ocv kernel:  [<c016c695>] sys_statfs64+0x49/0x80
Jul 24 09:48:58 ocv kernel:  [<f8e3c1b7>] ext3_release_file+0x0/0x5d [ext3]
Jul 24 09:48:58 ocv kernel:  [<f8e3c1f4>] ext3_release_file+0x3d/0x5d [ext3]
Jul 24 09:48:58 ocv kernel:  [<c02b452b>] do_page_fault+0x273/0x516
Jul 24 09:48:58 ocv kernel:  [<c015f804>] do_munmap+0x193/0x1ac
Jul 24 09:48:58 ocv kernel:  [<c0104cb2>] sysenter_past_esp+0x5f/0x85
Jul 24 09:48:58 ocv kernel:  =======================
Jul 24 09:48:58 ocv kernel: Code: 75 23 bb 90 ff ff ff f6 05 00 74 4d f9 01 0f 84 6a 01 00 00 c7 04 24 e1 49 4c f9 e8 a0 05 c8 c6 e9 59 01 00 00 8b 46 24 8b 40 1c <83> b8 94 00 00 00 03 0f 84 28 ff ff ff e8 86 af d0 c6 89 c7 8b
Jul 24 09:48:58 ocv kernel: EIP: [<f94a3a45>] smb_init+0x145/0x2a2 [cifs] SS:ESP 0068:ea413dfc
Jul 24 09:48:59 ocv kernel: Kernel logging (proc) stopped.
Jul 24 09:48:59 ocv kernel: Kernel log daemon terminating.
Jul 24 09:49:00 ocv exiting on signal 15
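
The reported fault address can be cross-checked against the register dump and the "Code:" bytes of the first oops. The following is a minimal sketch, assuming standard 32-bit x86 encoding: the bytes starting at the marked byte, `83 b8 94 00 00 00 03`, decode as `cmp dword ptr [eax+0x94], 3`, so the address touched should be eax plus 0x94. The helper name is hypothetical and it decodes only this one instruction form; the input values are copied from the log above.

```python
# Values copied from the first oops above.
CODE_LINE = ("75 23 bb 90 ff ff ff f6 05 00 74 4d f9 01 0f 84 6a 01 00 00 "
             "c7 04 24 e1 49 4c f9 e8 a0 05 c8 c6 e9 59 01 00 00 8b 46 24 "
             "8b 40 1c <83> b8 94 00 00 00 03 0f 84 28 ff ff ff e8 86 af "
             "d0 c6 89 c7 8b")
EAX = 0x34             # "eax: 00000034" in the register dump
REPORTED_FAULT = 0xC8  # "virtual address 000000c8"


def decode_cmp_eax_disp32(code_line):
    """Decode the faulting instruction, assuming opcode 0x83 with
    ModRM 0xb8, i.e. `cmp dword ptr [eax+disp32], imm8`.

    Hypothetical helper: it handles only this one encoding.
    Returns (displacement, immediate).
    """
    tokens = code_line.split()
    # The oops marks the byte at EIP with angle brackets.
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    insn = [int(t.strip("<>"), 16) for t in tokens[start:start + 7]]
    opcode, modrm = insn[0], insn[1]
    if opcode != 0x83 or modrm != 0xB8:
        raise ValueError("not the expected cmp [eax+disp32], imm8 form")
    # The 32-bit displacement is stored little-endian.
    disp = int.from_bytes(bytes(insn[2:6]), "little")
    return disp, insn[6]


disp, imm = decode_cmp_eax_disp32(CODE_LINE)
fault = EAX + disp
print(f"cmp dword ptr [eax+{disp:#x}], {imm}")
print(f"eax + disp = {fault:#x}, reported = {REPORTED_FAULT:#x}")
```

The sum matches the reported address, which suggests that eax itself (0x34) was not a valid pointer when the comparison ran. The two instructions just before the marked byte (`8b 46 24` and `8b 40 1c`) load eax through a pointer chain starting at esi; which structure fields those offsets correspond to is not established by the log alone.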
Comment 1 Christian Volkmann 2007-07-26 02:22:50 UTC
Created attachment 2842 [details]
config active during the OOps.
Comment 2 Christian Volkmann 2007-07-26 04:17:29 UTC
Mounted shares:
(/mount/padwcl has been replaced by /mnt/ in this listing)

//server1a/departments     209712476 198344036  11368440  95% /mnt/1a/departments
//server1b/Temp            209712476 203021948   6690528  97% /mnt/1b/temp
//server1b/projectdoc_old  209712476 203021948   6690528  97% /mnt/1b/projectdoc_old
//server1a/home            157284348  95061124  62223224  61% /mnt/1a/home
//server1b/ProjectDoc      209712476 203021948   6690528  97% /mnt/1b/projectdoc
//server1b/Shared          209712476 203021948   6690528  97% /mnt/1b/shared
//server1a/systems         209712476 198344036  11368440  95% /mnt/1a/systems

The shares are served by Windows Server 2003.
Comment 3 Christian Volkmann 2007-07-26 04:24:31 UTC
Created attachment 2843 [details]
All my collected OOps, some of them with a tainted kernel.
Comment 4 Christian Volkmann 2007-07-26 04:31:16 UTC
I changed a few options in .config and recompiled the kernel, hoping to get
some more information.
Unfortunately I forgot to save the old kernel and modules... :-(

Now I no longer get any oops, neither with the old nor with the changed
configuration. I will try again after a "make clean" and a fresh build.

Hmm, could this have been a transient miscompilation by gcc?

gcc -v
Using built-in specs.
Target: i586-suse-linux
Configured with: ../configure --enable-threads=posix --prefix=/usr --with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib --libexecdir=/usr/lib --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.1.2 --enable-ssp --disable-libssp --disable-libgcj --with-slibdir=/lib --with-system-zlib --enable-shared --enable-__cxa_atexit --enable-libstdcxx-allocator=new --program-suffix=-4.1 --enable-version-specific-runtime-libs --without-system-libunwind --with-cpu=generic --host=i586-suse-linux
Thread model: posix
gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)

59c59
< CONFIG_CC_OPTIMIZE_FOR_SIZE=y
---
> # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
65c65
< # CONFIG_KALLSYMS_ALL is not set
---
> CONFIG_KALLSYMS_ALL=y

After recompiling, the error no longer occurred.
Comment 5 Christian Volkmann 2007-07-26 14:24:00 UTC
Just an idea: both threads seem to fault at the same place
(smb_init+0x145). Maybe it is an SMP race condition that happens
with multiple mounts of the same server?
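
The observation that both oopses hit the same place can be checked mechanically. The sketch below pulls the oops number, faulting symbol, process name and pid out of the relevant log lines from the description (timestamps and host name trimmed). The function name and the regular expressions are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Excerpt of the log lines from the description (prefixes trimmed).
LOG = """\
kernel: Oops: 0000 [#1]
kernel: EIP is at smb_init+0x145/0x2a2 [cifs]
kernel: Process umount.cifs (pid: 1521, ti=f3042000 task=e8c2bab0 task.ti=f3042000)
kernel: Oops: 0000 [#2]
kernel: EIP is at smb_init+0x145/0x2a2 [cifs]
kernel: Process umount.cifs (pid: 1523, ti=ea412000 task=c60ff570 task.ti=ea412000)
"""


def summarize_oopses(log_text):
    """Return one (oops number, faulting symbol, process, pid) tuple per oops."""
    oopses = []
    current = None
    for line in log_text.splitlines():
        m = re.search(r"Oops: \w+ \[#(\d+)\]", line)
        if m:
            # A new oops report starts here.
            current = {"n": int(m.group(1))}
            oopses.append(current)
            continue
        if current is None:
            continue
        m = re.search(r"EIP is at (\S+)", line)
        if m:
            current["eip"] = m.group(1)
        m = re.search(r"Process (\S+) \(pid: (\d+)", line)
        if m:
            current["proc"] = m.group(1)
            current["pid"] = int(m.group(2))
    return [(o["n"], o.get("eip"), o.get("proc"), o.get("pid")) for o in oopses]


summary = summarize_oopses(LOG)
for row in summary:
    print(row)
same_site = len({eip for _, eip, _, _ in summary}) == 1
print("same faulting site:", same_site)
```

Two different umount.cifs processes (pids 1521 and 1523) fault at the identical offset within about a second of each other. That is consistent with the race idea but does not prove it, since both reports show CPU 0 and a shared structure left in a bad state would look the same.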
Comment 6 Steve French 2007-07-26 14:50:50 UTC
A few things are obvious from the backtrace:

"umount -a" is querying file system info (statfs), presumably to check if the filesystem type is cifs (presumably umount without -a would do that).  The server that you are mounted to is older (maybe Windows9x or WindowsME? or an older NAS) since it is not calling SMBQFSInfo or SMBUnixQFSInfo. Such an old server may be necessary to reproduce the problem.
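
For readers unfamiliar with the statfs path in the trace (sys_statfs64, vfs_statfs, cifs_statfs), the sketch below shows a userspace filesystem-statistics query of the kind that df performs. It uses Python's `os.statvfs`; the assumption here is that on Linux this ends up in the kernel's statfs handling for the filesystem that holds the path. The path "/" is a placeholder (on the reporter's machine it would be one of the CIFS mount points), and the function name is hypothetical. It illustrates the call only and is not a reproducer.

```python
import os


def filesystem_usage(path):
    """Query filesystem statistics for `path`, roughly as `df` does.

    Returns sizes in KiB. Hypothetical helper for illustration.
    """
    st = os.statvfs(path)
    block_kb = st.f_frsize / 1024
    total = int(st.f_blocks * block_kb)
    used = int((st.f_blocks - st.f_bfree) * block_kb)
    avail = int(st.f_bavail * block_kb)
    return {"total_kb": total, "used_kb": used, "avail_kb": avail}


# "/" is a placeholder; a CIFS mount point would exercise cifs_statfs.
usage = filesystem_usage("/")
print(usage)
```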
Comment 7 Christian Volkmann 2007-07-26 15:09:19 UTC
Hmm, smbclient -L server1a -U myuser -W XXX
Password:
Domain=[PAD] OS=[Windows Server 2003 3790 Service Pack 1] Server=[Windows Server 2003 5.2]
...


Our IT support told me it is Windows 2003, and smbclient says the same.
Both servers report the same version.
Comment 8 Christian Volkmann 2007-07-26 16:27:19 UTC
(In reply to comment #6)
> A few things obvious from the back trace
> 
> "umount -a" is querying file system info (statfs), presumably to check if there
> filesystem type is cifs (presumably umount without -a would do that).  The
> server that you are mounted to is older (maybe Windows9x or WindowsME? or an
> older NAS) since it is not calling SMBQFSInfo or SMBUnixQFSInfo - that (old
> server) may be necessary for the recreation.
> 

Did you analyze one of the tainted oopses? An old vmware-smb server might
have been up during that oops. I hope I did not confuse you with comment #7.
Comment 9 Steve French 2009-05-14 17:37:08 UTC
With the cifs mount rewrite that went in last year (Jeff Layton's patch series), we think we have resolved all outstanding umount problems, but please reopen if it still fails on 2.6.28 or later.