4321 – FC5 spinlock crash in smbd

Summary: FC5 spinlock crash in smbd

Status:	RESOLVED INVALID

Alias:	None

Product:	Samba 3.0
Classification:	Unclassified
Component:	File Services
Version:	3.0.23d
Hardware:	x86 Linux

Importance:	P3 normal
Target Milestone:	none
Assignee:	Samba Bugzilla Account
QA Contact:	Samba QA Contact

URL:
Keywords:

Depends on:
Blocks:

Reported:	2007-01-03 11:13 UTC by Wolfgang Breyha
Modified:	2007-01-03 11:23 UTC
CC List:	0 users

See Also:

Description Wolfgang Breyha 2007-01-03 11:13:44 UTC

Hi!

We have had a Samba host running on FC5 for a while now. Since late December, however, it has been crashing, ever since a new (fast, Gbit-connected) W2k3 server started doing its backups to it.

The host has crashed about 5 times now, each time after about 2 hours of backup. I only connected and configured the serial console today, so I only have a crash dump from today so far.

First, the setup. It's a P4 2.66GHz on an ASUS P5LD2-VM DH with 1GB RAM. Samba provides shares on a large RAID-5 array built on the Intel onboard SATA controller (AHCI mode) and an additional Promise TX4 SATA controller.

Here is the lspci output:
00:00.0 Host bridge: Intel Corporation 945G/P Memory Controller Hub (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 945G Integrated Graphics Controller (rev 02)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01)
00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GH (ICH7DH) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) Serial ATA Storage Controllers cc=AHCI (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:04.0 Mass storage controller: Integrated Technology Express, Inc. ITE 8211F Single Channel UDMA 133 (ASUS 8211 (ITE IT8212 ATA RAID Controller)) (rev 11)
01:0a.0 Mass storage controller: Promise Technology, Inc. PDC20718 (SATA 300 TX4) (rev 02)
02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller


Kernels in use when the crashes occurred:
2.6.18-1.2200.fc5
2.6.18-1.2257.fc5
additional kernel params:
selinux=0

FC5 has been updated via yum to the current state as of today.

Samba versions in use when the crashes occurred:
3.0.23c-1.fc5 (fedora update rpms)
3.0.23d-1 (samba original rpms)

The crashdump:
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
*pde = 3448e067
Oops: 0000 [#1]
last sysfs file: /devices/platform/i2c-9191/9191-0290/temp3_max_hyst
Modules linked in: ipv6 autofs4 w83627ehf hwmon eeprom i2c_isa hidp l2cap bluetooth ip_conntrack_ftp ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables raid456 xor video sbs i2c_ec container button battery asus_acpi ac lp parport_pc parport ehci_hcd uhci_hcd floppy sg snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device e1000 snd_pcm_oss snd_mixer_oss ide_cd i2c_i801 snd_pcm i2c_core serio_raw pcspkr cdrom snd_timer snd soundcore snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod raid1 ext3 jbd ahci sata_promise libata sd_mod scsi_mod
CPU:    0
EIP:    0060:[<c04d278d>]    Not tainted VLI
EFLAGS: 00010246   (2.6.18-1.2257.fc5 #1)
EIP is at rb_erase+0xf6/0x22f
eax: 00000001   ebx: 00000000   ecx: 00000000   edx: f7be86c8
esi: f7be86c8   edi: f7be8448   ebp: c07946a0   esp: f7feff44
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 4, ti=f7fef000 task=c18e05a0 task.ti=f7fef000)
Stack: 00000001 f7be8440 f7be8448 f7e22740 00000282 c04a87cf c0669840 c0669844
       c0428a48 c18e06c4 00000246 f7e22740 c04a8788 00000000 f7e22760 f7e22740
       f7e22758 00000000 c0428f46 00000001 00000000 c18e15f0 00010000 00000000
Call Trace:
 [<c04a87cf>] key_cleanup+0x47/0xce
 [<c0428a48>] run_workqueue+0x85/0xc5
 [<c0428f46>] worker_thread+0xe8/0x11a
 [<c042b1a5>] kthread+0xad/0xd8
 [<c0403adf>] kernel_thread_helper+0x7/0x10
DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
Leftover inexact backtrace:
 =======================
Code: 05 89 5a 08 eb 08 89 5a 04 eb 03 89 5d 00 83 3c 24 01 0f 85 46 01 00 00 e9 12 01 00 00 8b 4e 08 39 d9 0f 85 85 00 00 00 8b 4e 04 <8b> 01 a8 01 75 14 83 c8 01 89 ea 89 01 89 f0 83 26 fe e8 3c fd
EIP: [<c04d278d>] rb_erase+0xf6/0x22f SS:ESP 0068:f7feff44
 <0>BUG: spinlock lockup on CPU#0, smbd/3687, c0669780 (Not tainted)
 [<c0403f28>] dump_trace+0x69/0x1af
 [<c0404086>] show_trace_log_lvl+0x18/0x2c
 [<c0404601>] show_trace+0xf/0x11
 [<c040468b>] dump_stack+0x15/0x17
 [<c04d53fe>] _raw_spin_lock+0xbf/0xdc
 [<c04a84fb>] key_alloc+0x1d2/0x32f
 [<c04a9301>] keyring_alloc+0x30/0x6a
 [<c04aa995>] alloc_uid_keyring+0x4c/0xb2
 [<c0423596>] alloc_uid+0x95/0x13b
 [<c04265d6>] set_user+0xb/0x8e
 [<c0427e63>] sys_setresuid+0x111/0x1dd
 [<c0402da7>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb
Leftover inexact backtrace:
 =======================
BUG: soft lockup detected on CPU#0!
 [<c0403f28>] dump_trace+0x69/0x1af
 [<c0404086>] show_trace_log_lvl+0x18/0x2c
 [<c0404601>] show_trace+0xf/0x11
 [<c040468b>] dump_stack+0x15/0x17
 [<c043f51c>] softlockup_tick+0x90/0xa1
 [<c042319f>] update_process_times+0x35/0x57
 [<c040631c>] timer_interrupt+0x58/0x90
 [<c043f79e>] handle_IRQ_event+0x23/0x49
 [<c043f846>] __do_IRQ+0x82/0xde
 [<c0405385>] do_IRQ+0x9a/0xb8

... the "soft lockup" continues to dump every few seconds....
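When comparing traces like the ones above, it can help to reduce them to bare symbol names. The following is only a minimal sketch, assuming the frame format shown in this dump (` [<address>] symbol+0xoffset/0xsize`). The names `FRAME_RE`, `parse_frames` and `SAMPLE` are made up for this illustration, and the sample frames are copied from the smbd trace above.

```python
import re

# Matches frames such as " [<c04a87cf>] key_cleanup+0x47/0xce".
# Assumption: every frame follows the format seen in this dump.
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frames(text):
    """Return (address, symbol, offset, size) for every frame found in text."""
    return [
        (int(addr, 16), sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME_RE.findall(text)
    ]


# Frames copied from the "spinlock lockup" trace of smbd/3687.
SAMPLE = """\
 [<c04d53fe>] _raw_spin_lock+0xbf/0xdc
 [<c04a84fb>] key_alloc+0x1d2/0x32f
 [<c04a9301>] keyring_alloc+0x30/0x6a
 [<c04aa995>] alloc_uid_keyring+0x4c/0xb2
 [<c0423596>] alloc_uid+0x95/0x13b
 [<c04265d6>] set_user+0xb/0x8e
 [<c0427e63>] sys_setresuid+0x111/0x1dd
"""

if __name__ == "__main__":
    for addr, sym, off, size in parse_frames(SAMPLE):
        print(f"{addr:08x}  {sym} (+{off:#x} of {size:#x})")
```

Run over the whole dump, a listing like this makes it easier to see that the oops in events/0 (`key_cleanup`) and the stuck smbd (`key_alloc`) are both inside the kernel's key handling functions.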

The share accessed by the W2k3 server is configured as follows:
[backup]
        comment = backups go here
        path = /data/backup
        valid users = user1, user2
        admin users = user1, user2
        read list = user1, user2
        write list = user1, user2
        force user = backup
        force group = backup
        read only = No
        directory mask = 06775
        force directory mode = 06770
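For readers less used to octal modes, here is a minimal sketch of how the two directory mode values above break down into permission bits. It only decodes the numbers with Python's standard `stat` module. It does not model how Samba combines `directory mask` and `force directory mode` with the mode requested by the client, and `describe_dir_mode` is a hypothetical helper name.

```python
import stat


def describe_dir_mode(mode):
    """Render a permission mode the way `ls -l` would show it for a directory."""
    return stat.filemode(stat.S_IFDIR | mode)


# Special bits that may appear in the leading octal digit.
SPECIAL_BITS = (
    (stat.S_ISUID, "setuid"),
    (stat.S_ISGID, "setgid"),
    (stat.S_ISVTX, "sticky"),
)

if __name__ == "__main__":
    # Values taken from the [backup] share definition above.
    for name, mode in (("directory mask", 0o6775),
                       ("force directory mode", 0o6770)):
        flags = [label for bit, label in SPECIAL_BITS if mode & bit]
        print(f"{name}: {describe_dir_mode(mode)} ({', '.join(flags)})")
```

Both values carry the setuid and setgid bits. They differ only in the permissions for "other" (`r-x` for 06775, none for 06770).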

If you need further details, such as the complete smb.conf, please let me know and I'll provide them ASAP.

Regards, Wolfgang Breyha

Comment 1 Volker Lendecke 2007-01-03 11:23:00 UTC

Sorry to be rude, but this can't be a Samba problem. If the kernel oopses, then it's a kernel problem.

Volker