Hi!

We've had a Samba host running on FC5 for a while now. But since late December it has been crashing, ever since a new (fast, Gbit-connected) W2k3 server started doing its backups. The host has crashed about 5 times now, each time after about 2 hours of backup. Since I only connected and configured the serial console today, I only have a crash dump from today so far.

First the setup. It's a P4 2.66GHz on an ASUS P5LD2-VM DH with 1GB RAM. Samba provides shares on a large RAID-5 array built on the Intel onboard SATA controller (AHCI mode) and an additional Promise TX4 SATA controller.

Here is the lspci output:
00:00.0 Host bridge: Intel Corporation 945G/P Memory Controller Hub (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 945G Integrated Graphics Controller (rev 02)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01)
00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GH (ICH7DH) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) Serial ATA Storage Controllers cc=AHCI (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:04.0 Mass storage controller: Integrated Technology Express, Inc. ITE 8211F Single Channel UDMA 133 (ASUS 8211 (ITE IT8212 ATA RAID Controller)) (rev 11)
01:0a.0 Mass storage controller: Promise Technology, Inc. PDC20718 (SATA 300 TX4) (rev 02)
02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller

kernels used while crashing:
2.6.18-1.2200.fc5
2.6.18-1.2257.fc5

additional kernel params:
selinux=0

FC5 is yum updated to the current state as of today.

samba used while crashing:
3.0.23c-1.fc5 (fedora update rpms)
3.0.23d-1 (samba original rpms)

The crashdump:
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
*pde = 3448e067
Oops: 0000 [#1]
last sysfs file: /devices/platform/i2c-9191/9191-0290/temp3_max_hyst
Modules linked in: ipv6 autofs4 w83627ehf hwmon eeprom i2c_isa hidp l2cap bluetooth ip_conntrack_ftp ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables raid456 xor video sbs i2c_ec container button battery asus_acpi ac lp parport_pc parport ehci_hcd uhci_hcd floppy sg snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device e1000 snd_pcm_oss snd_mixer_oss ide_cd i2c_i801 snd_pcm i2c_core serio_raw pcspkr cdrom snd_timer snd soundcore snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod raid1 ext3 jbd ahci sata_promise libata sd_mod scsi_mod
CPU: 0
EIP: 0060:[<c04d278d>] Not tainted VLI
EFLAGS: 00010246 (2.6.18-1.2257.fc5 #1)
EIP is at rb_erase+0xf6/0x22f
eax: 00000001 ebx: 00000000 ecx: 00000000 edx: f7be86c8
esi: f7be86c8 edi: f7be8448 ebp: c07946a0 esp: f7feff44
ds: 007b es: 007b ss: 0068
Process events/0 (pid: 4, ti=f7fef000 task=c18e05a0 task.ti=f7fef000)
Stack: 00000001 f7be8440 f7be8448 f7e22740 00000282 c04a87cf c0669840 c0669844
 c0428a48 c18e06c4 00000246 f7e22740 c04a8788 00000000 f7e22760 f7e22740
 f7e22758 00000000 c0428f46 00000001 00000000 c18e15f0 00010000 00000000
Call Trace:
 [<c04a87cf>] key_cleanup+0x47/0xce
 [<c0428a48>] run_workqueue+0x85/0xc5
 [<c0428f46>] worker_thread+0xe8/0x11a
 [<c042b1a5>] kthread+0xad/0xd8
 [<c0403adf>] kernel_thread_helper+0x7/0x10
DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
Leftover inexact backtrace:
 =======================
Code: 05 89 5a 08 eb 08 89 5a 04 eb 03 89 5d 00 83 3c 24 01 0f 85 46 01 00 00 e9 12 01 00 00 8b 4e 08 39 d9 0f 85 85 00 00 00 8b 4e 04 <8b> 01 a8 01 75 14 83 c8 01 89 ea 89 01 89 f0 83 26 fe e8 3c fd
EIP: [<c04d278d>] rb_erase+0xf6/0x22f SS:ESP 0068:f7feff44
 <0>BUG: spinlock lockup on CPU#0, smbd/3687, c0669780 (Not tainted)
 [<c0403f28>] dump_trace+0x69/0x1af
 [<c0404086>] show_trace_log_lvl+0x18/0x2c
 [<c0404601>] show_trace+0xf/0x11
 [<c040468b>] dump_stack+0x15/0x17
 [<c04d53fe>] _raw_spin_lock+0xbf/0xdc
 [<c04a84fb>] key_alloc+0x1d2/0x32f
 [<c04a9301>] keyring_alloc+0x30/0x6a
 [<c04aa995>] alloc_uid_keyring+0x4c/0xb2
 [<c0423596>] alloc_uid+0x95/0x13b
 [<c04265d6>] set_user+0xb/0x8e
 [<c0427e63>] sys_setresuid+0x111/0x1dd
 [<c0402da7>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb
Leftover inexact backtrace:
 =======================
BUG: soft lockup detected on CPU#0!
 [<c0403f28>] dump_trace+0x69/0x1af
 [<c0404086>] show_trace_log_lvl+0x18/0x2c
 [<c0404601>] show_trace+0xf/0x11
 [<c040468b>] dump_stack+0x15/0x17
 [<c043f51c>] softlockup_tick+0x90/0xa1
 [<c042319f>] update_process_times+0x35/0x57
 [<c040631c>] timer_interrupt+0x58/0x90
 [<c043f79e>] handle_IRQ_event+0x23/0x49
 [<c043f846>] __do_IRQ+0x82/0xde
 [<c0405385>] do_IRQ+0x9a/0xb8
...
The "soft lockup" message continues to be dumped every few seconds.

The share accessed by the W2k3 server is configured as:

[backup]
        comment = backups go here
        path = /data/backup
        valid users = user1, user2
        admin users = user1, user2
        read list = user1, user2
        write list = user1, user2
        force user = backup
        force group = backup
        read only = No
        directory mask = 06775
        force directory mode = 06770

If you need further details, like the complete smb.conf, please let me know. I'll provide them ASAP.

Regards,
Wolfgang Breyha
Sorry to be rude, but this can't be a Samba problem. If the kernel oopses, then it's a kernel problem.

Volker
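For what it's worth, both traces in the report pass through kernel key-management functions (key_cleanup in the oops, key_alloc in the spinlock lockup). Below is a minimal Python sketch, assuming the serial console output has been saved to a string, of pulling the function names out of such a trace. The frame_symbols helper and the regular expression are illustrative only and are not part of any kernel or Samba tool.

```python
import re

# Matches one stack frame such as "[<c04a87cf>] key_cleanup+0x47/0xce".
# Group 1 is the address, group 2 is the symbol name.
FRAME = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frame_symbols(trace_text):
    """Return the function names of all frames, in order of appearance."""
    return [m.group(2) for m in FRAME.finditer(trace_text)]


# Excerpts copied from the crash dump in the report.
oops = """
Call Trace:
 [<c04a87cf>] key_cleanup+0x47/0xce
 [<c0428a48>] run_workqueue+0x85/0xc5
 [<c0428f46>] worker_thread+0xe8/0x11a
"""
lockup = """
 [<c04d53fe>] _raw_spin_lock+0xbf/0xdc
 [<c04a84fb>] key_alloc+0x1d2/0x32f
 [<c04a9301>] keyring_alloc+0x30/0x6a
"""

print(frame_symbols(oops))
print(frame_symbols(lockup))
```

Every name it lists is a kernel symbol; no Samba code appears in either trace, only the smbd process making a setresuid system call.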