Bug 11861 - smbd soft lockup when starting matlab pool on client
Status: RESOLVED INVALID
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.4.2
Hardware: x64 Linux
Importance: P5 normal
Target Milestone: ---
Assigned To: Samba QA Contact
QA Contact: Samba QA Contact
Depends on:
Blocks:
Reported: 2016-04-20 19:45 UTC by Tom Burkholder
Modified: 2016-04-20 20:08 UTC (History)
CC List: 0 users

See Also:


Attachments
testparm -v and log level 3, slightly redacted (470.05 KB, text/plain)
2016-04-20 19:45 UTC, Tom Burkholder

Description Tom Burkholder 2016-04-20 19:45:59 UTC
Created attachment 12011
testparm -v and log level 3, slightly redacted

Samba 4.4.0 and 4.4.2, hosting an NT domain with an OpenLDAP backend
Samba host : Ubuntu 12.04.5
uname -a : Linux muscle 3.2.0-101-generic #141-Ubuntu SMP Thu Mar 10 21:43:24 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Client: Windows 7/x64

Steps to replicate:
1) Log in to the domain
2) Start MATLAB 2011b
3) Run 'matlabpool open local'

matlabpool launches 4 MATLAB processes on the Windows client, and I don't understand why this should affect the server. Each client process needs to access some config files on the server and crawls the PATH directories on the server.

The attached level 3 log covers a complete user session from log-on to crash. I've removed much of what seems to be directory crawling. The 'matlabpool open' event starts around 14:09:43.

This results in a kernel 'soft lockup', sometimes in the smbd process, sometimes in smbd-notifyd. Kernel messages:

Apr 20 14:11:59 muscle kernel: [ 1688.812058] BUG: soft lockup - CPU#8 stuck for 22s! [smbd-notifyd:3998]
Apr 20 14:11:59 muscle kernel: [ 1688.816001] Modules linked in: ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt xt_multiport pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) ipt_LOG iptable_nat xt_conntrack xt_limit xt_state xt_tcpudp iptable_filter ip_tables ipt_MASQUERADE nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 x_tables nf_conntrack parport_pc ppdev nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc nls_utf8 isofs hfs dm_crypt ipmi_devintf psmouse joydev serio_raw edac_core fam15h_power k10temp edac_mce_amd i2c_piix4 shpchp mac_hid ipmi_si ipmi_msghandler lp parport vesafb usbhid hid pata_atiixp igb dca
Apr 20 14:11:59 muscle kernel: [ 1688.816001] CPU 8
Apr 20 14:11:59 muscle kernel: [ 1688.816001] Modules linked in: ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt xt_multiport pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) ipt_LOG iptable_nat xt_conntrack xt_limit xt_state xt_tcpudp iptable_filter ip_tables ipt_MASQUERADE nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 x_tables nf_conntrack parport_pc ppdev nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc nls_utf8 isofs hfs dm_crypt ipmi_devintf psmouse joydev serio_raw edac_core fam15h_power k10temp edac_mce_amd i2c_piix4 shpchp mac_hid ipmi_si ipmi_msghandler lp parport vesafb usbhid hid pata_atiixp igb dca
Apr 20 14:11:59 muscle kernel: [ 1688.816001]
Apr 20 14:11:59 muscle kernel: [ 1688.816001] Pid: 3998, comm: smbd-notifyd Tainted: G           O 3.2.0-101-generic #141-Ubuntu Supermicro H8QG6/H8QG6
Apr 20 14:11:59 muscle kernel: [ 1688.816001] RIP: 0010:[<ffffffff8103eb82>]  [<ffffffff8103eb82>] __ticket_spin_lock+0x22/0x30
Apr 20 14:11:59 muscle kernel: [ 1688.816001] RSP: 0018:ffff88021a05de18  EFLAGS: 00000206
Apr 20 14:11:59 muscle kernel: [ 1688.816001] RAX: 00000000000003c7 RBX: ffffffff81669e5e RCX: 00000000389293b0
Apr 20 14:11:59 muscle kernel: [ 1688.816001] RDX: 00000000000003c2 RSI: ffff880409e10340 RDI: ffff880409e105f0
Apr 20 14:11:59 muscle kernel: [ 1688.816001] RBP: ffff88021a05de18 R08: 0000000000000004 R09: 0000000000000000
Apr 20 14:11:59 muscle kernel: [ 1688.816001] R10: ffff8804082bf500 R11: 0000000000000004 R12: ffff88040816f900
Apr 20 14:11:59 muscle kernel: [ 1688.816001] R13: ffff88040816f900 R14: ffff880215ca7500 R15: 0000000000000000
Apr 20 14:11:59 muscle kernel: [ 1688.816001] FS:  00007f5afad62740(0000) GS:ffff880427a00000(0000) knlGS:0000000000000000
Apr 20 14:11:59 muscle kernel: [ 1688.816001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 20 14:11:59 muscle kernel: [ 1688.816001] CR2: 00007f5afc56d148 CR3: 0000000216baf000 CR4: 00000000000406e0
Apr 20 14:11:59 muscle kernel: [ 1688.816001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 20 14:11:59 muscle kernel: [ 1688.816001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 20 14:11:59 muscle kernel: [ 1688.816001] Process smbd-notifyd (pid: 3998, threadinfo ffff88021a05c000, task ffff880217cc1700)
Apr 20 14:11:59 muscle kernel: [ 1688.816001] Stack:
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  ffff88021a05de28 ffffffff81669e5e ffff88021a05de58 ffffffff815dcbfc
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  ffff88040f9cc100 ffff880418f8e900 000000000000001f ffff88040f9cc100
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  ffff88021a05deb8 ffffffff815de704 ffff88021a05de84 ffffffff81c9fd00
Apr 20 14:11:59 muscle kernel: [ 1688.816001] Call Trace:
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff81669e5e>] _raw_spin_lock+0xe/0x20
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff815dcbfc>] unix_state_double_lock+0x2c/0x70
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff815de704>] unix_dgram_connect+0x94/0x220
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff81539d9b>] sys_connect+0xeb/0x110
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff8118d696>] ? sys_fcntl+0x76/0xa0
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff81672562>] system_call_fastpath+0x16/0x1b
Apr 20 14:11:59 muscle kernel: [ 1688.816001] Code: f5 fe ff ff 90 90 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 74 13 66 0f 1f 84 00 00 00 00 00 f3 90 <0f> b7 07 66 39 d0 75 f6 5d c3 0f 1f 40 00 8b 17 55 31 c0 48 89
Apr 20 14:11:59 muscle kernel: [ 1688.816001] Call Trace:
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff81669e5e>] _raw_spin_lock+0xe/0x20
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff815dcbfc>] unix_state_double_lock+0x2c/0x70
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff815de704>] unix_dgram_connect+0x94/0x220
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff815dcbfc>] unix_state_double_lock+0x2c/0x70
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff815de704>] unix_dgram_connect+0x94/0x220
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff81539d9b>] sys_connect+0xeb/0x110
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff8118d696>] ? sys_fcntl+0x76/0xa0
Apr 20 14:11:59 muscle kernel: [ 1688.816001]  [<ffffffff81672562>] system_call_fastpath+0x16/0x1b
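
Since the lockup is reported sometimes against smbd and sometimes against smbd-notifyd, it can help to tally which process each watchdog message names. The following is a minimal sketch, assuming the kernel log lines use the same "BUG: soft lockup" format as above; the function names soft_lockups and tally_by_process are hypothetical helpers, not part of Samba or any kernel tool.

```python
import re
from collections import Counter

# Matches the watchdog line format seen in the log above, e.g.
# "BUG: soft lockup - CPU#8 stuck for 22s! [smbd-notifyd:3998]"
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def soft_lockups(lines):
    """Yield (cpu, seconds, process name, pid) for each soft-lockup line."""
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            yield int(m["cpu"]), int(m["secs"]), m["comm"], int(m["pid"])


def tally_by_process(lines):
    """Count soft-lockup reports per process name."""
    return Counter(comm for _, _, comm, _ in soft_lockups(lines))


# Two lines copied from the kernel messages in this report.
sample = [
    "Apr 20 14:11:59 muscle kernel: [ 1688.812058] BUG: soft lockup - "
    "CPU#8 stuck for 22s! [smbd-notifyd:3998]",
    "Apr 20 14:11:59 muscle kernel: [ 1688.816001] CPU 8",
]
events = list(soft_lockups(sample))
print(events)
print(tally_by_process(sample))
```

To run it against a real log, pass an open file object (for example the system's kernel log file) instead of the sample list.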
Comment 1 Volker Lendecke 2016-04-20 20:08:18 UTC
This sounds strikingly similar to

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1543980

although the kernel versions do not fully match. You should check with Canonical.

Apart from that -- a kernel lockup can never be a Samba bug. Closing as invalid; this is definitely a kernel bug.
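
For context, the call trace in the description goes sys_connect -> unix_dgram_connect -> unix_state_double_lock, which is the kernel path taken by an ordinary connect() on an AF_UNIX datagram socket. The sketch below shows that userspace call in isolation. It is not Samba's code and is not expected to reproduce the lockup; the function name and the socket file name are made up for illustration.

```python
import os
import socket
import tempfile


def dgram_connect_roundtrip(payload=b"ping"):
    """Bind one AF_UNIX datagram socket, connect() a second one to it,
    and pass a single datagram between them."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "peer.sock")  # hypothetical socket path
        receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            receiver.settimeout(2.0)  # avoid blocking forever on failure
            receiver.bind(path)
            # connect(2) on a unix datagram socket; in the kernel this is
            # handled by unix_dgram_connect, which locks both endpoints.
            sender.connect(path)
            sender.send(payload)
            return receiver.recv(1024)
        finally:
            sender.close()
            receiver.close()


print(dgram_connect_roundtrip())
```

Nothing in this sequence can legitimately spin a CPU inside a kernel spinlock, which is why a lockup at this point is a matter for the kernel rather than for the calling process.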