After upgrading to Samba 4.18.0 a couple of weeks ago I've started seeing some smbd processes that seems to have deadlocked with a closed TCP session (according to "netstat" - state is "CLOSED") and waiting in: tdb_nest_lock -> tdb_brlock -> fcntl_lock -> fcntl This is a bit of a problem if this happens to a client that have locks held on some files at the moment it occurs. Manually killing the problematic smbd processes "fixes" the issue and I have a monitoring script running that now watches for when this happens and alerts me. The strange thing is that it seems to happen not all the time but after some 4-10 days, and it doesn't happen on all our servers (very busy ones) but just two (so far). I can't currently upgrade to 4.18.2 on the servers where this occurs, but I can't anyway see anything in the release notes for 4.18.1 or 4.18.2 that looks like it be related to a fix. I've downgraded one of them to 4.17 now for the moment. Here is a part of a GDB backtrace from a running deadlocked smbd process when this has happened. 0x00000008049affc8 in _fcntl () from /lib/libc.so.7 #0 0x00000008049affc8 in _fcntl () from /lib/libc.so.7 #1 0x0000000807aab8d6 in ?? () from /lib/libthr.so.3 #2 0x00000008048845b7 in fcntl () from /lib/libc.so.7 #3 0x0000000804bf0e5c in fcntl_lock (waitflag=true, len=1, off=7608, rw=1, tdb=0x80f32ed20) at ../../lib/tdb/common/lock.c:58 #4 tdb_brlock (tdb=tdb@entry=0x80f32ed20, rw_type=rw_type@entry=1, offset=offset@entry=7608, len=len@entry=1, flags=flags@entry=TDB_LOCK_WAIT) at ../../lib/tdb/common/lock.c:200 #5 0x0000000804bf142d in tdb_nest_lock (tdb=tdb@entry=0x80f32ed20, offset=offset@entry=7608, ltype=ltype@entry=1, flags=flags@entry=TDB_LOCK_WAIT) at ../../lib/tdb/common/lock.c:390 #6 0x0000000804bf1659 in tdb_lock_list (tdb=tdb@entry=0x80f32ed20, list=list@entry=1860, ltype=ltype@entry=1, waitflag=waitflag@entry=TDB_LOCK_WAIT) at ../../lib/tdb/common/lock.c:482 #7 0x0000000804bf1720 in tdb_lock (tdb=tdb@entry=0x80f32ed20, list=1860, ltype=ltype@entry=1) at ../../lib/tdb/common/lock.c:500 #8 0x0000000804beeb82 in tdb_find_lock_hash (tdb=tdb@entry=0x80f32ed20, key=..., hash=2009077229, locktype=locktype@entry=1, rec=rec@entry=0x7fffffffd960) at ../../lib/tdb/common/tdb.c:165 #9 0x0000000804beed65 in tdb_parse_record (tdb=0x80f32ed20, key=..., parser=parser@entry=0x80589df7d <db_tdb_parser>, private_data=private_data@entry=0x7fffffffd9b0) at ../../lib/tdb/common/tdb.c:329 #10 0x000000080589e2e2 in db_tdb_parse (db=<optimized out>, key=..., parser=0x802108824 <dbwrap_watched_parse_record_parser>, private_data=0x7fffffffda20) at ../../lib/dbwrap/dbwrap_tdb.c:273 #11 0x000000080589ba5f in dbwrap_parse_record (db=<optimized out>, key=..., parser=parser@entry=0x802108824 <dbwrap_watched_parse_record_parser>, private_data=private_data@entry=0x7fffffffda20) at ../../lib/dbwrap/dbwrap.c:425 #12 0x000000080210893b in dbwrap_watched_parse_record (db=0x80f3ffb60, key=..., parser=0x80210daf8 <g_lock_dump_fn>, private_data=0x7fffffffdaa0) at ../../source3/lib/dbwrap/dbwrap_watch.c:783 #13 0x000000080589ba5f in dbwrap_parse_record (db=<optimized out>, key=..., parser=parser@entry=0x80210daf8 <g_lock_dump_fn>, private_data=private_data@entry=0x7fffffffdaa0) at ../../lib/dbwrap/dbwrap.c:425 #14 0x000000080210f129 in g_lock_dump (ctx=<optimized out>, key=..., fn=fn@entry=0x8016d0496 <fetch_share_mode_unlocked_parser>, private_data=private_data@entry=0x7fffffffdb00) at ../../source3/lib/g_lock.c:1653 #15 0x00000008016d1799 in fetch_share_mode_unlocked (mem_ctx=<optimized out>, id=...) at ../../source3/locking/share_mode_lock.c:1567 #16 0x00000008016c6b0e in get_file_infos (id=..., name_hash=name_hash@entry=0, delete_on_close=delete_on_close@entry=0x0, write_time=write_time@entry=0x7fffffffdc80) at ../../source3/locking/locking.c:652 #17 0x00000008016ea7d5 in smbd_dirptr_get_entry (ctx=ctx@entry=0x819095050, dirptr=dirptr@entry=0x80f3a12e0, mask=mask@entry=0x819094e00 "*", dirtype=dirtype@entry=22, dont_descend=dont_descend@entry=false, ask_sharemode=ask_sharemode@entry=true, get_dosmode_in=true, match_fn=0x8016f36d3 <smbd_dirptr_lanman2_match_fn>, mode_fn=0x8016f336b <smbd_dirptr_lanman2_mode_fn>, private_data=0x7fffffffdf80, _fname=0x7fffffffdf50, _smb_fname=0x7fffffffdf58, _mode=0x7fffffffdf44, _prev_offset=0x7fffffffdf48) at ../../source3/smbd/dir.c:1050
The important trace will not be of the process 'stuck' in fcntl, but the process that holds the lock it wants, which won't be in fcntl (it will be somewhere else, busy doing something).
Ok, I'll try to catch backtraces of all smbd's (some 400 or so) next time this occurs (which should be in a couple of days if it happens with the same frequency) and we'll see if I can find something interresting :-)
Ok, I'm now got a new deadlocked pair of smbd's on another server. I "gcore" all running smbd processes and got a "gdb bt" backtrace from them all. root@filur02:/var/tmp/smbd # /liu/sbin/smbreaper SUSPECT: pid=48592 user=mihha273 locks=1 from=2001:6b0:17:ac00:1000::ed11.63854 started=Fri May 19 17:22:41 2023 CEST SUSPECT: pid=51187 user=axema002 locks=1 from=2001:6b0:17:ac00:2000::58fb.57873 started=Fri May 19 14:11:27 2023 CEST Those two and a third one is the only ones sitting at fcntl_lock, root@filur02:/var/tmp/smbd # grep '^#[23] ' *.bt | egrep -v poll_event_loop core.48592.bt:#2 0x00000008048845b7 in fcntl () from /lib/libc.so.7 core.48592.bt:#3 0x0000000804bf0e5c in fcntl_lock (waitflag=true, len=1, off=38356, rw=1, core.51187.bt:#2 0x00000008048845b7 in fcntl () from /lib/libc.so.7 core.51187.bt:#3 0x0000000804bf0e5c in fcntl_lock (waitflag=true, len=1, off=38356, rw=1, core.75528.bt:#2 0x00000008048845b7 in fcntl () from /lib/libc.so.7 core.75528.bt:#3 0x0000000804bf0e5c in fcntl_lock (waitflag=true, len=1, off=38356, rw=1, root@filur02:/var/tmp/smbd # smbstatus|egrep 75528 75528 nikha067 student-liu.se 2001:6b0:17:ac00:1000::5317 (ipv6:2001:6b0:17:ac00:1000::5317:50595) SMB3_11 - partial(AES-128-CMAC) IPC$ 75528 2001:6b0:17:ac00:1000::5317 Fri May 19 17:35:43 2023 CEST - AES-128-CMAC students 75528 2001:6b0:17:ac00:1000::5317 Fri May 19 11:14:39 2023 CEST - AES-128-CMAC 75528 1000095530 DENY_NONE 0x100081 RDONLY NONE /export/students nikha067/SettingsPackages/Microsoft-Windows-Powershell Fri May 19 17:35:48 2023 The other two have a lot of files open according to "smbstatus". I can wait a little with restarting the smbd processes for now if you have some ideas on what to test. Btw, on the other server where I was experiencing this I backed down to 4.17.6 and there I haven't seen this issue again. lsof output for this process: oot@filur02:/var/tmp/smbd # lsof -p 75528 lsof: WARNING: compiled for FreeBSD release 12.3-RELEASE-p7; this is 12.3-RELEASE-p6. lsof: WARNING: device cache mismatch: /dev/ttyu1 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME smbd 75528 nikha067 txt VREG 1441510162,2280381956 212224 297721 /liu/pkg/samba/4.18.0-liu/sbin/smbd (zroot/ROOT/default) smbd 75528 nikha067 cwd VDIR 620489695,2065293036 34939 4 /export/students (DATA/students) smbd 75528 nikha067 rtd VDIR 1441510162,2280381956 32 4 / (zroot/ROOT/default) smbd 75528 nikha067 0u VCHR 0,14 0t0 14 /dev/null (devfs) smbd 75528 nikha067 1u VCHR 0,14 0t0 14 /dev/null (devfs) smbd 75528 nikha067 2w VREG 1441510162,2280381956 3576637 457153 /liu/var/samba/logs/log.smbd (zroot/ROOT/default) smbd 75528 nikha067 3u unix 0xfffff85aa6856368 0t0 ->0xfffff8427762a6d0 smbd 75528 nikha067 4u VREG 1441510162,2280381956 430080 210017 /liu/etc/samba/private/secrets.tdb (zroot/ROOT/default) smbd 75528 nikha067 5uW VREG 1441510162,2280381956 20 412038 /liu/var/samba/locks/msg.lock/75528 (zroot/ROOT/default) smbd 75528 nikha067 6u unix 0xfffff80562305a38 0t0 /liu/etc/samba/private/msg.sock/75528 smbd 75528 nikha067 7u VREG 1441510162,2280381956 12288 114723 /liu/var/samba/locks/names.tdb (zroot/ROOT/default) smbd 75528 nikha067 8r VDIR 1355234787,1513152036 99 50 /export/students/nikha067/SettingsPackages/Microsoft-Windows-Powershell (DATA/students/nikha067) smbd 75528 nikha067 9u VCHR 0,14 0t0 14 /dev/null (devfs) smbd 75528 nikha067 10u VCHR 0,14 0t0 14 /dev/null (devfs) smbd 75528 nikha067 11w VREG 1441510162,2280381956 6 451482 /liu/var/samba/run/smbd.pid (zroot/ROOT/default) smbd 75528 nikha067 12u PIPE 0xfffff85228333000 16384 ->0xfffff85228333168 smbd 75528 nikha067 13u PIPE 0xfffff83dc4835000 16384 ->0xfffff83dc4835168 smbd 75528 nikha067 14u PIPE 0xfffff83dc4835168 4096 ->0xfffff83dc4835000 smbd 75528 nikha067 15w VREG 1441510162,2280381956 3576637 457153 /liu/var/samba/logs/log.smbd (zroot/ROOT/default) smbd 75528 nikha067 16u VREG 1441510162,2280381956 16384 114728 /liu/var/samba/locks/smbXsrv_version_global.tdb (zroot/ROOT/default) smbd 75528 nikha067 17u VREG 1441510162,2280381956 253952 114769 /liu/var/samba/locks/smbXsrv_client_global.tdb (zroot/ROOT/default) smbd 75528 nikha067 18u VREG 1441510162,2280381956 115605504 321458 /liu/var/samba/locks/smbXsrv_session_global.tdb (zroot/ROOT/default) smbd 75528 nikha067 19u VREG 1441510162,2280381956 2306048 114730 /liu/var/samba/locks/smbXsrv_tcon_global.tdb (zroot/ROOT/default) smbd 75528 nikha067 20u VREG 1441510162,2280381956 1474560 114731 /liu/var/samba/locks/brlock.tdb (zroot/ROOT/default) smbd 75528 nikha067 21u VREG 1441510162,2280381956 149245952 451485 /liu/var/samba/locks/locking.tdb (zroot/ROOT/default) smbd 75528 nikha067 22u VREG 1441510162,2280381956 40200 114736 /liu/var/samba/locks/leases.tdb (zroot/ROOT/default) smbd 75528 nikha067 23u VREG 1441510162,2280381956 421888 114751 /liu/var/samba/state/share_info.tdb (zroot/ROOT/default) smbd 75528 nikha067 24u VREG 1441510162,2280381956 53071872 269463 /liu/var/samba/locks/gencache.tdb (zroot/ROOT/default) smbd 75528 nikha067 25u unix 0xfffff8054a0a6000 0t0 ->(none) smbd 75528 nikha067 26u VREG 1441510162,2280381956 425984 114755 /liu/var/samba/state/group_mapping.tdb (zroot/ROOT/default) smbd 75528 nikha067 27u VREG 1441510162,2280381956 421888 114758 /liu/var/samba/state/account_policy.tdb (zroot/ROOT/default) smbd 75528 nikha067 28u VREG 1441510162,2280381956 421888 210015 /liu/etc/samba/private/passdb.tdb (zroot/ROOT/default) smbd 75528 nikha067 29u VREG 1441510162,2280381956 11005952 114759 /liu/var/samba/locks/smbXsrv_open_global.tdb (zroot/ROOT/default) smbd 75528 nikha067 30r VREG 1355234787,1513152036 27208 69865 /export/students/nikha067/SettingsPackages/Microsoft-Windows-Powershell/12CA888A-96C3-4D71-B309-0311B54A61BE.pkgdat (DATA/students/nikha067) smbd 75528 nikha067 31u unix 0xfffff85aa68566d0 0t0 ->0xfffff8730445c368 smbd 75528 nikha067 34u IPv6 0xfffff83f621ccb88 0t0 TCP filur02.it.liu.se:microsoft-ds->gimle-405.ad.liu.se:50595 (CLOSED) smbd 75528 nikha067 36u VREG 1441510162,2280381956 12288 114778 /liu/var/samba/locks/smbd_cleanupd.tdb (zroot/ROOT/default) And a more detailed backtrace: (gdb) bt #0 0x00000008049affc8 in _fcntl () from /lib/libc.so.7 #1 0x0000000807aab8d6 in ?? () from /lib/libthr.so.3 #2 0x00000008048845b7 in fcntl () from /lib/libc.so.7 #3 0x0000000804bf0e5c in fcntl_lock (waitflag=true, len=1, off=38356, rw=1, tdb=0x80f32ed20) at ../../lib/tdb/common/lock.c:58 #4 tdb_brlock (tdb=tdb@entry=0x80f32ed20, rw_type=rw_type@entry=1, offset=offset@entry=38356, len=len@entry=1, flags=flags@entry=TDB_LOCK_WAIT) at ../../lib/tdb/common/lock.c:200 #5 0x0000000804bf142d in tdb_nest_lock (tdb=tdb@entry=0x80f32ed20, offset=offset@entry=38356, ltype=ltype@entry=1, flags=flags@entry=TDB_LOCK_WAIT) at ../../lib/tdb/common/lock.c:390 #6 0x0000000804bf1659 in tdb_lock_list (tdb=tdb@entry=0x80f32ed20, list=list@entry=9547, ltype=ltype@entry=1, waitflag=waitflag@entry=TDB_LOCK_WAIT) at ../../lib/tdb/common/lock.c:482 #7 0x0000000804bf1720 in tdb_lock (tdb=tdb@entry=0x80f32ed20, list=9547, ltype=ltype@entry=1) at ../../lib/tdb/common/lock.c:500 #8 0x0000000804beeb82 in tdb_find_lock_hash (tdb=tdb@entry=0x80f32ed20, key=..., hash=3111656176, locktype=locktype@entry=1, rec=rec@entry=0x7fffffffd960) at ../../lib/tdb/common/tdb.c:165 #9 0x0000000804beed65 in tdb_parse_record (tdb=0x80f32ed20, key=..., parser=parser@entry=0x80589df7d <db_tdb_parser>, private_data=private_data@entry=0x7fffffffd9b0) at ../../lib/tdb/common/tdb.c:329 #10 0x000000080589e2e2 in db_tdb_parse (db=<optimized out>, key=..., parser=0x802108824 <dbwrap_watched_parse_record_parser>, private_data=0x7fffffffda20) at ../../lib/dbwrap/dbwrap_tdb.c:273 #11 0x000000080589ba5f in dbwrap_parse_record (db=<optimized out>, key=..., parser=parser@entry=0x802108824 <dbwrap_watched_parse_record_parser>, private_data=private_data@entry=0x7fffffffda20) at ../../lib/dbwrap/dbwrap.c:425 #12 0x000000080210893b in dbwrap_watched_parse_record (db=0x80f3ffb60, key=..., parser=0x80210daf8 <g_lock_dump_fn>, private_data=0x7fffffffdaa0) at ../../source3/lib/dbwrap/dbwrap_watch.c:783 #13 0x000000080589ba5f in dbwrap_parse_record (db=<optimized out>, key=..., parser=parser@entry=0x80210daf8 <g_lock_dump_fn>, private_data=private_data@entry=0x7fffffffdaa0) at ../../lib/dbwrap/dbwrap.c:425 #14 0x000000080210f129 in g_lock_dump (ctx=<optimized out>, key=..., fn=fn@entry=0x8016d0496 <fetch_share_mode_unlocked_parser>, private_data=private_data@entry=0x7fffffffdb00) at ../../source3/lib/g_lock.c:1653 #15 0x00000008016d1799 in fetch_share_mode_unlocked (mem_ctx=<optimized out>, id=...) at ../../source3/locking/share_mode_lock.c:1567 #16 0x00000008016c6b0e in get_file_infos (id=..., name_hash=name_hash@entry=0, delete_on_close=delete_on_close@entry=0x0, write_time=write_time@entry=0x7fffffffdc80) at ../../source3/locking/locking.c:652 #17 0x00000008016ea7d5 in smbd_dirptr_get_entry (ctx=ctx@entry=0x80f3b7640, dirptr=dirptr@entry=0x80f3a0da0, mask=mask@entry=0x80f32d0a0 "*", dirtype=dirtype@entry=22, dont_descend=dont_descend@entry=false, ask_sharemode=ask_sharemode@entry=true, get_dosmode_in=true, match_fn=0x8016f36d3 <smbd_dirptr_lanman2_match_fn>, mode_fn=0x8016f336b <smbd_dirptr_lanman2_mode_fn>, private_data=0x7fffffffdf80, _fname=0x7fffffffdf50, _smb_fname=0x7fffffffdf58, _mode=0x7fffffffdf44, _prev_offset=0x7fffffffdf48) at ../../source3/smbd/dir.c:1050 #18 0x00000008016f75ae in smbd_dirptr_lanman2_entry (ctx=ctx@entry=0x80f3b7640, conn=0x80f3a8c60, dirptr=0x80f3a0da0, flags2=<optimized out>, path_mask=<optimized out>, dirtype=22, info_level=260, requires_resume_key=0, dont_descend=false, ask_sharemode=true, get_dosmode=true, align=8 '\b', do_pad=false, ppdata=0x80f3b76a0, base_data=0x80f37c060 "\270", end_data=0x80f37e057 "", space_remaining=2988, _smb_fname=0x7fffffffe0f0, got_exact_match=0x7fffffffe0ee, _last_entry_off=0x80f3b76c4, name_list=0x0, file_id=0x7fffffffe100) at ../../source3/smbd/smb2_trans2.c:1958 #19 0x0000000801754d03 in smb2_query_directory_next_entry (req=req@entry=0x80f3b7480) at ../../source3/smbd/smb2_query_directory.c:561 #20 0x000000080175639b in smbd_smb2_query_directory_send (in_file_name=<optimized out>, in_output_buffer_length=<optimized out>, in_file_index=<optimized out>, in_flags=0 '\000', in_file_info_class=3 '\003', fsp=0x80f376720, smb2req=0x813118060, ev=0x80f3a8060, mem_ctx=0x813118060) at ../../source3/smbd/smb2_query_directory.c:529
Please attach the backtraces of all three processes.
Created attachment 17892 [details] Backtrace of process 48592
Created attachment 17893 [details] Backtrace of process 51187
Created attachment 17894 [details] Backtrace of process 75528
Created attachment 17895 [details] tar archive of lsof and smbstatus output for the running processes
Still the same issue with 4.18.7 I just caught three smbd processes deadlocked trying to aquire a fcntl() lock for the "locking.tdb" database. Killing all three fixes to problem and Samba will happily continue serving clients. If I don't kill them then smbd processes will start to accumulate. I tried looking for something else keeping a lock on the locking.tdb database but none was found. I'll upload a tar archive of gdb backtraces & lsof output for all three processes (and the smb.conf file). A "tdbtool locking.tdb check" finds no errors. Two of the three processes references full_audit in their back traces but they seem to be newer (higher pid numbers) than the first one which seems to be busy doing some directory listing? This is a server for home directories so the users *shouldn't* be accessing some common directory but their own directories only. Hmm.. The user that owns the first smbd that is doing the directory listing seems to have a *lot* of files & directories in his Home... 241342 directories, 235383 files, 15 levels deep. But it should not be blocking just because of that. Hmm...
Created attachment 18150 [details] Tar archive with gdb backtrace and lsof outputs
Happened again. This time I think I found the real smbd that was holding the lock and blocking the others. It was a smbd running as "root" but sitting at the poll_even_loop_poll(). However an "lsof" on it located that it was having "locking.tdb" open with a read lock on it. (gdb) bt #0 0x000019ec86180f5a in _poll () from /lib/libc.so.7 #1 0x000019ec9436fd36 in ?? () from /lib/libthr.so.3 #2 0x000019ec7cb0082f in poll_event_loop_poll (tvalp=0x19ec75c83ca0, ev=0x19ecc1c1b060) at ../../lib/tevent/tevent_poll.c:483 #3 poll_event_loop_once (ev=0x19ecc1c1b060, location=<optimized out>) at ../../lib/tevent/tevent_poll.c:626 #4 0x000019ec7cafd77e in _tevent_loop_once (ev=ev@entry=0x19ecc1c1b060, location=location@entry=0x19e455449368 "../../source3/smbd/server.c:1373") at ../../lib/tevent/tevent.c:823 #5 0x000019ec7cafd97b in tevent_common_loop_wait (ev=0x19ecc1c1b060, location=0x19e455449368 "../../source3/smbd/server.c:1373") at ../../lib/tevent/tevent.c:949 #6 0x000019ec7cafd9dd in _tevent_loop_wait (ev=ev@entry=0x19ecc1c1b060, location=location@entry=0x19e455449368 "../../source3/smbd/server.c:1373") at ../../lib/tevent/tevent.c:968 #7 0x000019e45544672c in smbd_parent_loop (parent=<optimized out>, ev_ctx=0x19ecc1c1b060) at ../../source3/smbd/server.c:1373 #8 main (argc=<optimized out>, argv=<optimized out>) at ../../source3/smbd/server.c:2130 (gdb) #0 0x000019ec86180f5a in _poll () from /lib/libc.so.7 #1 0x000019ec9436fd36 in ?? () from /lib/libthr.so.3 #2 0x000019ec7cb0082f in poll_event_loop_poll (tvalp=0x19ec75c83ca0, ev=0x19ecc1c1b060) at ../../lib/tevent/tevent_poll.c:483 #3 poll_event_loop_once (ev=0x19ecc1c1b060, location=<optimized out>) at ../../lib/tevent/tevent_poll.c:626 #4 0x000019ec7cafd77e in _tevent_loop_once (ev=ev@entry=0x19ecc1c1b060, location=location@entry=0x19e455449368 "../../source3/smbd/server.c:1373") at ../../lib/tevent/tevent.c:823 #5 0x000019ec7cafd97b in tevent_common_loop_wait (ev=0x19ecc1c1b060, location=0x19e455449368 "../../source3/smbd/server.c:1373") at ../../lib/tevent/tevent.c:949 #6 0x000019ec7cafd9dd in _tevent_loop_wait (ev=ev@entry=0x19ecc1c1b060, location=location@entry=0x19e455449368 "../../source3/smbd/server.c:1373") at ../../lib/tevent/tevent.c:968 #7 0x000019e45544672c in smbd_parent_loop (parent=<optimized out>, ev_ctx=0x19ecc1c1b060) at ../../source3/smbd/server.c:1373 #8 main (argc=<optimized out>, argv=<optimized out>) at ../../source3/smbd/server.c:2130 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME smbd 39610 root txt VREG 1441510162,2280381956 211424 358328 /liu/pkg/samba/4.18.7/sbin/smbd (zroot/ROOT/default) smbd 39610 root cwd VDIR 1441510162,2280381956 34 4 / (zroot/ROOT/default) smbd 39610 root rtd VDIR 1441510162,2280381956 34 4 / (zroot/ROOT/default) smbd 39610 root 0u VCHR 0,41 0t0 41 /dev/null (devfs) smbd 39610 root 1u VCHR 0,41 0t0 41 /dev/null (devfs) smbd 39610 root 2w VREG 1441510162,2280381956 5120299 272757 / (zroot/ROOT/default) smbd 39610 root 3u unix 0xfffff841a89d5b10 0t0 ->0xfffff801ec6e7000 smbd 39610 root 4u VREG 1441510162,2280381956 430080 210017 /liu/etc/samba/private/secrets.tdb (zroot/ROOT/default) smbd 39610 root 5uW VREG 1441510162,2280381956 21 357390 /liu/var/samba/locks/msg.lock/39610 (zroot/ROOT/default) smbd 39610 root 6u unix 0xfffff8021b9c53b0 0t0 /liu/etc/samba/private/msg.sock/39610 smbd 39610 root 7ur VREG 1441510162,2280381956 12288 114723 /liu/var/samba/locks/names.tdb (zroot/ROOT/default) smbd 39610 root 8w VREG 1441510162,2280381956 5120299 272757 / (zroot/ROOT/default) smbd 39610 root 9u VCHR 0,41 0t0 41 /dev/null (devfs) smbd 39610 root 10u VCHR 0,41 0t0 41 /dev/null (devfs) smbd 39610 root 11wW VREG 1441510162,2280381956 6 357389 /liu/var/samba/run/smbd.pid (zroot/ROOT/default) smbd 39610 root 12u PIPE 0xfffff8192b3ce5d0 16384 ->0xfffff8192b3ce730 smbd 39610 root 13u PIPE 0xfffff8192b3ce730 4096 ->0xfffff8192b3ce5d0 smbd 39610 root 14u EVENTFD value=0, flags=0x4 smbd 39610 root 15ur VREG 1441510162,2280381956 16384 114728 /liu/var/samba/locks/smbXsrv_version_global.tdb (zroot/ROOT/default) smbd 39610 root 16ur VREG 1441510162,2280381956 233472 114769 /liu/var/samba/locks/smbXsrv_client_global.tdb (zroot/ROOT/default) smbd 39610 root 17ur VREG 1441510162,2280381956 119623680 321458 /liu/var/samba/locks/smbXsrv_session_global.tdb (zroot/ROOT/default) smbd 39610 root 18ur VREG 1441510162,2280381956 2306048 114730 /liu/var/samba/locks/smbXsrv_tcon_global.tdb (zroot/ROOT/default) smbd 39610 root 19ur VREG 1441510162,2280381956 2883584 114731 /liu/var/samba/locks/brlock.tdb (zroot/ROOT/default) smbd 39610 root 20ur VREG 1441510162,2280381956 122077184 357392 /liu/var/samba/locks/locking.tdb (zroot/ROOT/default) smbd 39610 root 21ur VREG 1441510162,2280381956 40200 114736 /liu/var/samba/locks/leases.tdb (zroot/ROOT/default) smbd 39610 root 22u VREG 1441510162,2280381956 421888 114751 /liu/var/samba/state/share_info.tdb (zroot/ROOT/default) smbd 39610 root 23u VREG 1441510162,2280381956 53071872 269463 /liu/var/samba/locks/gencache.tdb (zroot/ROOT/default) smbd 39610 root 24u unix 0xfffff817f0cf4000 0t0 ->(none) smbd 39610 root 25u VREG 1441510162,2280381956 425984 114755 /liu/var/samba/state/group_mapping.tdb (zroot/ROOT/default) smbd 39610 root 26u VREG 1441510162,2280381956 421888 114758 /liu/var/samba/state/account_policy.tdb (zroot/ROOT/default) smbd 39610 root 27u VREG 1441510162,2280381956 421888 210015 /liu/etc/samba/private/passdb.tdb (zroot/ROOT/default) smbd 39610 root 28ur VREG 1441510162,2280381956 11005952 114759 /liu/var/samba/locks/smbXsrv_open_global.tdb (zroot/ROOT/default) smbd 39610 root 29u IPv4 0xfffffe169f59f518 0t0 TCP filur02.it.liu.se:microsoft-ds->*:* (LISTEN) smbd 39610 root 30u IPv4 0xfffffe150cded518 0t0 TCP filur02.it.liu.se:netbios-ssn->*:* (LISTEN) smbd 39610 root 31u IPv6 0xfffffe16e95d50e0 0t0 TCP filur02.it.liu.se:microsoft-ds (LISTEN) smbd 39610 root 32u IPv6 0xfffffe13a40300e0 0t0 TCP filur02.it.liu.se:netbios-ssn (LISTEN) smbd 39610 root 33u unix 0t0 can't read unpcb at 0xfffff84f455c3300 smbd 39610 root 35ur VREG 1441510162,2280381956 12288 114778 /liu/var/samba/locks/smbd_cleanupd.tdb (zroot/ROOT/default) As soon as I killed that one the other blocked smbd's continued.
Hmm.. # fgrep locking.tdb lsof.all | egrep 'ur|uw' smbd 39610 root 20ur VREG 1441510162,2280381956 122077184 357392 /liu/var/samba/locks/locking.tdb (zroot/ROOT/default) smbd 73641 micbe11 20uw VREG 1441510162,2280381956 122077184 357392 /liu/var/samba/locks/locking.tdb (zroot/ROOT/default) Process 39610 has a read-lock (this was the process I killed that made the other blocked processes continue). I wonder what 73641 is doing though... Strange. (gdb) #0 0x000019ec86180e18 in _fcntl () from /lib/libc.so.7 #1 0x000019ec9436faf6 in ?? () from /lib/libthr.so.3 #2 0x000019ec8617d977 in fcntl () from /lib/libc.so.7 #3 0x000019ec88237e94 in fcntl_lock (waitflag=true, len=1, off=11236, rw=1, tdb=0x19ecc1c0fb60) at ../../lib/tdb/common/lock.c:58 #4 tdb_brlock (tdb=tdb@entry=0x19ecc1c0fb60, rw_type=rw_type@entry=1, offset=offset@entry=11236, len=len@entry=1, flags=flags@entry=TDB_LOCK_WAIT) at ../../lib/tdb/common/lock.c:200 #5 0x000019ec8823846b in tdb_nest_lock (tdb=tdb@entry=0x19ecc1c0fb60, offset=offset@entry=11236, ltype=ltype@entry=1, flags=flags@entry=TDB_LOCK_WAIT) at ../../lib/tdb/common/lock.c:390 #6 0x000019ec88238696 in tdb_lock_list (tdb=tdb@entry=0x19ecc1c0fb60, list=list@entry=2767, ltype=ltype@entry=1, waitflag=waitflag@entry=TDB_LOCK_WAIT) at ../../lib/tdb/common/lock.c:482 #7 0x000019ec8823875d in tdb_lock (tdb=tdb@entry=0x19ecc1c0fb60, list=2767, ltype=ltype@entry=1) at ../../lib/tdb/common/lock.c:500 #8 0x000019ec88235ba9 in tdb_find_lock_hash (tdb=tdb@entry=0x19ecc1c0fb60, key=..., hash=2144392790, locktype=locktype@entry=1, rec=rec@entry=0x19ec75c82890) at ../../lib/tdb/common/tdb.c:165 #9 0x000019ec88235d8c in tdb_parse_record (tdb=0x19ecc1c0fb60, key=..., parser=parser@entry=0x19ec88d7ffa2 <db_tdb_parser>, private_data=private_data@entry=0x19ec75c828e0) at ../../lib/tdb/common/tdb.c:329 #10 0x000019ec88d80307 in db_tdb_parse (db=<optimized out>, key=..., parser=0x19ec77aacc0d <brl_get_locks_readonly_parser>, private_data=0x19ec75c82960) at ../../lib/dbwrap/dbwrap_tdb.c:273 #11 0x000019ec88d7da81 in dbwrap_parse_record (db=<optimized out>, key=..., parser=parser@entry=0x19ec77aacc0d <brl_get_locks_readonly_parser>, private_data=private_data@entry=0x19ec75c82960) at ../../lib/dbwrap/dbwrap.c:425 #12 0x000019ec77aaec30 in brl_get_locks_readonly (fsp=fsp@entry=0x19ecc1c1faa0) at ../../source3/locking/brlock.c:1887 #13 0x000019ec77af11f2 in file_has_brlocks (fsp=0x19ecc1c1faa0) at ../../source3/smbd/open.c:2277 #14 delay_for_oplock (pgranted=<synthetic pointer>, poplock_type=<synthetic pointer>, first_open_attempt=true, create_disposition=<optimized out>, have_sharing_violation=false, lck=0x19ec75c83280, lease=0x0, oplock_request=0, fsp=0x19ecc1c1faa0) at ../../source3/smbd/open.c:2794 #15 handle_share_mode_lease (pgranted=<synthetic pointer>, poplock_type=<synthetic pointer>, first_open_attempt=true, lease=0x0, oplock_request=0, share_access=1, access_mask=1179785, create_disposition=<optimized out>, lck=0x19ec75c83280, fsp=0x19ecc1c1faa0) at ../../source3/smbd/open.c:2910 #16 check_and_store_share_mode (first_open_attempt=true, lease=0x0, oplock_request=0, share_access=1, access_mask=1179785, create_disposition=<optimized out>, lck=0x19ec75c83280, req=0x19ecc1cbbfa0, fsp=0x19ecc1c1faa0) at ../../source3/smbd/open.c:3378 #17 open_ntcreate_lock_add_entry (lck=0x19ec75c83280, keep_locked=0x19ec75c83008, private_data=0x19ec75c83260) at ../../source3/smbd/open.c:3723 #18 0x000019ec77ab4a21 in share_mode_entry_prepare_lock_fn (glck=0x19ec75c82b60, cb_private=0x19ec75c82fc0) at ../../source3/locking/share_mode_lock.c:2978 #19 0x000019ec7afb41dc in g_lock_lock_cb_run_and_store (cb_state=cb_state@entry=0x19ec75c82b60) at ../../source3/lib/g_lock.c:597 #20 0x000019ec7afb5552 in g_lock_lock_simple_fn (rec=0x19ec75c82d80, value=..., private_data=0x19ec75c82f30) at ../../source3/lib/g_lock.c:1212 #21 0x000019ec7afb283d in dbwrap_watched_do_locked_fn (backend_rec=backend_rec@entry=0x19ec75c82c80, backend_value=..., private_data=private_data@entry=0x19ec75c82d50) at ../../source3/lib/dbwrap/dbwrap_watch.c:458 #22 0x000019ec88d8062e in db_tdb_do_locked (db=<optimized out>, key=..., fn=0x19ec7afb27fd <dbwrap_watched_do_locked_fn>, private_data=0x19ec75c82d50) at ../../lib/dbwrap/dbwrap_tdb.c:208 #23 0x000019ec88d7ddbe in dbwrap_do_locked (db=0x19ecc1c7eb60, key=..., fn=fn@entry=0x19ec7afb27fd <dbwrap_watched_do_locked_fn>, private_data=private_data@entry=0x19ec75c82d50) at ../../lib/dbwrap/dbwrap.c:553 #24 0x000019ec7afb1acb in dbwrap_watched_do_locked (db=0x19ecc1c7ec60, key=..., fn=0x19ec7afb5415 <g_lock_lock_simple_fn>, private_data=0x19ec75c82f30) at ../../source3/lib/dbwrap/dbwrap_watch.c:480 #25 0x000019ec88d7ddbe in dbwrap_do_locked (db=0x19ecc1c7ec60, key=..., fn=fn@entry=0x19ec7afb5415 <g_lock_lock_simple_fn>, private_data=private_data@entry=0x19ec75c82f30) at ../../lib/dbwrap/dbwrap.c:553 #26 0x000019ec7afb6b7d in g_lock_lock (ctx=0x19ecc1c86960, key=..., type=type@entry=G_LOCK_WRITE, timeout=..., cb_fn=<optimized out>, cb_private=<optimized out>) at ../../source3/lib/g_lock.c:1267 #27 0x000019ec77ab941c in _share_mode_entry_prepare_lock (prepare_state=prepare_state@entry=0x19ec75c83260, id=..., servicepath=<optimized out>, smb_fname=smb_fname@entry=0x19ecc1c28880, old_write_time=old_write_time@entry=0x19ec75c83160, fn=fn@entry=0x19ec77af0a33 <open_ntcreate_lock_add_entry>, private_data=private_data@entry=0x19ec75c83260, location=<optimized out>) at ../../source3/locking/share_mode_lock.c:3033 #28 0x000019ec77af4ed2 in open_file_ntcreate (fsp=<optimized out>, pinfo=0x19ec75c83138, smb_fname_atname=<optimized out>, parent_dir_fname=<optimized out>, private_flags=<optimized out>, lease=<optimized out>, oplock_request=<optimized out>, new_dos_attributes=<optimized out>, create_options=<optimized out>, create_disposition=<optimized out>, share_access=<optimized out>, access_mask=<optimized out>, req=<optimized out>, conn=0x19ecc1c1c9e0) at ../../source3/smbd/open.c:4302 #29 create_file_unixpath (conn=conn@entry=0x19ecc1c1c9e0, req=req@entry=0x19ecc1cbbfa0, dirfsp=<optimized out>, dirfsp@entry=0x19ecc1c208a0, smb_fname=smb_fname@entry=0x19ecc1c28580, access_mask=access_mask@entry=1179785, share_access=share_access@entry=1, create_disposition=<optimized out>, create_options=<optimized out>, file_attributes=<optimized out>, oplock_request=<optimized out>, lease=<optimized out>, allocation_size=<optimized out>, private_flags=<optimized out>, sd=<optimized out>, ea_list=<optimized out>, result=<optimized out>, pinfo=<optimized out>) at ../../source3/smbd/open.c:6299 #30 0x000019ec77af6cd2 in create_file_default (conn=<optimized out>, req=0x19ecc1cbbfa0, dirfsp=<optimized out>, smb_fname=0x19ecc1c28580, access_mask=<optimized out>, share_access=<optimized out>, create_disposition=<optimized out>, create_options=<optimized out>, file_attributes=<optimized out>, oplock_request=<optimized out>, lease=<optimized out>, allocation_size=<optimized out>, private_flags=<optimized out>, sd=<optimized out>, ea_list=<optimized out>, result=<optimized out>, pinfo=<optimized out>, in_context_blobs=<optimized out>,