On a specific system (x86_64, multiple cores) running a file server based on Samba 4.3.6 with some vendor patches, smbd often hangs while loading. A stack trace shows the following:

#0  0x00007fb98b4ef69e in waitpid () from rootfs/lib64/libpthread.so.0
#1  0x00007fb98556d4c5 in tdb_runtime_check_for_robust_mutexes () at ../lib/tdb/common/mutex.c:890
#2  0x00007fb980926163 in tdb_wrap_open (mem_ctx=0x0, name=0x958610 "/var/vol/12/.ctera/samba/lock/gencache_notrans.tdb", hash_size=0, tdb_flags=6337, open_flags=66, mode=420) at ../lib/tdb_wrap/tdb_wrap.c:151
#3  0x00007fb988c2ba83 in gencache_init () at ../source3/lib/gencache.c:126
#4  0x00007fb988c2c686 in gencache_parse (keystr=0x7fff135bcfe0 "IDMAP/GID2SID/4", parser=0x7fb988c32ed6 <idmap_cache_xid2sid_parser>, private_data=0x7fff135bcfc0) at ../source3/lib/gencache.c:489
#5  0x00007fb988c330e8 in idmap_cache_find_gid2sid (gid=4, sid=0x7fff135bd140, expired=0x7fff135bd11f) at ../source3/lib/idmap_cache.c:270
#6  0x00007fb9894fc574 in gid_to_sid (psid=0x7fff135bd140, gid=4) at ../source3/passdb/lookup_sid.c:1267
#7  0x00007fb988e7a4f6 in add_local_groups (result=0x9579b0, is_guest=true) at ../source3/auth/token_util.c:470
#8  0x00007fb988e7a622 in finalize_local_nt_token (result=0x9579b0, is_guest=true) at ../source3/auth/token_util.c:495
#9  0x00007fb988e79f4b in create_local_nt_token_from_info3 (mem_ctx=0x957810, is_guest=true, info3=0x957e90, extra=0x957d20, ntok=0x957810) at ../source3/auth/token_util.c:314
#10 0x00007fb988e84f9f in create_local_token (mem_ctx=0x955240, server_info=0x957cd0, session_key=0x0, smb_username=0x958020 "nobody", session_info_out=0x7fb989098058) at ../source3/auth/auth_util.c:555
#11 0x00007fb988e85a08 in make_new_session_info_guest (session_info=0x7fb989098058, server_info=0x7fb989098060) at ../source3/auth/auth_util.c:831
#12 0x00007fb988e867e3 in init_guest_info () at ../source3/auth/auth_util.c:1128
#13 0x000000000040ac9b in main (argc=5, argv=0x7fff135bdb78) at ../source3/smbd/server.c:1518

(gdb) info thread
* 1 Thread 1779 0x00007fb98b4ef69e in waitpid () from rootfs/lib64/libpthread.so.0

When this happens there are two smbd processes: the main one with this stack trace, and notifyd. Close examination of the code shows a race condition between a signal handler and the main thread code.
I'm experiencing the same issue using Firefox with WINS name resolution enabled in nsswitch.conf. While reporting details on the Mozilla bug tracker [1], it became evident that the issue is related to libtdb.

Backtrace:

(gdb) bt
#0  0x00007ffff6d40576 in sigsuspend () from /lib64/libc.so.6
#1  0x00007fffdd1967c9 in tdb_runtime_check_for_robust_mutexes () from /usr/lib64/libtdb.so.1
#2  0x00007fffddeecfc5 in tdb_wrap_open () from /usr/lib64/samba/libtdb-wrap-samba4.so
#3  0x00007fffe11525f0 in ?? () from /usr/lib64/libsmbconf.so.0
#4  0x00007fffe1152975 in gencache_parse () from /usr/lib64/libsmbconf.so.0
#5  0x00007fffe11531a2 in gencache_get_data_blob () from /usr/lib64/libsmbconf.so.0
#6  0x00007fffe1153249 in gencache_get () from /usr/lib64/libsmbconf.so.0
#7  0x00007fffe114e65d in wins_srv_is_dead () from /usr/lib64/libsmbconf.so.0
#8  0x00007fffe0d073ee in resolve_wins_send () from /usr/lib64/samba/libgse-samba4.so
#9  0x00007fffe0d07831 in resolve_wins () from /usr/lib64/samba/libgse-samba4.so
#10 0x00007fffe15c3014 in _nss_wins_gethostbyname_r () from /usr/lib64/libnss_wins.so.2
#11 0x00007ffff6e06062 in gethostbyname_r () from /lib64/libc.so.6
#12 0x00007fffe6dd0825 in PR_GetHostByName () from /usr/lib64/libnspr4.so
#13 0x00007fffe9b9ab5a in ?? () from /usr/lib64/firefox/libxul.so
#14 0x00007fffe9b9b547 in ?? () from /usr/lib64/firefox/libxul.so
#15 0x00007fffe9b9b5dd in ?? () from /usr/lib64/firefox/libxul.so
#16 0x00007fffe9b9b844 in ?? () from /usr/lib64/firefox/libxul.so
#17 0x00007fffe9b9b8dc in ?? () from /usr/lib64/firefox/libxul.so
#18 0x00007fffe9ba11bb in ?? () from /usr/lib64/firefox/libxul.so
#19 0x00007fffe9ba1f61 in ?? () from /usr/lib64/firefox/libxul.so
#20 0x00007fffe9ba21f0 in XRE_main () from /usr/lib64/firefox/libxul.so
#21 0x0000000000405868 in ?? ()
#22 0x0000000000404f32 in ?? ()
#23 0x00007ffff6d2d670 in __libc_start_main () from /lib64/libc.so.6
#24 0x00000000004051a9 in _start ()

System details:
OS: Gentoo
Kernel: Linux 4.9.3
libtdb: 1.3.12

The issue is almost 100% reproducible in my environment. Feel free to request supplementary details.

Thanks.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1308997

Garri
The nss_wins problem should be fixed as part of bug 11563, which was fixed with Samba 4.2.6. Since that bugfix, we don't call directly into gencache from libnss_wins. Are you sure that you are on any recent Samba version?
(In reply to Volker Lendecke from comment #2)
> Are you sure that you are on any recent Samba version?

Thank you, Volker. I thought I was using a currently supported version of Samba. In fact it was 4.2.14, which is EOL. Other branches are currently masked by default in Gentoo. I've installed the current release, 4.5.3, and I can no longer reproduce the issue.
*** Bug 12593 has been marked as a duplicate of this bug. ***
This should be fixed with tdb: version 1.3.9