I am seeing a hang when /usr/lib/gvfs/gvfsd-smb-browse is started. I reported it in GNOME's bug tracker, and the GNOME gvfs developer suggested it is a TDB bug, so I am reporting it here. The following is the backtrace.

Thread 5 (Thread 0x7f2ea5037700 (LWP 1054)):
#0 0x00007f2eb6325506 in sigsuspend () at /usr/lib/libc.so.6
#1 0x00007f2ead463802 in tdb_runtime_check_for_robust_mutexes () at /usr/lib/libtdb.so.1
#2 0x00007f2eade9b05d in tdb_wrap_open () at /usr/lib/samba/libtdb-wrap-samba4.so
#3 0x00007f2eb4e49b21 in () at /usr/lib/libsmbconf.so.0
#4 0x00007f2eb4e4a305 in gencache_parse () at /usr/lib/libsmbconf.so.0
#5 0x00007f2eb4e4a862 in gencache_get_data_blob () at /usr/lib/libsmbconf.so.0
#6 0x00007f2eb4e4a90b in gencache_get () at /usr/lib/libsmbconf.so.0
#7 0x00007f2eb422f74a in sitename_fetch () at /usr/lib/samba/libgse-samba4.so
#8 0x00007f2eb422d7da in resolve_name () at /usr/lib/samba/libgse-samba4.so
#9 0x00007f2eb762a5e2 in () at /usr/lib/libsmbclient.so.0
#10 0x0000000000406ebd in do_mount (backend=<optimized out>, job=0x12b72b0 [GVfsJobMount], mount_spec=<optimized out>, mount_source=<optimized out>, is_automount=<optimized out>) at gvfsbackendsmbbrowse.c:913
#11 0x00007f2eb7407f4a in g_vfs_job_run (job=0x12b72b0 [GVfsJobMount]) at gvfsjob.c:197
#12 0x00007f2eb6920c9e in g_thread_pool_thread_proxy (data=<optimized out>) at gthreadpool.c:307
#13 0x00007f2eb69202a5 in g_thread_proxy (data=0x7f2e98004720) at gthread.c:784
#14 0x00007f2eb6697444 in start_thread () at /usr/lib/libpthread.so.0
#15 0x00007f2eb63d9cff in clone () at /usr/lib/libc.so.6

Thread 4 (Thread 0x7f2ea5838700 (LWP 1053)):
#0 0x00007f2eb63d066d in poll () at /usr/lib/libc.so.6
#1 0x00007f2eb68f8736 in g_main_context_poll (priority=<optimized out>, n_fds=1, fds=0x7f2e900010c0, timeout=<optimized out>, context=0x12d19c0) at gmain.c:4228
#2 0x00007f2eb68f8736 in g_main_context_iterate (context=context@entry=0x12d19c0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3924
#3 0x00007f2eb68f884c in g_main_context_iteration (context=0x12d19c0, may_block=1) at gmain.c:3990
#4 0x00007f2ea584055d in () at /usr/lib/gio/modules/libdconfsettings.so
#5 0x00007f2eb69202a5 in g_thread_proxy (data=0x12bc230) at gthread.c:784
#6 0x00007f2eb6697444 in start_thread () at /usr/lib/libpthread.so.0
#7 0x00007f2eb63d9cff in clone () at /usr/lib/libc.so.6

Thread 3 (Thread 0x7f2ea6a47700 (LWP 1051)):
#0 0x00007f2eb63d066d in poll () at /usr/lib/libc.so.6
#1 0x00007f2eb68f8736 in g_main_context_poll (priority=<optimized out>, n_fds=2, fds=0x7f2e980010c0, timeout=<optimized out>, context=0x12ba240) at gmain.c:4228
#2 0x00007f2eb68f8736 in g_main_context_iterate (context=0x12ba240, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3924
#3 0x00007f2eb68f8ac2 in g_main_loop_run (loop=0x1291a00) at gmain.c:4125
#4 0x00007f2eb6ee8606 in gdbus_shared_thread_func (user_data=0x12ba210) at gdbusprivate.c:247
#5 0x00007f2eb69202a5 in g_thread_proxy (data=0x12bbca0) at gthread.c:784
#6 0x00007f2eb6697444 in start_thread () at /usr/lib/libpthread.so.0
#7 0x00007f2eb63d9cff in clone () at /usr/lib/libc.so.6

Thread 2 (Thread 0x7f2ea7248700 (LWP 1050)):
#0 0x00007f2eb63d066d in poll () at /usr/lib/libc.so.6
#1 0x00007f2eb68f8736 in g_main_context_poll (priority=<optimized out>, n_fds=1, fds=0x7f2ea00008e0, timeout=<optimized out>, context=0x12b9b70) at gmain.c:4228
#2 0x00007f2eb68f8736 in g_main_context_iterate (context=context@entry=0x12b9b70, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3924
#3 0x00007f2eb68f884c in g_main_context_iteration (context=0x12b9b70, may_block=may_block@entry=1) at gmain.c:3990
#4 0x00007f2eb68f8891 in glib_worker_main (data=<optimized out>) at gmain.c:5783
#5 0x00007f2eb69202a5 in g_thread_proxy (data=0x12bbc50) at gthread.c:784
#6 0x00007f2eb6697444 in start_thread () at /usr/lib/libpthread.so.0
#7 0x00007f2eb63d9cff in clone () at /usr/lib/libc.so.6

Thread 1 (Thread 0x7f2eb79817c0 (LWP 1049)):
#0 0x00007f2eb63d066d in poll () at /usr/lib/libc.so.6
#1 0x00007f2eb68f8736 in g_main_context_poll (priority=<optimized out>, n_fds=1, fds=0x12935e0, timeout=<optimized out>, context=0x12aed00) at gmain.c:4228
#2 0x00007f2eb68f8736 in g_main_context_iterate (context=0x12aed00, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3924
#3 0x00007f2eb68f8ac2 in g_main_loop_run (loop=0x12932a0) at gmain.c:4125
#4 0x000000000040b7ff in daemon_main (argc=argc@entry=4, argv=argv@entry=0x7ffe54d4ba08, max_job_threads=max_job_threads@entry=1, default_type=default_type@entry=0x40bbab "smb-network", mountable_name=mountable_name@entry=0x40c858 "org.gtk.vfs.mountpoint_smb_browse", first_type_name=first_type_name@entry=0x40bbab "smb-network") at daemon-main.c:398
#5 0x0000000000404ca7 in main (argc=4, argv=0x7ffe54d4ba08) at daemon-main-generic.c:45

Downstream bug report: https://bugzilla.gnome.org/show_bug.cgi?id=778752

Thank you.
*** This bug has been marked as a duplicate of bug 11808 ***
I am still seeing this in tdb 1.3.12. Bug 11808 says it was fixed in 1.3.9.
(In reply to Hussam Al-Tayeb from comment #2) Which samba version are you using?
I have Samba 4.5.4 installed.
(In reply to Hussam Al-Tayeb from comment #4) Ok, it might be a regression in the fixes for #11808. Ralph, Uri, can you have a look?
(In reply to Hussam Al-Tayeb from comment #4) Possibly linked against an older installed version of tdb?
(In reply to Ralph Böhme from comment #6) The backtrace shows it's blocking in sigsuspend() so I don't think so. What is the used glibc and kernel version?
`rpm -qv libtdb` please.
(In reply to Ralph Böhme from comment #8) typo, sorry: `rpm -qi libtdb`
The tdb build date is Mon 19 Dec 2016, while the samba build date is Sun 22 Jan 2017. Both were linked against glibc 2.24 (I am now running 2.25). I am running Linux kernel 4.9.13. I am going to update to samba 4.6.0 and will report whether this continues to happen. Thank you.
(In reply to Ralph Böhme from comment #9) Not running a RPM distribution but:

cat /usr/lib/pkgconfig/tdb.pc
prefix=/usr
exec_prefix=${prefix}
libdir=${prefix}/lib
includedir=${prefix}/include

Name: tdb
Description: A trivial database
Version: 1.3.12
Libs: -Wl,-rpath,/usr/lib -L${libdir} -ltdb
Cflags: -I${includedir}
URL: http://tdb.samba.org/

pacman -Ql tdb
tdb /usr/
tdb /usr/bin/
tdb /usr/bin/tdbbackup
tdb /usr/bin/tdbdump
tdb /usr/bin/tdbrestore
tdb /usr/bin/tdbtool
tdb /usr/include/
tdb /usr/include/tdb.h
tdb /usr/lib/
tdb /usr/lib/libtdb.so
tdb /usr/lib/libtdb.so.1
tdb /usr/lib/libtdb.so.1.3.12
tdb /usr/lib/pkgconfig/
tdb /usr/lib/pkgconfig/tdb.pc
tdb /usr/lib/python2.7/
tdb /usr/lib/python2.7/site-packages/
tdb /usr/lib/python2.7/site-packages/_tdb_text.py
tdb /usr/lib/python2.7/site-packages/tdb.so
tdb /usr/share/
tdb /usr/share/man/
tdb /usr/share/man/man8/
tdb /usr/share/man/man8/tdbbackup.8.gz
tdb /usr/share/man/man8/tdbdump.8.gz
tdb /usr/share/man/man8/tdbrestore.8.gz
tdb /usr/share/man/man8/tdbtool.8.gz
(In reply to Hussam Al-Tayeb from comment #11) Thanks! Well, metze is probably right about the sigsuspend() in the SBT telling us that you *are* using the version that has the "fix", because the code didn't call sigsuspend() before. But seeing the version info is a little bit more explicit. :)
I wonder whether that check is multi-thread-safe. Ending up hung on sigsuspend probably means that another thread got the signal. According to POSIX (http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_sigmask.html) pthread_sigmask() only affects the calling thread's mask. If we're already thread #5, other threads may have received the signal. According to Linux manpage, pthread_sigmask() is just like sigprocmask (i.e. affecting the whole process).
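To make the suspected failure mode concrete, here is a minimal sketch of the general "block the signal, fork, sigsuspend() until the handler fires" pattern. It is not tdb's actual code: the helper wait_for_child_with_sigsuspend() and the handler on_sigchld() are made-up names for illustration. In a single-threaded process the blocked SIGCHLD can only be taken by the caller inside sigsuspend(), so the loop terminates.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler once SIGCHLD has been delivered. */
static volatile sig_atomic_t child_exited = 0;

static void on_sigchld(int signum)
{
    (void)signum;
    child_exited = 1;
}

/*
 * Hypothetical helper (not tdb's code): fork a child that exits at once
 * and wait for its SIGCHLD with sigsuspend().
 * Returns 0 on success, -1 on error.
 */
int wait_for_child_with_sigsuspend(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    pid_t pid;
    int rc;

    child_exited = 0;

    sa.sa_handler = on_sigchld;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) != 0) {
        return -1;
    }

    /* Block SIGCHLD, but only in the calling thread. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    rc = pthread_sigmask(SIG_BLOCK, &block, &old_mask);
    assert(rc == 0); /* cannot fail with a valid 'how' */
    (void)rc;

    pid = fork();
    if (pid == 0) {
        _exit(0);
    }

    if (pid > 0) {
        /* Wait with the caller's previous mask, minus SIGCHLD. */
        wait_mask = old_mask;
        sigdelset(&wait_mask, SIGCHLD);
        while (!child_exited) {
            /*
             * If another thread has SIGCHLD unblocked, the signal may be
             * delivered there and nothing wakes this call up again.
             */
            sigsuspend(&wait_mask);
        }
        waitpid(pid, NULL, 0);
    }

    /* Restore the caller's mask and the previous handler. */
    pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return pid > 0 ? 0 : -1;
}
```

In a process like gvfsd-smb-browse, where other threads leave SIGCHLD unblocked, the handler may run in one of those threads instead, and a caller that is already asleep in sigsuspend() would stay there, which would match frame #0 of thread 5.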
(In reply to Uri Simchoni from comment #13) Yeah, I was suspecting this as well. Which is kind of a problem, because POSIX leaves it unclear if you can call sigprocmask() in a threaded process to change the signal mask of all threads. Does anyone know? Going to write a test if not.
Threads have distinct signal masks, so pthread_sigmask() must change only the calling thread's mask - it makes no sense otherwise. Basically, the application writer needs to set the signal masks of threads to control which thread gets which signal, and setting a mask from a library, with no control over other threads, is problematic. Perhaps gvfsd-smb-browse can initialize tdb before spawning threads? That's what we do in Samba daemons, and I really don't see a way around it, except for dropping this test. We also can't have some applications that access gencache assume there are robust mutexes while others assume there aren't. If we discuss changes to tdb - let's move the discussion to the list.
(In reply to Uri Simchoni from comment #15) Fwiw, I tested whether sigprocmask() might change the signal mask of all threads in a multithreaded program and it doesn't. It behaves just like pthread_sigmask().
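For reference, a check along these lines can be sketched as follows. This is not the test program that was actually run; sigprocmask_is_per_thread() and block_in_worker() are illustrative names, and SIGUSR1 is an arbitrary choice of signal. Note that POSIX leaves sigprocmask() unspecified in multithreaded processes, so the result is only meaningful for the platform it runs on.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Worker: block SIGUSR1 via sigprocmask() and report its own mask. */
static void *block_in_worker(void *arg)
{
    int *blocked_in_worker = arg;
    sigset_t set, cur;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigprocmask(SIG_BLOCK, &set, NULL);

    /* With a NULL set, only the current mask is returned. */
    sigprocmask(SIG_BLOCK, NULL, &cur);
    *blocked_in_worker = sigismember(&cur, SIGUSR1);
    return NULL;
}

/*
 * Returns 1 if sigprocmask() in the worker blocked SIGUSR1 for the worker
 * only, 0 otherwise, and -1 if the thread could not be created.
 */
int sigprocmask_is_per_thread(void)
{
    sigset_t set, saved, cur;
    pthread_t tid;
    int blocked_in_worker = 0;
    int blocked_in_caller;
    int rc;

    /* Start with SIGUSR1 unblocked in the caller; remember the old mask. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    rc = pthread_sigmask(SIG_UNBLOCK, &set, &saved);
    assert(rc == 0); /* cannot fail with a valid 'how' */
    (void)rc;

    /* The new thread inherits the caller's mask (SIGUSR1 unblocked). */
    if (pthread_create(&tid, NULL, block_in_worker, &blocked_in_worker) != 0) {
        pthread_sigmask(SIG_SETMASK, &saved, NULL);
        return -1;
    }
    pthread_join(tid, NULL);

    /* Did the worker's sigprocmask() leak into the caller's mask? */
    pthread_sigmask(SIG_SETMASK, NULL, &cur);
    blocked_in_caller = sigismember(&cur, SIGUSR1);

    pthread_sigmask(SIG_SETMASK, &saved, NULL);
    return (blocked_in_worker == 1 && blocked_in_caller == 0) ? 1 : 0;
}
```

A return value of 1 corresponds to the behaviour reported above for Linux: the call affects only the thread that makes it.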
We might be able to keep the runtime test by using a library constructor. We already use this feature in talloc. I'll try to find time tomorrow to code something up.
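As a rough illustration of the constructor idea (not the patch that was eventually written), the sketch below uses the GCC/Clang `__attribute__((constructor))` extension to run a probe when the library is loaded, which for a library linked at build time happens before main() and therefore before the application has spawned threads. All names here are hypothetical, and the probe is a deliberately weak stand-in that only checks whether the mutex attributes are accepted; it is not equivalent to tdb's runtime test.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static bool probe_ran = false;
static bool robust_usable = false;

/*
 * Weak stand-in for a real runtime check: only verifies that a
 * process-shared, robust mutex attribute can be configured.
 */
static bool probe_robust_mutexes(void)
{
    pthread_mutexattr_t attr;
    bool ok;

    if (pthread_mutexattr_init(&attr) != 0) {
        return false;
    }
    ok = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) == 0 &&
         pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) == 0;
    pthread_mutexattr_destroy(&attr);
    return ok;
}

/* Runs once at load time, before main() for a normally linked library. */
__attribute__((constructor))
static void example_lib_init(void)
{
    assert(!probe_ran);
    robust_usable = probe_robust_mutexes();
    probe_ran = true;
}

/* Later callers read the cached result instead of probing again. */
bool example_probe_ran(void)
{
    return probe_ran;
}

bool example_robust_mutexes_usable(void)
{
    return probe_ran && robust_usable;
}
```

One caveat with this design: if the library is loaded with dlopen() after the application has already created threads, the constructor runs in a threaded process and the original problem could reappear.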
FYI - was that (sigprocmask test) on Linux? There are glibc-isms around the way syscalls behave with Linux pthreads mapped onto system threads (different processes under the covers) that are different in *BSD and Solaris (not that Solaris matters anymore, but *BSD does :-). Can you test sigprocmask on *BSD also?
(In reply to Jeremy Allison from comment #18) Yes, that was on Linux. I could test it on FreeBSD, but what would it help if we already know it's unusable as it is on Linux?
Never mind, I was just curious as I suspect *BSD behaves differently here :-).
I proposed a patch on the ML, cf <https://lists.samba.org/archive/samba-technical/2017-March/119314.html>.
(In reply to Ralph Böhme from comment #21) The fix proposed for the next tdb release: https://git.samba.org/?p=metze/samba/wip.git;a=commitdiff;h=be43b65b32e23d467c5aea7f7da9eb8 from this branch: https://git.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master4-ldb
Fixed in 19b193ebc974efdebdf347143938b5d053e67051, tdb 1.3.13.