I've been integrating smbc_notify support into gvfs and doing this requires using multiple smb contexts. This fails in a couple of ways which the attached test program demonstrates. Firstly, simultaneous initialization of contexts fails: $ gdb testnotify (gdb) r Starting program: /home/ross/src/samba/samba/examples/libsmbclient/testnotify [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib/libthread_db.so.1". talloc: access after free error - first free may be at ../source3/param/loadparm.c:282 Bad talloc magic value - access after free [New Thread 0x7fffe925d700 (LWP 16997)] [New Thread 0x7fffe9a5e700 (LWP 16996)] Program received signal SIGABRT, Aborted. [Switching to Thread 0x7fffe925d700 (LWP 16997)] 0x00007ffff7621528 in raise () from /usr/lib/libc.so.6 (gdb) bt #0 0x00007ffff7621528 in raise () from /usr/lib/libc.so.6 #1 0x00007ffff762293a in abort () from /usr/lib/libc.so.6 #2 0x00007ffff436ed7c in ?? () from /usr/lib/libtalloc.so.2 #3 0x00007ffff4374cfa in ?? () from /usr/lib/libtalloc.so.2 #4 0x00007ffff436da7b in _talloc_free () from /usr/lib/libtalloc.so.2 #5 0x00007ffff5f06846 in free_global_parameters () at ../source3/param/loadparm.c:446 #6 0x00007ffff5f06d13 in init_globals (lp_ctx=0x7fffdc000aa0, reinit_globals=true) at ../source3/param/loadparm.c:533 #7 0x00007ffff5f14788 in lp_load_ex (pszFname=0x7fffdc000a10 "/home/ross/.smb/smb.conf", global_only=true, save_defaults=false, add_ipc=false, reinit_globals=true, allow_include_registry=true, load_all_shares=false) at ../source3/param/loadparm.c:3655 #8 0x00007ffff5f14c98 in lp_load (pszFname=0x7fffdc000a10 "/home/ross/.smb/smb.conf", global_only=true, save_defaults=false, add_ipc=false, reinit_globals=true) at ../source3/param/loadparm.c:3793 #9 0x00007ffff5f14d0a in lp_load_global (file_name=0x7fffdc000a10 "/home/ross/.smb/smb.conf") at ../source3/param/loadparm.c:3818 #10 0x00007ffff5f14d5d in lp_load_client (file_name=0x7fffdc000a10 "/home/ross/.smb/smb.conf") at ../source3/param/loadparm.c:3846 #11 0x00007ffff7bb7666 in SMBC_module_init (punused=0x0) at ../source3/libsmb/libsmb_context.c:57 #12 0x00007ffff7bb7946 in smbc_new_context () at ../source3/libsmb/libsmb_context.c:142 #13 0x0000000000400bba in thread_fn (unused=0x0) at testnotify.c:58 #14 0x00007ffff7997354 in start_thread () from /usr/lib/libpthread.so.0 #15 0x00007ffff76d5bfd in clone () from /usr/lib/libc.so.6 It works if I put a 5 second sleep between the creation of the two threads. Secondly, with two threads waiting in smbc_notify, after receiving about 4 events: 1 l2: 3 num_setup=4, max_setup=0, param_total=0, this_param=0, max_param=1000, data_total=0, this_data=0, max_data=0, param_offset=82, param_pad=1, param_disp=0, data_offset=84, data_pad=2, data_disp=0 Freed frame ../libcli/smb/smbXcli_base.c:1710, expected ../libcli/smb/smbXcli_base.c:1710. PANIC (pid 17017): Frame not freed in order. 1 l3: 1 num_setup=4, max_setup=0, param_total=0, this_param=0, max_param=1000, data_total=0, this_data=0, max_data=0, param_offset=82, param_pad=1, param_disp=0, data_offset=84, data_pad=2, data_disp=0 BACKTRACE: 25 stack frames: #0 /home/ross/blead/gg/lib/libsmbconf.so.0(log_stack_trace+0x1f) [0x7ff343346560] #1 /home/ross/blead/gg/lib/libsmbconf.so.0(smb_panic_s3+0x6d) [0x7ff3433463b1] #2 /home/ross/blead/gg/lib/libsamba-util.so.0(smb_panic+0x28) [0x7ff343c842e4] #3 /home/ross/blead/gg/lib/libsamba-util.so.0(+0xef33) [0x7ff343c6ef33] #4 /usr/lib/libtalloc.so.2(_talloc_free+0x718) [0x7ff3417d60a8] #5 /home/ross/blead/gg/lib/private/libcli-smb-common-samba4.so(+0xf179) [0x7ff342943179] #6 /home/ross/blead/gg/lib/private/libtevent.so.0(_tevent_req_notify_callback+0x6a) [0x7ff3435af0ae] #7 /home/ross/blead/gg/lib/private/libtevent.so.0(+0x6183) [0x7ff3435af183] #8 /home/ross/blead/gg/lib/private/libtevent.so.0(_tevent_req_done+0x25) [0x7ff3435af1ab] #9 /home/ross/blead/gg/lib/private/libsmb-transport-samba4.so(+0x2d12) [0x7ff33f15fd12] #10 /home/ross/blead/gg/lib/private/libtevent.so.0(_tevent_req_notify_callback+0x6a) [0x7ff3435af0ae] #11 /home/ross/blead/gg/lib/private/libtevent.so.0(+0x6183) [0x7ff3435af183] #12 /home/ross/blead/gg/lib/private/libtevent.so.0(_tevent_req_done+0x25) [0x7ff3435af1ab] #13 /home/ross/blead/gg/lib/private/libsmb-transport-samba4.so(+0x255b) [0x7ff33f15f55b] #14 /home/ross/blead/gg/lib/private/libtevent.so.0(+0xcf34) [0x7ff3435b5f34] #15 /home/ross/blead/gg/lib/private/libtevent.so.0(+0xd553) [0x7ff3435b6553] #16 /home/ross/blead/gg/lib/private/libtevent.so.0(+0xa43e) [0x7ff3435b343e] #17 /home/ross/blead/gg/lib/private/libtevent.so.0(_tevent_loop_once+0xf4) [0x7ff3435ad5b7] #18 /home/ross/blead/gg/lib/private/libtevent.so.0(tevent_req_poll+0x25) [0x7ff3435af43f] #19 /home/ross/blead/gg/lib/libtevent-util.so.0(tevent_req_poll_ntstatus+0x27) [0x7ff3448542c9] #20 /home/ross/blead/gg/lib/libsmbclient.so.0(+0x127d2) [0x7ff3450277d2] #21 /home/ross/blead/gg/lib/libsmbclient.so.0(+0x12c02) [0x7ff345027c02] #22 ./testnotify() [0x400ccb] #23 /usr/lib/libpthread.so.0(+0x7354) [0x7ff344dff354] #24 /usr/lib/libc.so.6(clone+0x6d) [0x7ff344b3dbfd] smb_panic(): calling panic action [/bin/sleep 999999999] 1 l3: 3 num_setup=4, max_setup=0, param_total=0, this_param=0, max_param=1000, data_total=0, this_data=0, max_data=0, param_offset=82, param_pad=1, param_disp=0, data_offset=84, data_pad=2, data_disp=0 0
Created attachment 11286 [details] tet program
Can you try to use smbc_thread_posix() before anything else?
It doesn't compile when I try to use that function: $ gcc -Wall -O0 -g -o testnotify testnotify.c `pkg-config --cflags --libs smbclient` -pthread /tmp/ccnncbJ7.o: In function `main': /home/ross/src/samba/samba/examples/libsmbclient/testnotify.c:89: undefined reference to `smbc_thread_posix' collect2: error: ld returned 1 exit status It's not exported by the shared library: $ nm -D /usr/lib/libsmbclient.so | grep thread U smb_thread_once Neither libsmb_thread_posix.c nor libsmb_thread_impl.c are mentioned in any build files in samba: $ fgrep -r libsmb_thread_posix Binary file .git/index matches $ fgrep -r libsmb_thread_impl Binary file .git/index matches
Created attachment 11303 [details] Patch for master Can you test this patch? We should provide smbc_thread_posix() now.
Well it builds now but gets a segfault immediately: (gdb) bt full #0 0x00007ffff7bceddd in smb_get_tls_pthread (pkey=0x0, location=0x7ffff681bfa0 "../lib/util/talloc_stack.c:134") at ../source3/libsmb/libsmb_thread_posix.c:37 No locals. #1 0x00007ffff6806044 in talloc_stackframe_internal (location=0x7ffff7bcf0d0 "../source3/libsmb/libsmb_context.c:139", poolsize=0) at ../lib/util/talloc_stack.c:133 tmp = 0x0 top = 0x0 ts = 0x0 #2 0x00007ffff68061dd in _talloc_stackframe (location=0x7ffff7bcf0d0 "../source3/libsmb/libsmb_context.c:139") at ../lib/util/talloc_stack.c:179 No locals. #3 0x00007ffff7bb6db1 in smbc_new_context () at ../source3/libsmb/libsmb_context.c:139 context = 0x7ffff7ffe130 frame = 0x7fffe9a5d700 #4 0x0000000000400c7a in thread_fn (unused=0x0) at testnotify.c:58 path = "smb://192.168.1.48/Public", '\000' <repeats 2022 times> smbc_notify = 0x7fffe9a5cf30 smbc_opendir = 0x7fffe9a5d700 smb_context = 0x7ffff7ff7528 dir = 0x7fffe9a5d700 count = 1000 #5 0x00007ffff7996354 in start_thread () from /usr/lib/libpthread.so.0 No symbol table info available. #6 0x00007ffff76d4bfd in clone () from /usr/lib/libc.so.6 No symbol table info available. It looks like global_ts hasn't been initialized in talloc_stack.c
It looks like there're a couple of problems here: talloc_*() calls SMB_THREAD_GET_TLS(global_ts) and expects it to return NULL if it has not been initialized. However, this ends up calling pthread_getspecific() which segfaults if it hasn't been initialized yet. smbc_new_context calls SMB_THREAD_ONCE to run SMBC_module_init. This would eventually end up calling SMB_THREAD_ONCE to run talloc_stackframe_init. However, a global lock is used for this (once_mutex) and since the locks are not recursive, it deadlocks.
(In reply to Ross Lagerwall from comment #6) I'll take a look at this. It used to work (I tested it :-) but hasn't been used in anger for a long time. Might need some care & attention. Jeremy.
Arg. Got this code to work, but running valgrind --tool=helgrind or valgrind --tool=drd on libsmbclient is a horror show.. I'm going to have to do a lot of work to properly make this thread-safe :-(.
Had a long chat with Volker, something like the BKL is going to be needed for this.
Created attachment 11412 [details] Includes Metze's patch. This gets us further along, and I'm planning to add a global libsmbclient mutex to get us to a safe mt space with libsmbclient. Jeremy.
As I understand, this is a blocker to solving https://bugzilla.gnome.org/show_bug.cgi?id=776339 , which in turn results in inferior throughput during transfers to/from dynamically-mounted smb shares (affects GIO->GVFS and KIO, and therefore nautilus, pcmanfm, dolphin ...). Is a thread safe context implementation still being considered/investigated/in progress?
I have it on the list of Summer of Code projects for Samba. Hoping this gets picked up by an eager student.
Hi, Is this bug fixed or do we have work around to use smbcontext in multithreading environment. I don't see smbc_thread_posix() in any version of samba. What version of samba it was included . I can see this in source code but don't find symbols included.
This bug is not fixed, and there is no easy fix for this right now, sorry.
Would any of the experienced developers kindly provide guidance to potential new contributors, on how to tackle this particular bug? What exactly needs to be thread safe - both on an abstract level, and as concrete parts of the existing code? What are the necessary steps to reproduce the current issue/crash - and, in turn, to validate potential patches? Judging from the proposal regarding a student developer, it appears that new contributors could indeed stand a chance. Yet, it seems like nobody picked this up, to date. Most of the information in this bug is rather dated by now; did the challenges to overcome change in any way? Would something like the BKL (Big Kernel Lock?) still be required, or did more fine-grained ways of locking emerge in the meantime? Just to re-iterate on the scope: AFAIK (which could be wrong), this bug still indirectly affects countless Linux desktop users who work with SMB shares, since it is a blocker to GVFS bug https://gitlab.gnome.org/GNOME/gvfs/-/issues/292. Also, there should be an equivalent bug in KIO (KDE). I'd gladly provide any help I can here. Unfortunately, I'm inexperienced in C - but I could for sure help with testing stuff.
(In reply to voidcastr from comment #15) A BKL (aka global lock) around use of libsmbclient inside gvfs should allow it to be used in a pthread.