Bug 15958 - pthreadpool_tevent has race conditions accessing both pthreadpool_tevent.jobs list and pthreadpool_tevent.glue_list
Summary: pthreadpool_tevent has race conditions accessing both pthreadpool_tevent.jobs list and pthreadpool_tevent.glue_list
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Other
Version: unspecified
Hardware: All
OS: All
Importance: P5 normal
Target Milestone: ---
Assignee: Noel Power
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-11-20 16:03 UTC by Noel Power
Modified: 2025-11-20 16:40 UTC (History)
0 users

See Also:


Attachments

Description Noel Power 2025-11-20 16:03:50 UTC
Concurrent threads executing jobs created by pthreadpool_tevent_job_send and processed by pthreadpool are subject to race conditions.

A customer core dump triggered investigation into this area: the core showed a corrupted glue_list at the time of the abort, which had been overwritten by valid data by the time the core file was generated. At the time of the core, two pthreadpool_tevent jobs were being processed: vfs_pwrite_do and vfs_fsync_do.


    thread 1 (coring thread, after completing vfs_pwrite_do and calling
    pthreadpool_tevent_job_signal to indicate job event completion):
    
    [...]
    
     369 #ifdef HAVE_PTHREAD
     370         for (g = state->pool->glue_list; g != NULL; g = g->next) {
     371                 if (g->ev == state->ev) {
     372                         tctx = g->tctx;
     373                         break;
     374                 }
     375         }
     376
     377         if (tctx == NULL) {
    *378                 abort();
     379         }
     380 #endif
    
     #8  0x00007f4ba8a53cdb in raise () from /lib64/libc.so.6
     #9  0x00007f4ba8a55395 in abort () from /lib64/libc.so.6
     #10 0x00007f4ba96dbc44 in pthreadpool_tevent_job_signal (jobid=<optimized out>, job_fn=<optimized out>, job_private_data=<optimized out>, private_data=<optimized out>) at ../../lib/pthreadpool/pthreadpool_tevent.c:378
     #11 0x00007f4ba96dabaf in pthreadpool_server (arg=0x55b2c234ffe0) at ../../lib/pthreadpool/pthreadpool.c:657
     #12 0x00007f4ba97bf6da in start_thread () from /lib64/libpthread.so.0
     #13 0x00007f4ba8b2153f in clone () from /lib64/libc.so.6
    
    Note: the stack tells us we aborted at line 378 above, meaning tctx is
    NULL; it shouldn't be (though obviously it was at the time of the crash).
    
    If we examine the glue_list and state->ev (we have to go up a frame to
    find these, because the variables are optimized out at this point in
    the backtrace):
    
    (gdb) up
    
    // Get the state->ev from the struct pthreadpool_tevent_job_state passed
    // to pool->signal_fn
    (gdb) print ((struct pthreadpool_tevent_job_state *)job.private_data)->ev
    $3 = (struct tevent_context *) 0x55b2c233d2e0
    (gdb) print ((struct pthreadpool_tevent_job_state *)job.private_data)->pool
    $4 = (struct pthreadpool_tevent *) 0x55b2c234ffc0
    // Get 'ev' from the first 'g' to iterate (from pool->glue_list)
    (gdb) print ((struct pthreadpool_tevent_job_state *)job.private_data)->pool->glue_list->ev
    $5 = (struct tevent_context *) 0x55b2c2558070
    // it's not a match, iterate to the next
    (gdb) print ((struct pthreadpool_tevent_job_state *)job.private_data)->pool->glue_list->next->ev
    $6 = (struct tevent_context *) 0x55b2c233d2e0
    // bingo, this is a match (but doesn't correlate to the core we see!!!!)
    
    thread 2 (awaiting job to complete in worker thread)
    
    (gdb) where
     #0  0x00007f4ba8b2192f in epoll_wait () from /lib64/libc.so.6
     #1  0x00007f4ba8c0de82 in ?? () from /usr/lib64/libtevent.so.0
     #2  0x00007f4ba8c0c0a7 in ?? () from /usr/lib64/libtevent.so.0
     #3  0x00007f4ba8c0706d in _tevent_loop_once () from /usr/lib64/libtevent.so.0
     #4  0x00007f4ba8c08af3 in tevent_req_poll () from /usr/lib64/libtevent.so.0
     #5  0x00007f4ba9ce3d5b in smb_vfs_fsync_sync (fsp=fsp@entry=0x55b2c23bada0) at ../../source3/smbd/vfs.c:2062
     #6  0x00007f4ba9c7e248 in sync_file (conn=conn@entry=0x55b2c237f8a0, fsp=fsp@entry=0x55b2c23bada0, write_through=write_through@entry=true) at ../../source3/smbd/fileio.c:315
     #7  0x00007f4ba9ca784e in reply_flush (req=0x55b2c2557ee0) at ../../source3/smbd/reply.c:5398
    
    thread 3 (worker thread doing the fsync)
    
    (gdb) where
     #0  0x00007f4ba97ca847 in fsync () from /lib64/libpthread.so.0
     #1  0x00007f4ba9dbe6d4 in vfs_fsync_do (private_data=<optimized out>) at ../../source3/modules/vfs_default.c:1128
     #2  0x00007f4ba96dab97 in pthreadpool_server (arg=0x55b2c234ffe0) at ../../lib/pthreadpool/pthreadpool.c:655
     #3  0x00007f4ba97bf6da in start_thread () from /lib64/libpthread.so.0
     #4  0x00007f4ba8b2153f in clone () from /lib64/libc.so.6
    
    Clearly the glue_list is being concurrently modified and read; the
    only functions that can be involved are
    
        pthreadpool_tevent_register_ev,
        pthreadpool_tevent_job_signal, and additionally
        pthreadpool_tevent_destructor, triggered by deleting the
        registered event context (which for smb_vfs_fsync_sync happens
        after every call).
    
    These functions can run concurrently in different threads
    (both reading and writing), and the glue_list is unprotected
    against such concurrent access.