Bug 15464 - libnss_winbind causes memory corruption since samba-4.18, impacts sendmail, zabbix, potentially more
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Winbind
Version: 4.18.0
Hardware: All All
Importance: P5 regression with 15 votes
Target Milestone: ---
Assignee: Jule Anger
QA Contact: Samba QA Contact
URL: https://gitlab.com/samba-team/samba/-...
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-01 05:14 UTC by Krzysztof Olędzki
Modified: 2023-10-16 14:19 UTC
CC List: 6 users

See Also:


Attachments
Minimalist patch for samba-4.18 to work around the bug (834 bytes, patch)
2023-09-05 23:19 UTC, Krzysztof Olędzki
metze: review-
Revert 642a4452ce5b3333c50e41e54bc6ca779686ecc3 and 7545e2c77b69fc57e436e3ed298fdb68033ce49f (5.13 KB, patch)
2023-09-07 05:30 UTC, Krzysztof Olędzki
no flags
b15464-testcase.c (945 bytes, text/plain)
2023-09-07 07:32 UTC, Krzysztof Olędzki
no flags
Potential fix (396 bytes, patch)
2023-09-07 07:35 UTC, Krzysztof Olędzki
no flags
Patches for v4-19-test (16.89 KB, patch)
2023-09-14 21:02 UTC, Stefan Metzmacher
jra: review+
Patches for v4-18-test (16.89 KB, patch)
2023-09-14 21:03 UTC, Stefan Metzmacher
jra: review+

Description Krzysztof Olędzki 2023-09-01 05:14:32 UTC
I'm running samba in the "active directory domain controller" role. I have the following samba-related setup in /etc/nsswitch.conf:

# grep winbind /etc/nsswitch.conf |grep -v ^#
group:      files winbind
passwd:     files winbind

This is a non-pam system.

After upgrading from samba-4.17.10 to samba-4.18 (tested with samba-4.18.5 and samba-4.18.6), I noticed that sendmail (8.17.1.9) started crashing:

sendmail[5498]: segfault at 563b95c2ad84 ip 00007f17e02c686a sp 00007ffcecfff970 error 4 in libc.so.6[7f17e0254000+155000]
Code: cc f9 ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 85 ff 0f 84 bf 00 00 00 55 48 8d 77 f0 53 48 83 ec 18 48 8b 1d 8e a5 13 00 <48> 8b 47 f8 64 8b 2b a8 02 75 5b 48 8b 15 1c a5 13 00 64 48 83 3a

Before the crash, sendmail complains that fd 0 is not open:

SYSERR(root): fill_fd: disconnect: fd 0 not open: Bad file descriptor
  1: fl=0x8001, mode=20666: CHR: dev=0/6, ino=9218, nlink=1, u/gid=0/0, size=0
  2: fl=0x8001, mode=20666: CHR: dev=0/6, ino=9218, nlink=1, u/gid=0/0, size=0
  3: fl=0x2, mode=140777: SOCK localhost->[[UNIX: /dev/log]]

Reverting to samba-4.17.10 fixes the problem, as does removing "winbind" from the "passwd" line.

Strace with samba-4.18:

4405  close(0)                          = 0
4405  openat(AT_FDCWD, "/dev/null", O_RDONLY) = 0
4405  close(0)                          = 0
4405  openat(AT_FDCWD, "/dev/null", O_WRONLY) = 0
4405  dup2(0, 1)                        = 1
4405  dup2(0, 2)                        = 2
4405  close(0)                          = 0
4405  newfstatat(0, "", 0x7ffd3fab3ce0, AT_EMPTY_PATH) = -1 EBADF (Bad file descriptor)


Strace with samba-4.17:
6047  close(0)                          = 0
6047  openat(AT_FDCWD, "/dev/null", O_RDONLY) = 0
6047  close(4)                          = 0
6047  openat(AT_FDCWD, "/dev/null", O_WRONLY) = 4
6047  dup2(4, 1)                        = 1
6047  dup2(4, 2)                        = 2
6047  close(4)                          = 0
6047  newfstatat(0, "", {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x3), ...}, AT_EMPTY_PATH) = 0
6047  newfstatat(1, "", {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x3), ...}, AT_EMPTY_PATH) = 0
6047  newfstatat(2, "", {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x3), ...}, AT_EMPTY_PATH) = 0

As you can see, under 4.18 fd=0 (which sendmail first closes and then re-opens as /dev/null) gets closed soon after it is opened. For samba-4.17, fd=4 gets closed instead.
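
The effect seen in the 4.18 trace follows from open(2) always returning the lowest unused descriptor. The sketch below is not sendmail code; it is a minimal illustration that replays the traced sequence in a forked child, so the caller's own stdio stays untouched. The function name stdin_survives and its stray_fd parameter are hypothetical; stray_fd stands for the descriptor hit by the still unexplained close().

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from sendmail or samba.
 * Replays the traced descriptor sequence in a child process.
 * Returns 1 if fd 0 is still open at the end, 0 if it ended up
 * closed, and -1 on error.
 */
int stdin_survives(int stray_fd)
{
        pid_t pid = fork();
        int status;

        if (pid < 0)
                return -1;

        if (pid == 0) {
                struct stat st;
                int fd;

                close(0);
                /* open() returns the lowest unused descriptor, here 0. */
                if (open("/dev/null", O_RDONLY) != 0)
                        _exit(2);

                /* The unexplained close(): fd 0 under 4.18, fd 4 under 4.17. */
                close(stray_fd);

                /* If fd 0 was just closed, this open() lands on 0 again. */
                fd = open("/dev/null", O_WRONLY);
                if (fd < 0)
                        _exit(2);
                dup2(fd, STDOUT_FILENO);
                dup2(fd, STDERR_FILENO);
                close(fd);      /* closes fd 0 when fd == 0 */

                /* Same check that fails with EBADF in the 4.18 trace. */
                _exit(fstat(0, &st) == 0 ? 0 : 1);
        }

        if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
                return -1;
        if (WEXITSTATUS(status) == 0)
                return 1;
        if (WEXITSTATUS(status) == 1)
                return 0;
        return -1;
}
```

Calling stdin_survives(0) mimics the 4.18 trace and should report that fd 0 is closed at the end, while stdin_survives(4) mimics the 4.17 trace and should leave fd 0 open.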

I believe the code making these syscalls comes from sendmail/main.c, function "disconnect":
 sm_io_reopen(SmFtStdio, SM_TIME_DEFAULT, SM_PATH_DEVNULL, SM_IO_RDONLY, NULL, smioin)
 (...)
 fd = open(SM_PATH_DEVNULL, O_WRONLY, 0666);
 dup2(fd, STDOUT_FILENO);
 dup2(fd, STDERR_FILENO);
 close(fd);

Where "sm_io_reopen" comes from libsm/fopen.c and does:

        if ((ioflags = sm_flags(flags)) == 0)
        {
                (void) sm_io_close(fp, timeout);
                return NULL;
        }
(...)
        (*fp2->f_open)(fp2, info, flags, rpool);

sm_io_close is a wrapper around close(2) and f_open is a similar thin wrapper; I think both come from libsm/stdio.c.

I have not yet identified what calls close(4) (or close(0)).

I will try to identify the offending commit over the weekend. It seems there is only a limited number of changes to check, assuming https://git.samba.org/?p=samba.git;a=history;f=nsswitch;hb=refs/heads/v4-18-stable is the correct place to look.
Comment 1 Krzysztof Olędzki 2023-09-01 05:32:07 UTC
Replacing /usr/lib64/libnss_winbind.so.2 with the version from samba-4.17.10 also fixes the problem, as expected.
Comment 2 Krzysztof Olędzki 2023-09-02 21:03:46 UTC
Reverting "nsswitch: leverage TLS if available in favour over global locking" [1] fixed the problem for me. For that revert to apply cleanly, "nsswitch: avoid calling pthread_getspecific() on an uninitialized key" [2] has to be reverted as well.

No more crash and no more warning about "fd 0 not open: Bad file descriptor".

Tested on both x86 and x86-64.


[1] https://git.samba.org/?p=samba.git;a=commitdiff;h=642a4452ce5b3333c50e41e54bc6ca779686ecc3

[2] https://git.samba.org/?p=samba.git;a=commitdiff;h=7545e2c77b69fc57e436e3ed298fdb68033ce49f
Comment 3 Krzysztof Olędzki 2023-09-03 02:27:44 UTC
I also found the following line in the strace output:
writev(2, [{iov_base="free(): invalid next size (fast)", iov_len=32}, {iov_base="\n", iov_len=1}], 2) = 33


This suggests memory corruption. Indeed, running sendmail under valgrind with the original winbind from samba-4.18.6 produces:

==2054== Invalid read of size 1
==2054==    at 0x4ADB638: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f310 is 32 bytes before a block of size 480 in arena "client"
==2054==
==2054== Invalid write of size 8
==2054==    at 0x4ADB657: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f210 is 32 bytes inside a block of size 256 free'd
==2054==    at 0x484308B: free (vg_replace_malloc.c:974)
==2054==    by 0x50C6F86: initgroups (initgroups.c:212)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Block was alloc'd at
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x50C6F21: initgroups (initgroups.c:200)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==
==2054== Invalid write of size 8
==2054==    at 0x4ADB66B: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f290 is 160 bytes inside a block of size 256 free'd
==2054==    at 0x484308B: free (vg_replace_malloc.c:974)
==2054==    by 0x50C6F86: initgroups (initgroups.c:212)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Block was alloc'd at
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x50C6F21: initgroups (initgroups.c:200)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==
==2054== Invalid write of size 4
==2054==    at 0x4ADB677: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f310 is 32 bytes before a block of size 480 in arena "client"
==2054==
==2054== Invalid write of size 4
==2054==    at 0x4ADB682: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f150 is 16 bytes before a block of size 69 alloc'd
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x4003911: malloc (rtld-malloc.h:56)
==2054==    by 0x4003911: _dl_exception_create_format (dl-exception.c:157)
==2054==    by 0x400A4C7: _dl_lookup_symbol_x (dl-lookup.c:793)
==2054==    by 0x5145E5C: do_sym (dl-sym.c:146)
==2054==    by 0x507AF03: dlsym_doit (dlsym.c:40)
==2054==    by 0x4001488: _dl_catch_exception (dl-catch.c:237)
==2054==    by 0x40015AE: _dl_catch_error (dl-catch.c:256)
==2054==    by 0x507A906: _dlerror_run (dlerror.c:138)
==2054==    by 0x507AF9B: dlsym_implementation (dlsym.c:54)
==2054==    by 0x507AF9B: dlsym@@GLIBC_2.34 (dlsym.c:68)
==2054==    by 0x55E1DF6: winbind_open_pipe_sock (in /usr/lib64/libnss_winbind.so.2)
==2054==    by 0x55E1FD3: winbind_write_sock (in /usr/lib64/libnss_winbind.so.2)
==2054==    by 0x55E21C3: winbindd_send_request.part.0 (in /usr/lib64/libnss_winbind.so.2)
==2054==
==2054== Invalid write of size 4
==2054==    at 0x4ADB69E: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f3d0 is 160 bytes inside a block of size 472 free'd
==2054==    at 0x484308B: free (vg_replace_malloc.c:974)
==2054==    by 0x5069619: _IO_deallocate_file (libioP.h:863)
==2054==    by 0x5069619: fclose@@GLIBC_2.2.5 (iofclose.c:74)
==2054==    by 0x5130E11: _nss_files_initgroups_dyn (files-initgroups.c:126)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==  Block was alloc'd at
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x5069F9A: __fopen_internal (iofopen.c:65)
==2054==    by 0x5129BFC: __nss_files_fopen (nss_files_fopen.c:27)
==2054==    by 0x5130BF2: _nss_files_initgroups_dyn (files-initgroups.c:36)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==
==2054== Invalid read of size 8
==2054==    at 0x4ADB6A9: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f350 is 32 bytes inside a block of size 472 free'd
==2054==    at 0x484308B: free (vg_replace_malloc.c:974)
==2054==    by 0x5069619: _IO_deallocate_file (libioP.h:863)
==2054==    by 0x5069619: fclose@@GLIBC_2.2.5 (iofclose.c:74)
==2054==    by 0x5130E11: _nss_files_initgroups_dyn (files-initgroups.c:126)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==  Block was alloc'd at
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x5069F9A: __fopen_internal (iofopen.c:65)
==2054==    by 0x5129BFC: __nss_files_fopen (nss_files_fopen.c:27)
==2054==    by 0x5130BF2: _nss_files_initgroups_dyn (files-initgroups.c:36)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==
==2054== Invalid read of size 8
==2054==    at 0x4ADB6B6: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f410 is 224 bytes inside a block of size 472 free'd
==2054==    at 0x484308B: free (vg_replace_malloc.c:974)
==2054==    by 0x5069619: _IO_deallocate_file (libioP.h:863)
==2054==    by 0x5069619: fclose@@GLIBC_2.2.5 (iofclose.c:74)
==2054==    by 0x5130E11: _nss_files_initgroups_dyn (files-initgroups.c:126)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==  Block was alloc'd at
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x5069F9A: __fopen_internal (iofopen.c:65)
==2054==    by 0x5129BFC: __nss_files_fopen (nss_files_fopen.c:27)
==2054==    by 0x5130BF2: _nss_files_initgroups_dyn (files-initgroups.c:36)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==
==2054== Invalid write of size 8
==2054==    at 0x4ADB6C6: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f350 is 32 bytes inside a block of size 472 free'd
==2054==    at 0x484308B: free (vg_replace_malloc.c:974)
==2054==    by 0x5069619: _IO_deallocate_file (libioP.h:863)
==2054==    by 0x5069619: fclose@@GLIBC_2.2.5 (iofclose.c:74)
==2054==    by 0x5130E11: _nss_files_initgroups_dyn (files-initgroups.c:126)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==  Block was alloc'd at
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x5069F9A: __fopen_internal (iofopen.c:65)
==2054==    by 0x5129BFC: __nss_files_fopen (nss_files_fopen.c:27)
==2054==    by 0x5130BF2: _nss_files_initgroups_dyn (files-initgroups.c:36)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==
==2054== Invalid write of size 8
==2054==    at 0x4ADB6D7: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f410 is 224 bytes inside a block of size 472 free'd
==2054==    at 0x484308B: free (vg_replace_malloc.c:974)
==2054==    by 0x5069619: _IO_deallocate_file (libioP.h:863)
==2054==    by 0x5069619: fclose@@GLIBC_2.2.5 (iofclose.c:74)
==2054==    by 0x5130E11: _nss_files_initgroups_dyn (files-initgroups.c:126)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==  Block was alloc'd at
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x5069F9A: __fopen_internal (iofopen.c:65)
==2054==    by 0x5129BFC: __nss_files_fopen (nss_files_fopen.c:27)
==2054==    by 0x5130BF2: _nss_files_initgroups_dyn (files-initgroups.c:36)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==
==2054== Invalid write of size 8
==2054==    at 0x4ADB692: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f1a0 is 64 bytes inside a block of size 69 alloc'd
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x4003911: malloc (rtld-malloc.h:56)
==2054==    by 0x4003911: _dl_exception_create_format (dl-exception.c:157)
==2054==    by 0x400A4C7: _dl_lookup_symbol_x (dl-lookup.c:793)
==2054==    by 0x5145E5C: do_sym (dl-sym.c:146)
==2054==    by 0x507AF03: dlsym_doit (dlsym.c:40)
==2054==    by 0x4001488: _dl_catch_exception (dl-catch.c:237)
==2054==    by 0x40015AE: _dl_catch_error (dl-catch.c:256)
==2054==    by 0x507A906: _dlerror_run (dlerror.c:138)
==2054==    by 0x507AF9B: dlsym_implementation (dlsym.c:54)
==2054==    by 0x507AF9B: dlsym@@GLIBC_2.34 (dlsym.c:68)
==2054==    by 0x55E1DF6: winbind_open_pipe_sock (in /usr/lib64/libnss_winbind.so.2)
==2054==    by 0x55E1FD3: winbind_write_sock (in /usr/lib64/libnss_winbind.so.2)
==2054==    by 0x55E21C3: winbindd_send_request.part.0 (in /usr/lib64/libnss_winbind.so.2)
==2054==
==2054== Invalid write of size 4
==2054==    at 0x4ADB68A: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x559f11c is 0 bytes after a block of size 12 alloc'd
==2054==    at 0x48407C4: malloc (vg_replace_malloc.c:431)
==2054==    by 0x55E1599: get_wb_thread_ctx (in /usr/lib64/libnss_winbind.so.2)
==2054==    by 0x55E1CDC: winbindd_request_response (in /usr/lib64/libnss_winbind.so.2)
==2054==    by 0x55E3A5F: _nss_winbind_initgroups_dyn (in /usr/lib64/libnss_winbind.so.2)
==2054==    by 0x50C6BEF: internal_getgrouplist (initgroups.c:101)
==2054==    by 0x50C6F4A: initgroups (initgroups.c:205)
==2054==    by 0x17437C: include (in /usr/sbin/sendmail)
==2054==    by 0x11DEC2: forward (in /usr/sbin/sendmail)
==2054==    by 0x176586: recipient (in /usr/sbin/sendmail)
==2054==    by 0x164B31: readqf (in /usr/sbin/sendmail)
==2054==    by 0x16784C: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==
==2054== Invalid free() / delete / delete[] / realloc()
==2054==    at 0x484308B: free (vg_replace_malloc.c:974)
==2054==    by 0x4ADB6B5: err_delete_thread_state (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B258C3: init_thread_stop.part.0 (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B25B70: OPENSSL_thread_stop (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x4B24FBC: OPENSSL_cleanup (in /usr/lib64/libcrypto.so.3)
==2054==    by 0x5034C93: __run_exit_handlers (exit.c:111)
==2054==    by 0x5034DC9: exit (exit.c:141)
==2054==    by 0x119F5B: finis (in /usr/sbin/sendmail)
==2054==    by 0x167AA4: doworklist (in /usr/sbin/sendmail)
==2054==    by 0x17E538: smtp_data (in /usr/sbin/sendmail)
==2054==    by 0x182206: smtp (in /usr/sbin/sendmail)
==2054==    by 0x11742F: main (in /usr/sbin/sendmail)
==2054==  Address 0x51ca6a0 is 0 bytes inside data symbol "_IO_2_1_stderr_"
Comment 4 Krzysztof Olędzki 2023-09-04 05:06:10 UTC
Another observation - adding:

@@ -28,6 +28,9 @@
 #include "winbind_client.h"
 #include <assert.h>

+#undef HAVE_PTHREAD_H
+#undef HAVE_PTHREAD
+
 #ifdef HAVE_PTHREAD_H
 #include <pthread.h>
 #endif

to wb_common.c also seems to fix the issue.

Note however that the current code is broken and does not compile in this situation - function "get_wb_thread_ctx" is not inside the "#ifdef HAVE_PTHREAD" block. Moving "#endif" fixes the problem.

Another thing I notice is the inconsistency between "#ifdef HAVE_PTHREAD_H" and "#ifdef HAVE_PTHREAD". In particular, "HAVE_PTHREAD_H" is used inside the winbind_destructor function, where I think "HAVE_PTHREAD" should be used instead. Once that is changed, adding only "#undef HAVE_PTHREAD" is enough.
Comment 5 Kacper 2023-09-05 11:21:57 UTC
sendmail isn't the only application that crashes because of "nsswitch: leverage TLS if available in favour over global locking".

I did a git bisect on samba and ended up at the above commit, but I haven't been able to pinpoint exactly why the changes it introduces make zabbix crash.

See https://support.zabbix.com/browse/ZBX-22658
Comment 6 Krzysztof Olędzki 2023-09-05 23:19:31 UTC
Created attachment 18079 [details]
Minimalist patch for samba-4.18 to work around the bug

Minimalist patch for samba-4.18 to *work around* the bug by adding "#undef HAVE_PTHREAD".

For this to work, it also fixes the two other code issues mentioned in https://bugzilla.samba.org/show_bug.cgi?id=15464#c4: it moves "#endif" down to also cover the get_wb_thread_ctx function and replaces HAVE_PTHREAD_H with HAVE_PTHREAD inside the winbind_destructor function.
Comment 7 Stefan Metzmacher 2023-09-06 06:51:17 UTC
Comment on attachment 18079 [details]
Minimalist patch for samba-4.18 to work around the bug

This will introduce a thread locking problem.

642a4452ce5b3333c50e41e54bc6ca779686ecc3 needs to be reverted completely in
order to work around the problem
Comment 8 Stefan Metzmacher 2023-09-06 06:55:09 UTC
(In reply to Kacper from comment #5)

In both cases openssl is involved and that generates the first
invalid writes, so I guess the base of the problem is located there,
it's just triggered by nss_winbind bringing in pthread.

The glibc version may also be relevant
Comment 9 Stefan Metzmacher 2023-09-06 08:16:12 UTC
(In reply to Stefan Metzmacher from comment #8)

pthread_key_create() can return 0 as a valid key.

And this in openssl crypto/err/err.c

static void err_delete_thread_state(void *unused)
{
    ERR_STATE *state = CRYPTO_THREAD_get_local(&err_thread_local);
    if (state == NULL)
        return;
        
    CRYPTO_THREAD_set_local(&err_thread_local, NULL);
    OSSL_ERR_STATE_free(state);
}

doesn't check if set_err_thread_local is valid.

It means ERR_STATE *state can be non-NULL coming from
somewhere else.
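
To illustrate the point about key 0: pthread_key_t has no reserved "invalid" value, so whether a key was actually created has to be tracked next to it. A minimal sketch of that pattern follows. The tls_slot names are hypothetical and are not taken from OpenSSL or Samba.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper: a key plus an explicit "was it created" flag. */
struct tls_slot {
	pthread_key_t key;
	bool valid;	/* needed because 0 is a legal key value */
};

static int tls_slot_init(struct tls_slot *s)
{
	int rc;

	assert(s != NULL);
	rc = pthread_key_create(&s->key, NULL);
	s->valid = (rc == 0);
	return rc;
}

static int tls_slot_set(struct tls_slot *s, void *value)
{
	if (!s->valid) {
		return EINVAL;	/* never touch a key we do not own */
	}
	return pthread_setspecific(s->key, value);
}

static void *tls_slot_get(const struct tls_slot *s)
{
	if (!s->valid) {
		return NULL;	/* do not read someone else's slot */
	}
	return pthread_getspecific(s->key);
}

static void tls_slot_destroy(struct tls_slot *s)
{
	if (s->valid) {
		pthread_key_delete(s->key);
		s->valid = false;
	}
}
```

With a check like this in the getter, a zero-initialized slot reads back as NULL instead of returning whatever another component stored under key 0.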
Comment 10 Krzysztof Olędzki 2023-09-07 05:30:30 UTC
Created attachment 18080 [details]
Revert 642a4452ce5b3333c50e41e54bc6ca779686ecc3 and 7545e2c77b69fc57e436e3ed298fdb68033ce49f
Comment 11 Krzysztof Olędzki 2023-09-07 05:57:29 UTC
(In reply to Stefan Metzmacher from comment #9)

I had a little bit of time to look at the problem today and I think I have made some progress in debugging.

First, I checked that err_thread_local in the openssl code is normally a non-zero value, like 6 or 7. However, with the NSS library from samba-4.18, it becomes 0, as this is what pthread_key_create in CRYPTO_THREAD_init_local provides.

Next, I discovered that removing this call from wb_thread_ctx_initialize():
        ret = pthread_atfork(NULL,
                             NULL,
                             wb_atfork_child);
or removing pthread_key_delete(wb_global_ctx.key) from wb_atfork_child() fixes the problem.

Knowing this is fork related, I was able to write a simple program to reproduce *the behavior* (not *the bug*):

--- cut here ---
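#include <sys/types.h>

#include <grp.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        int rv;
        pid_t pid;

        pthread_key_t key1a, key1b;
        pthread_key_t key2;
        pthread_key_t key3;

        printf("Starting.\n");

        /* initgroups() goes through NSS, so libnss_winbind gets loaded here. */
        rv = initgroups("root", 0);
        printf("initgroups: %d\n", rv);

        pthread_key_create(&key1a, NULL);
        pthread_key_create(&key1b, NULL);
        /* pthread_key_t is an unsigned int on glibc; cast it for printf. */
        printf("key1a=%u, key1b=%u\n", (unsigned)key1a, (unsigned)key1b);

        /* Flush so the child does not replay buffered output. */
        fflush(stdout);
        pid = fork();

        /* Everything below runs in both the parent and the child. */
        pthread_key_create(&key2, NULL);
        pthread_key_create(&key3, NULL);
        printf("Hello after fork (%s), pid=%d, key2=%u, key3=%u\n",
               pid ? "parent" : "child", (int)pid,
               (unsigned)key2, (unsigned)key3);

        return 0;
}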
--- cut here ---

**** With libnss_winbind.so.2 from samba-4.17 I get:

Starting.
initgroups: 0
key1a=0, key1b=1
Hello after fork (parent), pid=5058, key2=2, key3=3
Hello after fork (child), pid=0, key2=2, key3=3

**** With libnss_winbind.so.2 from samba-4.18 I get:
Starting.
initgroups: 0
key1a=1, key1b=2
Hello after fork (parent), pid=5117, key2=3, key3=4
Hello after fork (child), pid=0, key2=0, key3=3

So, with the nss from samba-4.17 we get 0+1 and 2+3 (parent) / 2+3 (child) allocated.
With the nss from samba-4.18 we get 1+2 (where 0 is allocated in the nss library) and then 3+4 (parent) and 0+3 (child) allocated, 0+3 as 0 has been released.

Unfortunately I may not have additional time to look at this more today, so sharing what I have learned so far.
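
The key-number reuse seen in the child can be walked through without fork() at all. The sketch below is only an illustration: lib_key and both function names are hypothetical, and the handler is called by hand where a real process would fork. glibc appears to hand out the lowest free slot, so demo_stale_key_delete() is expected to return 1 there. Other libcs may not reuse the slot and would return 0.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the key the NSS library keeps in a global. */
static pthread_key_t lib_key;

/* Mimics an atfork child handler that deletes lib_key on every call. */
static int lib_child_handler_unguarded(void)
{
	return pthread_key_delete(lib_key);
}

/*
 * Returns 1 if the application's key landed in the slot the library freed
 * and a second handler run then deleted it, 0 if the slot was not reused,
 * and -1 on an unexpected error.
 */
static int demo_stale_key_delete(void)
{
	pthread_key_t app_key;

	if (pthread_key_create(&lib_key, NULL) != 0) {
		return -1;
	}
	/* First "fork": the library frees its slot but keeps the number. */
	if (lib_child_handler_unguarded() != 0) {
		return -1;
	}
	/* The application (think OpenSSL) now asks for a key of its own. */
	if (pthread_key_create(&app_key, NULL) != 0) {
		return -1;
	}
	/* A freshly created key starts out as NULL in every thread. */
	assert(pthread_getspecific(app_key) == NULL);

	/* Comparing keys with != assumes an integer pthread_key_t. */
	if (app_key != lib_key) {
		/* No reuse here; deleting lib_key again would be undefined. */
		pthread_key_delete(app_key);
		return 0;
	}
	/* Second "fork": the stale number now names the application's key. */
	return lib_child_handler_unguarded() == 0 ? 1 : -1;
}
```

After the second handler run the application still believes app_key is valid, which is how unrelated code ends up reading and writing a slot it no longer owns.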
Comment 12 Krzysztof Olędzki 2023-09-07 06:51:07 UTC
(In reply to Krzysztof Olędzki from comment #11)

Alright, here is a potential fix:

--- 1/nsswitch/wb_common.c	2023-09-02 13:38:34.506064173 -0700
+++ 2/nsswitch/wb_common.c	2023-09-06 23:43:49.393985656 -0700
@@ -66,6 +66,12 @@
 	struct winbindd_context *ctx = NULL;
 	int ret;
 
+	if (!wb_global_ctx.initialized) {
+		return;
+	}
+
+	wb_global_ctx.initialized = false;
+
 	ctx = (struct winbindd_context *)pthread_getspecific(wb_global_ctx.key);
 	if (ctx == NULL) {
 		return;


Without this, every time wb_atfork_child() is called, it calls pthread_key_delete with the original wb_global_ctx.key, even if that key has already been deleted and the same key number has since been reused elsewhere.
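
The effect of the guard can be shown in isolation. This is a reduced sketch, not the Samba code: lib_global is a hypothetical miniature of wb_global_ctx with only the two fields that matter here, and the handler is called directly instead of via fork().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of wb_global_ctx. */
static struct {
	pthread_key_t key;
	bool initialized;
} lib_global;

static int lib_init(void)
{
	int rc;

	assert(!lib_global.initialized);
	rc = pthread_key_create(&lib_global.key, NULL);
	lib_global.initialized = (rc == 0);
	return rc;
}

/* Returns how many keys this call deleted (0 or 1), or -1 on error. */
static int lib_child_handler_guarded(void)
{
	if (!lib_global.initialized) {
		/* Key already gone: the number may belong to someone else. */
		return 0;
	}
	lib_global.initialized = false;

	return pthread_key_delete(lib_global.key) == 0 ? 1 : -1;
}
```

Calling lib_child_handler_guarded() a second time is a no-op, so a key that another component created in the meantime stays usable even if it received the same number.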
Comment 13 Krzysztof Olędzki 2023-09-07 07:32:03 UTC
Created attachment 18081 [details]
b15464-testcase.c
Comment 14 Krzysztof Olędzki 2023-09-07 07:35:55 UTC
Created attachment 18082 [details]
Potential fix
Comment 15 Krzysztof Olędzki 2023-09-07 07:39:57 UTC
(In reply to Krzysztof Olędzki from comment #13)

Buggy:

# ./b15464-testcase
18303: k1=1
18304: Hello after fork, k1=1, k2=0
18305: Hello after fork2, k1=1, k2=0, k3=0
FAIL


Fixed (or samba-4.17):

# ./b15464-testcase
18310: k1=0
18311: Hello after fork, k1=0, k2=1
18312: Hello after fork2, k1=0, k2=1, k3=2
OK
Comment 16 Krzysztof Olędzki 2023-09-07 07:55:50 UTC
(In reply to Krzysztof Olędzki from comment #15)

Correct output for the "samba-4.18-fixed" case - the one above listed as "fixed" is from samba-4.17 - w/o Thread Local Storage (TLS):

# ./b15464-testcase
18509: k1=1
18510: Hello after fork, k1=1, k2=0
18511: Hello after fork2, k1=1, k2=0, k3=2
OK
Comment 17 Stefan Metzmacher 2023-09-07 10:58:55 UTC
(In reply to Krzysztof Olędzki from comment #12)

Great detective work, thanks!

Do you want to provide the fix as git format-patch output?

If so please also see:
https://wiki.samba.org/index.php/Contribute
(skip step 2.2.3 Fork the Samba repo (just until we get to know you))
and jump to
https://wiki.samba.org/index.php/Samba_on_GitLab#Other_Samba_developers

Note that there might be an additional bug regarding leaking of
winbindd_context structures and their related socket file descriptors.
But that's a more complex task and should not hold us back from
pushing the fix for the invalid writes into unrelated code.
Comment 18 Stefan Metzmacher 2023-09-07 14:50:45 UTC
Ok, I tried a more complete fix that also tries to avoid fd and memory leaks.

Only compile tested, see https://gitlab.com/samba-team/samba/-/merge_requests/3259

It would be great if someone could test this in a real setup.
It should also fix the problem in this bug, and I'd expect to
see output like this:

# ./b15464-testcase
18509: k1=1
18510: Hello after fork, k1=1, k2=2
18511: Hello after fork2, k1=1, k2=2, k3=3
OK
Comment 19 Krzysztof Olędzki 2023-09-07 21:54:18 UTC
(In reply to Stefan Metzmacher from comment #18)

It seems like a much larger change, so I have not been able to look at the code yet, just compiled and tested. Sadly, it triggers an assertion:

b15464-testcase: ../../nsswitch/wb_common.c:99: wb_atfork_child: Assertion `ctx_ptr == NULL' failed.

For now, do you still want me to provide the git-patch? I wonder if it makes sense to fix the bug in a simple manner for the 4.18 and 4.19 branches, and then work on a more comprehensive change for 4.20, so we have more time for testing?

Two more comments:
 1. for the "fix build without HAVE_PTHREAD", should we also rename HAVE_PTHREAD_H to HAVE_PTHREAD inside winbind_destructor in that same patch? Happy to take care of it if you want, BTW.

 2. I'm a little bit concerned about "leaking" one "atfork handler" from a process that accessed libnss_winbind. These handlers seem to be stored in a linked list and require a malloc. There seems to be __unregister_atfork but I'm not sure when this gets triggered.
Comment 20 Stefan Metzmacher 2023-09-08 08:58:55 UTC
(In reply to Krzysztof Olędzki from comment #19)

I pushed a fixed version.

The minimal fix for the problem is this:
https://gitlab.com/samba-team/samba/-/merge_requests/3259/diffs?commit_id=8162c1b0cccc29d6b76567e3e2c41f985ada0cbe

It just avoids calling pthread_key_delete() in wb_atfork_child().
And if wb_atfork_child was registered we also know that
pthread_key_create() was called with success.
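
The idea can be sketched as follows. This is an illustration under assumptions, not the code from the merge request: lib_ctx, lib_key and the function names are hypothetical, and the real handler also has to deal with the inherited winbindd socket. The child handler only drops the inherited per-thread state and leaves the key registered, so a key created later in the child cannot receive the same number.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct lib_ctx { int fd; };	/* hypothetical per-thread state */

static pthread_key_t lib_key;

static void lib_atfork_child(void)
{
	struct lib_ctx *ctx = pthread_getspecific(lib_key);
	int rc;

	if (ctx == NULL) {
		return;
	}
	/* Drop the inherited state, but keep lib_key registered. */
	rc = pthread_setspecific(lib_key, NULL);
	assert(rc == 0);
	(void)rc;
	free(ctx);
}

static int lib_init(void)
{
	struct lib_ctx *ctx = NULL;
	int rc = pthread_key_create(&lib_key, NULL);

	if (rc != 0) {
		return rc;
	}
	rc = pthread_atfork(NULL, NULL, lib_atfork_child);
	if (rc != 0) {
		return rc;
	}
	ctx = calloc(1, sizeof(*ctx));
	if (ctx == NULL) {
		return ENOMEM;
	}
	ctx->fd = -1;
	return pthread_setspecific(lib_key, ctx);
}

/*
 * Forks, creates a fresh key in the child and reports back:
 * 0 = no collision with lib_key, 1 = collision, -1 = error.
 * Comparing keys with == assumes an integer pthread_key_t.
 */
static int child_key_collides(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0) {
		return -1;
	}
	if (pid == 0) {
		pthread_key_t app_key;

		if (pthread_getspecific(lib_key) != NULL ||
		    pthread_key_create(&app_key, NULL) != 0) {
			_exit(2);
		}
		_exit(app_key == lib_key ? 1 : 0);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
		return -1;
	}
	return WEXITSTATUS(status) <= 1 ? WEXITSTATUS(status) : -1;
}
```

Calling lib_init() once and then child_key_collides() repeatedly should return 0 every time, because the library's slot is never released in the child.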
Comment 21 Stefan Metzmacher 2023-09-08 12:06:29 UTC
(In reply to Krzysztof Olędzki from comment #19)

A dlclose() deinstalls the atfork handlers...
Comment 22 Kacper 2023-09-08 16:57:31 UTC
(In reply to Stefan Metzmacher from comment #20)

The patches in https://gitlab.com/samba-team/samba/-/merge_requests/3259 work without issues as far as Zabbix is concerned.

Any chance we can get the final fixes in for 4.18.7?
Comment 23 Krzysztof Olędzki 2023-09-08 17:48:11 UTC
(In reply to Stefan Metzmacher from comment #20)

Thanks!

Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>

Also, if it matters:
Reported-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Comment 24 Stefan Metzmacher 2023-09-11 15:36:17 UTC
(In reply to Kacper from comment #22)

Thanks!

I added a regression test based on your reproducer, see:
https://gitlab.com/samba-team/samba/-/merge_requests/3259/diffs?commit_id=30568253df96514d04931a7adbd8a3ab5aaa17ac

I hope someone will review the changes in order to get it
into 4.18.7...
Comment 25 Krzysztof Olędzki 2023-09-12 00:13:39 UTC
(In reply to Stefan Metzmacher from comment #24)

Thanks!

If we want a more general regression test (not just the PoC for the BUG) then in addition to:
 if (k1 == k2 || k2 == k3)
we probably also want to cover k1 == k3:
 if (k1 == k2 || k2 == k3 || k1 == k3)
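
If the test ever grows beyond three keys, the same check could be written as a loop. A small sketch with a hypothetical helper name, assuming pthread_key_t is an integer type that can be compared with == (as the existing test already does):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true only if no two keys in the array share a value. */
static bool keys_all_distinct(const pthread_key_t *keys, size_t n)
{
	assert(keys != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		for (size_t j = i + 1; j < n; j++) {
			if (keys[i] == keys[j]) {
				return false;
			}
		}
	}
	return true;
}
```

For the outputs quoted earlier in this bug, k1=1, k2=0, k3=0 would fail this check and k1=1, k2=0, k3=2 would pass.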
Comment 26 Samba QA Contact 2023-09-14 18:54:04 UTC
This bug was referenced in samba master:

62af25d44e542548d8cdecb061a6001e0071ee76
4faf806412c4408db25448b1f67c09359ec2f81f
836823e5047d0eb18e66707386ba03b812adfaf8
91b30a7261e6455d3a4f31728c23e4849e3945b9
4af3faace481d23869b64485b791bdd43d8972c5
Comment 27 Stefan Metzmacher 2023-09-14 21:02:40 UTC
Created attachment 18103 [details]
Patches for v4-19-test
Comment 28 Stefan Metzmacher 2023-09-14 21:03:35 UTC
Created attachment 18104 [details]
Patches for v4-18-test
Comment 29 Jeremy Allison 2023-09-15 16:54:48 UTC
Re-assigned to Jule for inclusion in 4.18.next, 4.19.next.
Comment 30 Jule Anger 2023-09-18 15:59:24 UTC
Pushed to autobuild-v4-{19,18}-test.
Comment 31 Samba QA Contact 2023-09-18 16:56:04 UTC
This bug was referenced in samba v4-19-test:

340b7fd1eec58ccbbfbcf706829b3a8593700cab
61f6f46b26b5207fb411c2d4d4734c3fed0f88a7
9c10f828dfbf44ec09a2ddf9d98bc5248bf5cf22
7d04c32ed7eaacfa7e233a7cc141344041c20fc5
374ba0d2c9a32ade701d7cdd25034692fe055108
Comment 32 Samba QA Contact 2023-09-18 17:26:03 UTC
This bug was referenced in samba v4-18-test:

cb71db6827f2575799d65c8a3560e1748a389889
0ebaac2afe94cf09599970962c66a7cc2761625c
5b9b8b315821c429ecfcb9153aa5308e3c9f5086
3d8e8ed15942374939c95384b5cd03b0162000ad
82d6f8a6ce3918b51a9422101823328084a27ffa
Comment 33 Jule Anger 2023-09-18 20:48:38 UTC
Closing out bug report.

Thanks!
Comment 34 Samba QA Contact 2023-09-27 08:16:43 UTC
This bug was referenced in samba v4-18-stable (Release samba-4.18.7):

cb71db6827f2575799d65c8a3560e1748a389889
0ebaac2afe94cf09599970962c66a7cc2761625c
5b9b8b315821c429ecfcb9153aa5308e3c9f5086
3d8e8ed15942374939c95384b5cd03b0162000ad
82d6f8a6ce3918b51a9422101823328084a27ffa
Comment 35 Samba QA Contact 2023-10-16 14:19:44 UTC
This bug was referenced in samba v4-19-stable (Release samba-4.19.2):

340b7fd1eec58ccbbfbcf706829b3a8593700cab
61f6f46b26b5207fb411c2d4d4734c3fed0f88a7
9c10f828dfbf44ec09a2ddf9d98bc5248bf5cf22
7d04c32ed7eaacfa7e233a7cc141344041c20fc5
374ba0d2c9a32ade701d7cdd25034692fe055108