Created attachment 8390 [details]
Build configuration.

'smbd' leaks file handles when serving remote file access requests for Windows 'emacs' 23.3.1. In particular "revert" operations, especially with "auto-revert-tail-mode", provoke the issue.

Client system is Windows Server 2008 SP2 x64, NT6.0.

Works ok with RHEL/CentOS 5.8 64-bit.

Exact 'emacs' version:

  GNU Emacs 23.3.1 (i386-mingw-nt6.0.6002) of 2011-03-10 on 3249CTO

Exact kernel is 2.6.9-103.EL i686, libc is glibc-2.3.4-2.57.

Problem produced with Samba 3.6.4 compiled without AIO in 'configure' and with Samba 3.6.10 with

  configure --with-aio-support

GCC 4.7.1.

Have configured

  max protocol = SMB2

Probably compiling Samba with attached config.h will allow reproduction of the problem even with current versions of Linux. Possible that 'glibc' 2.3.4 is a factor.

'strace' illustrating issue attached. The leak appears related to open("log") events where /var/log is opened and scanned periodically. See 'open' calls without matching close.

Most of the /proc/###/fd entries are for /var/log, but a few are for other files:

# ls -o /proc/15056/fd | sort -k8,8n
. . .
lr-x------ 1 root 64 Jan 7 17:48 531 -> /var/log
lr-x------ 1 root 64 Jan 7 17:48 532 -> /var/log
lr-x------ 1 root 64 Jan 7 17:48 533 -> /w
lr-x------ 1 root 64 Jan 7 17:48 534 -> /var/log
lr-x------ 1 root 64 Jan 7 17:48 535 -> /var/log
lr-x------ 1 root 64 Jan 7 17:48 536 -> /w/home
lr-x------ 1 root 64 Jan 7 17:48 537 -> /w/home/awle
lr-x------ 1 root 64 Jan 7 17:48 538 -> /w/home/awle
lr-x------ 1 root 64 Jan 7 17:48 539 -> /w/home/awle
lr-x------ 1 root 64 Jan 7 17:48 540 -> /var/log
lr-x------ 1 root 64 Jan 7 17:48 541 -> /var/log
lr-x------ 1 root 64 Jan 7 17:48 542 -> /var/log
. . .
lr-x------ 1 root 64 Jan 7 17:48 572 -> /var/log
lr-x------ 1 root 64 Jan 7 17:48 573 -> /var/log
lr-x------ 1 root 64 Jan 7 17:48 574 -> /var/log
lr-x------ 1 root 64 Jan 7 17:49 575 -> /var/log
l-wx------ 1 root 64 Jan 7 17:49 576 -> /usr/local/samba/var/clientlog/ciannait.log
lr-x------ 1 root 64 Jan 7 17:49 577 -> /var/log
lr-x------ 1 root 64 Jan 7 17:49 578 -> /var/log
lr-x------ 1 root 64 Jan 7 17:49 579 -> /var/log

Difference between config.h for CentOS 5 (works ok) and CentOS 4 (leaks handles):

< #define HAVE_ATTR_XATTR_H 1
> /* #undef HAVE_ATTR_XATTR_H */
< /* #undef HAVE_BROKEN_POSIX_FALLOCATE */
> #define HAVE_BROKEN_POSIX_FALLOCATE
< #define HAVE_FDOPENDIR 1
> /* #undef HAVE_FDOPENDIR */
< #define HAVE_INOTIFY 1
> /* #undef HAVE_INOTIFY */
< #define HAVE_INOTIFY_INIT 1
> /* #undef HAVE_INOTIFY_INIT */
< #define HAVE_KRB5_LOCATE_PLUGIN_H 1
> /* #undef HAVE_KRB5_LOCATE_PLUGIN_H */
< #define HAVE_LINUX_DQBLK_XFS_H 1
> /* #undef HAVE_LINUX_DQBLK_XFS_H */
< #define HAVE_LINUX_FALLOC_H 1
> /* #undef HAVE_LINUX_FALLOC_H */
< #define HAVE_LINUX_INOTIFY_H 1
> /* #undef HAVE_LINUX_INOTIFY_H */
< #define HAVE_LINUX_SPLICE 1
> /* #undef HAVE_LINUX_SPLICE */
< /* #undef HAVE_NFS_QUOTAS */
> #define HAVE_NFS_QUOTAS 1
< #define HAVE_SPLICE_DECL 1
> /* #undef HAVE_SPLICE_DECL */
< #define HAVE_SYS_INOTIFY_H 1
> /* #undef HAVE_SYS_INOTIFY_H */
< /* #undef HAVE_UT_UT_TV */
> #define HAVE_UT_UT_TV 1
< #define HAVE_XFS_QUOTAS 1
> /* #undef HAVE_XFS_QUOTAS */
< #define HAVE___NR_INOTIFY_INIT_DECL 1
> /* #undef HAVE___NR_INOTIFY_INIT_DECL */
< #define SIZEOF_LONG 8
> #define SIZEOF_LONG 4
< #define SIZEOF_SIZE_T 8
> #define SIZEOF_SIZE_T 4
< #define SIZEOF_SSIZE_T 8
> #define SIZEOF_SSIZE_T 4
< #define SIZEOF_TIME_T 8
> /* #undef SIZEOF_TIME_T */
< #define SIZEOF_VOID_P 8
> #define SIZEOF_VOID_P 4
< #define TIME_T_MAX 67768036191676799ll
> /* #undef TIME_T_MAX */
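With several hundred descriptors open, the /proc/15056/fd listing in the report is easier to read as a tally per target. The following is a minimal sketch, assuming the listing has been saved as text lines in the `ls -o` format shown in the report; the function name `tally_fd_targets` is made up for illustration.

```python
from collections import Counter


def tally_fd_targets(listing_lines):
    """Count how often each symlink target appears in `ls -o /proc/<pid>/fd` output."""
    counts = Counter()
    for line in listing_lines:
        # Descriptor entries look like "... 531 -> /var/log"; skip everything else.
        if " -> " not in line:
            continue
        target = line.split(" -> ", 1)[1].strip()
        counts[target] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the listing in the report.
    sample = [
        "lr-x------ 1 root 64 Jan 7 17:48 533 -> /w",
        "lr-x------ 1 root 64 Jan 7 17:48 534 -> /var/log",
        "lr-x------ 1 root 64 Jan 7 17:48 535 -> /var/log",
        "lr-x------ 1 root 64 Jan 7 17:48 536 -> /w/home",
    ]
    # Print the most frequently held targets first.
    for target, count in tally_fd_targets(sample).most_common():
        print(count, target)
```

A target whose count keeps growing between two snapshots of the same process is a likely leak candidate.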
Created attachment 8391 [details] Build configuration.
Created attachment 8392 [details] Runtime configuration.
Created attachment 8393 [details] 'strace' showing leak
Zoom-in on handle-leak syscalls
===============================

Good system:

open("log", O_RDONLY|O_DIRECTORY) = 28
fstat(28, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fstat(28, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl(28, F_GETFL) = 0x18000 (flags O_RDONLY|O_LARGEFILE|O_DIRECTORY)
fcntl(28, F_SETFD, FD_CLOEXEC) = 0
getdents(28, /* 184 entries */, 32768) = 5880
getdents(28, /* 0 entries */, 32768) = 0
close(28) = 0
open(".", O_RDONLY|O_DIRECTORY) = 28
fstat(28, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
close(28) = 0
open("log", O_RDONLY|O_DIRECTORY) = 28
fstat(28, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fstat(28, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl(28, F_GETFL) = 0x18000 (flags O_RDONLY|O_LARGEFILE|O_DIRECTORY)
fcntl(28, F_SETFD, FD_CLOEXEC) = 0
getdents(28, /* 184 entries */, 32768) = 5880
getdents(28, /* 0 entries */, 32768) = 0
close(28) = 0

Bad system:

open("log", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 509
fstat64(509, {st_mode=S_IFDIR|0751, st_size=4096, ...}) = 0
fcntl64(509, F_SETFD, FD_CLOEXEC) = 0
getdents64(509, /* 129 entries */, 4096) = 4072
getdents64(509, /* 46 entries */, 4096) = 1488
getdents64(509, /* 0 entries */, 4096) = 0
close(509) = 0
open(".", O_RDONLY|O_LARGEFILE|O_DIRECTORY) = 509
fstat64(509, {st_mode=S_IFDIR|0751, st_size=4096, ...}) = 0
close(509) = 0
open("log", O_RDONLY|O_LARGEFILE|O_DIRECTORY) = 509
fstat64(509, {st_mode=S_IFDIR|0751, st_size=4096, ...}) = 0
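Spotting an open() without a matching close() by eye gets tedious in a long trace. Below is a rough sketch of how the pairing could be automated. It assumes a single-process trace with one syscall per line and no PID or timestamp prefix, as in the excerpts above, and it only follows open() and close(); other calls that create descriptors are ignored. The name `unclosed_opens` is hypothetical.

```python
import re

# Successful open(): captures the path and the returned descriptor number.
OPEN_RE = re.compile(r'^open\("([^"]*)".*\)\s*=\s*(\d+)')
# Successful close(): captures the descriptor number.
CLOSE_RE = re.compile(r'^close\((\d+)\)\s*=\s*0')


def unclosed_opens(trace_lines):
    """Return {fd: path} for open() calls that have no later close() in the trace."""
    still_open = {}
    for line in trace_lines:
        line = line.strip()
        match = OPEN_RE.match(line)
        if match:
            # Remember which path this descriptor currently refers to.
            still_open[int(match.group(2))] = match.group(1)
            continue
        match = CLOSE_RE.match(line)
        if match:
            # Descriptor released; forget it (ignore closes we never saw opened).
            still_open.pop(int(match.group(1)), None)
    return still_open


if __name__ == "__main__":
    # Tail of the "bad system" excerpt from the report.
    bad_tail = [
        'open(".", O_RDONLY|O_LARGEFILE|O_DIRECTORY) = 509',
        'fstat64(509, {st_mode=S_IFDIR|0751, st_size=4096, ...}) = 0',
        'close(509) = 0',
        'open("log", O_RDONLY|O_LARGEFILE|O_DIRECTORY) = 509',
        'fstat64(509, {st_mode=S_IFDIR|0751, st_size=4096, ...}) = 0',
    ]
    print(unclosed_opens(bad_tail))
```

A descriptor reported at the end of a short excerpt is not necessarily leaked, since the close may simply come later; it is only a pointer to where to look.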
To narrow it down further, copied the 32-bit 'smbd' daemon binary along with required shared libraries from the CentOS 4 system to the CentOS 5 system and ran the test scenario. The file handle leak remains when running under the more current 2.6.18-308.24.1.el5 kernel and 2.5-81.el5_8.7 glibc. In particular this eliminates the older 2.3.4 'glibc' as the culprit and indicates the bug will reproduce on any system when the attached config.h file is used during compilation.

32-bit libraries copied with 'smbd':

  libcap.so.1 (CentOS 4)
  libpopt.so.0 (CentOS 4)
  libtalloc.so.2 (Samba)
  libtdb.so.1 (Samba)
  libwbclient.so (Samba)

32-bit libraries present on 64-bit CentOS 5 system:

  linux-gate.so.1
  libcrypt.so.1
  libresolv.so.2
  libnsl.so.1
  libdl.so.2
  librt.so.1
  libc.so.6
  /lib/ld-linux.so.2
  libpthread.so.0
OCD took over and I nailed the cause down to

  #undef HAVE_FDOPENDIR

1) Compiled it on CentOS 5 64-bit with a lightly adapted 'config.h' from the 32-bit CentOS 4 system. Reproduces the bug.

2) On a hunch, tweaked

  #undef HAVE_FDOPENDIR

to

  #define HAVE_FDOPENDIR 1

and now it works with no file handle leaks.
I suspect the problem lies at dir.c line 590. My guess is that the "Ugly hack. . .ENOSYS" in void dptr_CloseDir(files_struct *fsp) is not operating as intended.
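The thread does not spell out the mechanism, so the following is only a toy model of one way such a leak can arise. It assumes the cleanup path believes that closing the directory stream also closes the file's own descriptor. That holds when the stream was built on that descriptor (fdopendir() style), but not when a fallback reopened the directory by name. Every name here is hypothetical and none of it is Samba code.

```python
import os


class ToyDirHandle:
    """Stand-in for a directory stream (hypothetical, not Samba's structure)."""

    def __init__(self, path, file_fd, have_fdopendir):
        if have_fdopendir:
            # fdopendir() style: the handle adopts the caller's descriptor.
            self.fd = file_fd
        else:
            # Fallback style: the directory is opened again by name,
            # which yields a second, independent descriptor.
            self.fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)

    def close(self):
        os.close(self.fd)


def fd_is_open(fd):
    """Return True if fd currently refers to an open file."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def leaks_descriptor(path, have_fdopendir, buggy_cleanup):
    """Open and close a directory; report whether the file's descriptor leaked."""
    file_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    handle = ToyDirHandle(path, file_fd, have_fdopendir)
    handle.close()
    # Buggy cleanup always assumes the handle owned file_fd.
    # Correct cleanup closes file_fd whenever the handle used a different one.
    if not buggy_cleanup and handle.fd != file_fd:
        os.close(file_fd)
    leaked = fd_is_open(file_fd)
    if leaked:
        os.close(file_fd)  # tidy up so the demonstration itself does not leak
    return leaked


if __name__ == "__main__":
    for have_fdopendir in (True, False):
        for buggy in (True, False):
            print(have_fdopendir, buggy, leaks_descriptor(".", have_fdopendir, buggy))
```

In this model only the combination of the fallback path with the over-optimistic cleanup leaks, which would be consistent with the leak disappearing once HAVE_FDOPENDIR is defined.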
Created attachment 8396 [details] fix patch fixed Hack wasn't ugly enough ;-)
The question is -- why did configure not find fdopendir?
Well, originally the subject of this bug was

  "[Bug 9551] file handle leak under RHEL/CentOS 4.9 32-bit"

I forgot to include this in the body text and, once the exact cause was determined, accidentally erased the information. It does appear again, buried in the middle.

'autoconf' made the decision based on the lack of 'fdopendir()' in 'glibc' version 2.3.4, which is what comes with RHEL 4 / CentOS 4.

You will notice that the fix is a one-line tweak where the author was thinking the right thought but his fingers left out a fragment. Happens to me often enough.
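For a quick check of whether a given system's C library offers fdopendir() at all, something like the sketch below may help. It is a runtime lookup through `ctypes`, not what 'configure' does (autoconf compiles and links a small test program), and it assumes a platform where `ctypes.CDLL(None)` exposes the symbols of the running process, as on typical Linux systems. The helper name `libc_has_symbol` is made up.

```python
import ctypes


def libc_has_symbol(name):
    """Return True if the running process can resolve the C symbol `name`."""
    # CDLL(None) gives a handle on the symbols already loaded into this
    # process, which normally includes the C library.
    process = ctypes.CDLL(None)
    # Attribute lookup raises AttributeError for an unknown symbol,
    # which hasattr() turns into False.
    return hasattr(process, name)


if __name__ == "__main__":
    for symbol in ("opendir", "fdopendir", "dirfd"):
        print(symbol, libc_has_symbol(symbol))
```

On a system with the old glibc described above, one would expect `fdopendir` to come back False, matching the `#undef HAVE_FDOPENDIR` line in the generated config.h.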
I see the section of code in question was rewritten somewhere between 3.6.10 and 3.6.24, and that the revised code is not susceptible to the problem. Marking it closed/resolved.