Bug 13270 - Leak of file descriptor in samba 4.5.11
Summary: Leak of file descriptor in samba 4.5.11
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.5.11
Hardware: IA64 HP-UX
: P5 regression (vote)
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-14 09:31 UTC by arjithpe
Modified: 2018-04-12 06:45 UTC (History)
5 users (show)

See Also:


Attachments
Proposed patch. (1.22 KB, patch)
2018-02-14 21:25 UTC, Jeremy Allison
no flags Details
git-am fix for 4.8.0rc, 4.7.next, 4.6.next. (1.55 KB, patch)
2018-02-23 23:01 UTC, Jeremy Allison
slow: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description arjithpe 2018-02-14 09:31:43 UTC
There seems to be FD leak happening for Samba share, due to which accessing file fails with below error, after accessing certain no of files.

../source3/smbd/smb2_server.c:2988(smbd_smb2_request_done_ex)
  smbd_smb2_request_done_ex: idx[1] status[NT_STATUS_TOO_MANY_OPENED_FILES] body[8] dyn[yes:1] at ../source3/smbd/smb2_server.c:3145 

This occurs even if we increase smbd's max open files to 65536.

This issue is not seen in samba 4.5.3, but seen with 4.5.11 and 4.5.15

This issue seems to be caused by fix for CVE-2017-2619.

On investigating further below modification in source3/smbd/smb2_query_directory.c for fixing above CVE  may be causing FD leak.

https://github.com/samba-team/samba/commit/47b6b6f8f58efbabd7e4610f51db61dca2bc157c#diff-30edf5566a0d9e2abf214c7f778830df

Line 328: dptr_CloseDir(fsp);
Do we need to close FD using fd_close() instead of dptr_CloseDir()
Comment 1 arjithpe 2018-02-14 17:08:23 UTC
Simple steps to re  produce it are as below:-

•	Map a samba share on windows machine.
Suppose share is mapped on Z drive.
•	Create a small batch file as below.
:loop
dir z:
goto loop
•	Analysis it with any crash dump utility depending on platform or wait until smbd hits  NT_STATUS_TOO_MANY_OPENED_FILES error
Comment 2 Laurent Menase 2018-02-14 17:54:15 UTC
When default is used regarding wide link, the leaked fds are seen with open done with O_NOFOLLOW flag set, when tracing with strace/tusc. 


A gdb stack trace of the leaked entry allocation looks like apparently to be:
#3  0x60000000fe924020:0 in smb_vfs_call_open () at ../source3/smbd/vfs.c:1643
#4  0x60000000fe8de3d0:0 in non_widelink_open () at ../source3/smbd/open.c:581
#5  0x60000000fe8deb40:0 in fd_open () at ../source3/smbd/open.c:684
#6  0x60000000fea10430:0 in smbd_smb2_query_directory_send ()
    at ../source3/smbd/smb2_query_directory.c:339
#7  0x60000000fea0f630:0 in smbd_smb2_request_process_query_directory ()
    at ../source3/smbd/smb2_query_directory.c:124
#8  0x60000000fe9be270:0 in smbd_smb2_request_dispatch ()
    at ../source3/smbd/smb2_server.c:2644 

Code is
        if (in_flags & SMB2_CONTINUE_FLAG_REOPEN) {
                int flags;
 
                dptr_CloseDir(fsp);
 
                /*
                 * dptr_CloseDir() will close and invalidate the fsp's file
                 * descriptor, we have to reopen it.
                 */                           
                                              
                flags = O_RDONLY;             
#ifdef O_DIRECTORY
                flags |= O_DIRECTORY;         
#endif
                status = fd_open(conn, fsp, flags, 0);
                if (tevent_req_nterror(req, status)) {
                        return tevent_req_post(req, ev);
                }
        }

dptr_CloseDir() code doesn't look like to close the fsp->fh->fd before it is replaced by fd_open() contrarily to what comment indicates.
The code of fd_close() tends to confirm this since it does the 2 dptr_CloseDir() then closed the  fd.

Now fd_close() also check the fsp->fh->ref_count, in case that fd is shared.
Shouldn't we have a fd_reopen() in the open.c API to manage the operation expected there?
Comment 3 Jeremy Allison 2018-02-14 18:19:43 UTC
Is this reproducible on any later release than
4.5.x ?

Knowing that will really help determine if this
is a bug that needs addressing just in 4.5.x or
is a more generic problem.
Comment 4 Jeremy Allison 2018-02-14 18:22:01 UTC
At least in master, dptr_CloseDir(fsp) looks like:

void dptr_CloseDir(files_struct *fsp)
{
        if (fsp->dptr) {
                /*
                 * The destructor for the struct smb_Dir
                 * (fsp->dptr->dir_hnd) now handles
                 * all resource deallocation.
                 */
                dptr_close_internal(fsp->dptr);
                fsp->dptr = NULL;
        }
}

dptr_close_internal() looks like:

static void dptr_close_internal(struct dptr_struct *dptr)
{
        struct smbd_server_connection *sconn = dptr->conn->sconn;

        DEBUG(4,("closing dptr key %d\n",dptr->dnum));

        if (sconn == NULL) {
                goto done;
        }

        if (sconn->using_smb2) {
                goto done;
        }

        DLIST_REMOVE(sconn->searches.dirptrs, dptr);

        /*
         * Free the dnum in the bitmap. Remember the dnum value is always 
         * biased by one with respect to the bitmap.
         */

        if (!bitmap_query(sconn->searches.dptr_bmap, dptr->dnum - 1)) {
                DEBUG(0,("dptr_close_internal : Error - closing dnum = %d and bitmap not set !\n",
                        dptr->dnum ));
        }

        bitmap_clear(sconn->searches.dptr_bmap, dptr->dnum - 1);

done:
        TALLOC_FREE(dptr->dir_hnd);
        TALLOC_FREE(dptr);
}

The TALLOC_FREE(dptr->dir_hnd) triggers the destructor on dptr->dir_hnd, which looks like:

static int smb_Dir_destructor(struct smb_Dir *dirp)
{
        if (dirp->dir != NULL) {
                SMB_VFS_CLOSEDIR(dirp->conn,dirp->dir);
                if (dirp->fsp != NULL) {
                        /*
                         * The SMB_VFS_CLOSEDIR above
                         * closes the underlying fd inside
                         * dirp->fsp.
                         */
                        dirp->fsp->fh->fd = -1;
                        if (dirp->fsp->dptr != NULL) {
                                SMB_ASSERT(dirp->fsp->dptr->dir_hnd == dirp);
                                dirp->fsp->dptr->dir_hnd = NULL;
                        }
                        dirp->fsp = NULL;
                }
        }
        if (dirp->conn->sconn && !dirp->conn->sconn->using_smb2) {
                dirp->conn->sconn->searches.dirhandles_open--;
        }
        return 0;
}

SMB_VFS_CLOSEDIR(dirp->conn,dirp->dir) should be what is closing the file descriptor.
Comment 5 Jeremy Allison 2018-02-14 21:08:56 UTC
Quick question - are you reproducing this on HPUX only ? If so, does HPUX support the fdopendir() call ?
Comment 6 Jeremy Allison 2018-02-14 21:25:47 UTC
Created attachment 13961 [details]
Proposed patch.

Ralph, I think this is the correct fix (and I think this is a problem for all underlying operation systems that don't have an fdopendir() libc library call).

HPE folks, can you test this and get back to me ?

Jeremy.
Comment 7 Laurent Menase 2018-02-14 23:55:08 UTC
Indeed HP-UX do not support fdopendir and SMB_VFS_OPENDIR() fails with ENOSYS.
then dirp->fsp is not set
Comment 8 arjithpe 2018-02-15 08:28:09 UTC
I have ported and tested the patch.
This patch has fixed FD leak.
Thanks..
Comment 9 Jeremy Allison 2018-02-15 18:40:31 UTC
Ralph, once you've +1'ed this I'll get into master and create back-ports for all supported versions. It looks like the correct fix.
Comment 10 Jeremy Allison 2018-02-23 23:01:49 UTC
Created attachment 13979 [details]
git-am fix for 4.8.0rc, 4.7.next, 4.6.next.

Cherry-picked from master.
Comment 11 Ralph Böhme 2018-03-15 21:38:05 UTC
Reassigning to Karolin for inclustion in 4.8, 4.7 and 4.6.
Comment 12 Jeremy Allison 2018-03-16 21:07:46 UTC
Karolin, I don't see this one in 4.8.0 so I think it got dropped. Can you add it to the relevent releases ?

Thanks,

Jeremy.
Comment 13 Karolin Seeger 2018-03-22 20:44:27 UTC
(In reply to Jeremy Allison from comment #12)
Pushed to autobuild-v4-[8,7,6]-test.
Comment 14 Karolin Seeger 2018-04-10 07:21:34 UTC
(In reply to Karolin Seeger from comment #13)
Pushed to v4-8-test and v4-7-test, re-trying autobuild-v4-6-test.
Comment 15 Karolin Seeger 2018-04-12 06:45:24 UTC
(In reply to Karolin Seeger from comment #14)
Pushed to all branches.
Closing out bug report.

Thanks!