Bug 11934 - Memory leak in share mode locking.
Memory leak in share mode locking.
Status: RESOLVED FIXED
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
4.3.5
All All
: P5 normal
: ---
Assigned To: Karolin Seeger
Samba QA Contact
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2016-05-25 06:12 UTC by Hemanth
Modified: 2016-06-01 07:42 UTC (History)
1 user (show)

See Also:


Attachments
git-am fix for 4.4.next, 4.3.next. (1.56 KB, patch)
2016-05-27 17:00 UTC, Jeremy Allison
vl: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Hemanth 2016-05-25 06:12:45 UTC
We are using samba 4.3.5 stack and we are observing a memory leak(or huge buildup) in the area of share mode locking.

We have a test which does multiple opens and closes continuously(using multiple threads) on the same directory with some outstanding open handles at any point of time. 

While this test running we have observed the smbd resident memory growing upto 3GB within few hours.

We have grabbed pool-usage from smbcontrol and found that there is a huge leak in the context of share_mode_data.

struct memcache                contains 3246739108 bytes in 40519 blocks (ref 0)
        struct memcache_element        contains     95 bytes in   1 blocks (ref 0)
        struct share_mode_data         contains 3246735971 bytes in 40483 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
 

After doing lot of investigation, we were able to root cause the issue. 
In share_mode_data_destructor(), we make the data blob from current share mode record and we use this blob to update the TDB record corresponding to this share mode.

Memory allocation for this data blob is done in unparse_share_modes() as part of ndr_push_struct_blob using share_mode_data as memory context.

Once the TDB update is done, we are not actually freeing up this blob. Also we are actually reparenting the share_mode_data to NULL(in-memory cache) context which results in piling up these blob entries under NULL context and will only be freed on the last close.

Freeing data blob after TDB update prevents this memory build up. Following patch fixed the issue for us.

--- a/source3/locking/share_mode_lock.c
+++ b/source3/locking/share_mode_lock.c
@@ -440,6 +440,11 @@ static int share_mode_data_destructor(struct share_mode_data *d)
         */
        TALLOC_FREE(d->record);
 
+       /* 
+        * Release dptr as well before reparenting to NULL
+        * (in-memory cache) context.
+        */
+       TALLOC_FREE(data.dptr);
        /*
         * Reparent d into the in-memory cache so it can be reused if the
         * sequence number matches. See parse_share_modes()
Comment 1 Jeremy Allison 2016-05-25 23:53:46 UTC
Date: Wed, 25 May 2016 16:51:34 -0700
From: Jeremy Allison <jra@samba.org>
To: Volker Lendecke <Volker.Lendecke@SerNet.DE>
Cc: Hemanth Thummala <hemanth.thummala@nutanix.com>, "samba-technical@lists.samba.org" <samba-technical@lists.samba.org>, jra@samba.org
Subject: Re: Patch for a memory leak issue in share mode locking.
User-Agent: Mutt/1.5.21 (2010-09-15)

On Wed, May 25, 2016 at 10:27:04AM +0200, Volker Lendecke wrote:
> On Wed, May 25, 2016 at 06:37:59AM +0000, Hemanth Thummala wrote:
> > Hi All,
> >
> > We have found a memory leak issue in share mode locking.
> >
> > I have created bug and updated my findings.
> > https://bugzilla.samba.org/show_bug.cgi?id=11934
> >
> > Attached patch resolves the issue for us. Please let me know if this looks good.
>
> This is a great finding, thanks!
>
> However, I don't fully understand why this is not cleaned up in
> unparse_share_modes via the share_mode_memcache_delete() call. If I get
> the talloc hierarchy right then the data blob is a talloc child of "d",
> and sure, the memcache_add_talloc moves "d" to the NULL context.  But the
> hierarchy below should stay intact. So the share_mode_memcache_delete
> should delete any previous blob via the hierarchy.
>
> Of course we don't need the blob after the store, so your fix is
> right. But the real long-term leak must be somewhere else, your fix will
> only band-aid the deeper issue.

I think I see why.

Because the file is kept open, the struct share_mode_data *d
never gets deleted - it gets bounced between ownership in
the memcache (via:

        TALLOC_FREE()->
                share_mode_data_destructor()->
                        share_mode_memcache_store()

and then back to the mem_ctx in:

        get_share_mode_lock()->
                get_share_mode_lock_internal()->
                        parse_share_modes()->
                                share_mode_memcache_fetch().

inside parse_share_modes():

296         /* See if we already have a cached copy of this key. */
297         d = share_mode_memcache_fetch(mem_ctx, key, &blob);
298         if (d != NULL) {
299                 return d;
300         }

share_mode_memcache_fetch() moves the ownership back onto mem_ctx.

Inside share_mode_memcache_fetch() we have:

        /* Move onto mem_ctx. */
        d = talloc_move(mem_ctx, &ptr);

        /*
         * Now we own d, prevent the cache from freeing it
         * when we delete the entry.
         */
        talloc_set_destructor(d, share_mode_data_nofree_destructor);

        /* Remove from the cache. We own it now. */
        memcache_delete(NULL,
                        SHARE_MODE_LOCK_CACHE,
                        key);

        /* And reset the destructor to none. */
        talloc_set_destructor(d, NULL);

As the file is always open in their test case, when
unparse_share_modes() is called from share_mode_data_destructor()
the struct is owned by the mem_ctx it's been moved onto,
not by the memcache.

The clue to this is in the talloc pool info in the
bug report:

struct memcache                contains 3246739108 bytes in 40519 blocks (ref 0)
        struct memcache_element        contains     95 bytes in   1 blocks (ref 0)
        struct share_mode_data         contains 3246735971 bytes in 40483 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)
            uint8_t                        contains  89088 bytes in   1 blocks (ref 0)

So this is a long winded way of saying that I think that
Hemanth's patch is correct :-).

Reviewed-by: Jeremy Allison <jra@samba.org>.

Volker, do you concur ?
Comment 2 Jeremy Allison 2016-05-27 17:00:12 UTC
Created attachment 12136 [details]
git-am fix for 4.4.next, 4.3.next.

Cherry-pick from master.
Comment 3 Jeremy Allison 2016-05-27 20:16:57 UTC
Re-assigning to Karolin for inclusion in 4.4.next, 4.3.next.
Comment 4 Karolin Seeger 2016-05-30 09:48:58 UTC
(In reply to Jeremy Allison from comment #3)
Pushed to autobuild-v4-[3|4]-test.
Comment 5 Karolin Seeger 2016-06-01 07:42:43 UTC
(In reply to Karolin Seeger from comment #4)
Pushed to both branches.
Closing out bug report.

Thanks!