Bug 13865 - memcache grows larger then allowed maximum size
Summary: memcache grows larger then allowed maximum size
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.10.0
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on: 13871
Blocks:
  Show dependency treegraph
 
Reported: 2019-03-28 18:05 UTC by Christof Schmitt
Modified: 2019-04-15 07:28 UTC (History)
2 users (show)

See Also:


Attachments
git-am fix for master. (7.81 KB, patch)
2019-03-29 21:57 UTC, Jeremy Allison
no flags Details
Nasty debug patch for investigating the problem :-). (3.81 KB, patch)
2019-03-31 03:24 UTC, Jeremy Allison
no flags Details
patches for 4.9 (12.28 KB, patch)
2019-04-08 23:25 UTC, Christof Schmitt
jra: review+
Details
patches for 4.10 (12.28 KB, patch)
2019-04-08 23:26 UTC, Christof Schmitt
jra: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Christof Schmitt 2019-03-28 18:05:54 UTC
memcache_init takes a max_size parameter that defines
the maximum allowed size for the cache. When adding
talloc objects through memcache_add_talloc, the
talloc object becomes part of the cache, but only
the size of a pointer to the object is accounted
in the actual size tracking. This needs to be fixed
to also add the size of the talloc object.

Patch to follow.
Comment 2 Jeremy Allison 2019-03-29 21:57:04 UTC
Created attachment 15021 [details]
git-am fix for master.

Note, this bug depends on the fixes in bug:

https://bugzilla.samba.org/show_bug.cgi?id=13871

otherwise it reliably crashes in:

make test TESTS=samba3.base.rw1

This probably means the patchset should also increase the default stat cache size parameter.
Comment 3 Jeremy Allison 2019-03-31 02:26:05 UTC
Comment on attachment 15021 [details]
git-am fix for master.

This patch is buggy I think. Don't use.
Comment 4 Jeremy Allison 2019-03-31 03:23:41 UTC
OK, I'm coming to the conclusion that this just won't work, sorry :-(.

Adding copious debug messages to memcache show that talloc objects given to the cache can change talloc size whilst being in there :-(.

I'm guessing that some pointers to sub-objects in the talloc tree are being retained outsize - which means that talloc_total_size() on the incoming pointer can be different when the TALLOC_FREE is being called. I don't know exactly where this is being done, but I really suspect it is being done.

I'll upload the (intentionally horrid :-) patchset that dumps out size calculations whenever cache->size is being modified so you can take a look.

To test, add the git-am fix associated with this bug, then patch the debugs on top, then run:

make test TESTS=samba3.base.rw1 >&/tmp/l1

it'll crash (in the added smb_panic) and then you can grep for:

grep 'cache->size' /tmp/l1

to see what I mean.
Comment 5 Jeremy Allison 2019-03-31 03:24:51 UTC
Created attachment 15023 [details]
Nasty debug patch for investigating the problem :-).
Comment 6 Christof Schmitt 2019-04-01 21:21:59 UTC
I found the problem. In the codepath that replaces
an existing talloced object with another talloc object,
i subtracted the size of the replaced talloc object from
the cache size, but forgot to add the size of the new
object. With that fix, the rw1 test passes.

For additional safety, i think we might also want
to store the size of the talloc size in memcache_element,
to ensure that any changes (e.g. allocation of talloc children)
does not change the size that is subtracted later from
the memcache usage.
Comment 7 Jeremy Allison 2019-04-01 22:08:59 UTC
Can you upload the fixed version for me to take a look at ?

Yes, I'd already gotten to the 'store initial talloc_size' solution too :-).

In that case we can remove any calls to talloc_total_size() outside of the first one that adds in the new memory.

Of course no one *should* be modifying memory that's been given to the memcache, but we don't prevent it. And as I found out, the results of getting the size calculation wrong can be bad :-).

I think in order to add this code in we need to make the generic memcache size at least 16MB by default, as now accounting for the talloc stored sizes may change smbd behavior in unexpected ways.
Comment 8 Christof Schmitt 2019-04-01 23:41:55 UTC
I sent the update here:
https://lists.samba.org/archive/samba-technical/2019-April/133173.html

The default size of the pool has not been updated since,
i am not sure what a sensible increase would be. Adding
a number is the easy part, maybe we can wait for some
feedback on samba-technical
Comment 10 Christof Schmitt 2019-04-08 23:25:53 UTC
Created attachment 15050 [details]
patches for 4.9
Comment 11 Christof Schmitt 2019-04-08 23:26:25 UTC
Created attachment 15051 [details]
patches for 4.10
Comment 12 Jeremy Allison 2019-04-09 18:46:03 UTC
Re-assigning to Karolin for inclusion in 4.9.next, 4.10.next.
Comment 13 Jeremy Allison 2019-04-09 18:46:18 UTC
Re-assigning to Karolin for inclusion in 4.9.next, 4.10.next.
Comment 14 Karolin Seeger 2019-04-11 11:36:22 UTC
(In reply to Jeremy Allison from comment #13)
Pushed to autobuild-v4-{9,10}-test.
Comment 15 Karolin Seeger 2019-04-15 07:28:29 UTC
(In reply to Karolin Seeger from comment #14)
Pushed to both branches.
Closing out bug report.

Thanks!