Created attachment 7406 [details] example output of --leak-report
The memory usage of the drepl and rpc server processes increases over time, which may indicate memory leaks. Output of --leak-report is attached. Is there any other useful information I could extract to help track this down?
Created attachment 7446 [details] leak report of drepl and rpc server
Created attachment 7551 [details] leak report with one of the processes owning 2GB of memory (drepl?)
Created attachment 7552 [details] leak report with two of the processes owning 1.5GB of memory (rpc_server and drepl?)
As I understand it, this might be caused by long-running memory contexts? The full leak report has not given me any insight yet. Is there any other information that could help narrow this down?
As Metze explained to me, similar behavior has been seen before: https://lists.samba.org/archive/samba-technical/2010-November/074397.html Maybe the dynamic mmap thresholding in glibc/eglibc doesn't play well here: it dynamically raises the mmap threshold on free() of an mmap()ed chunk *and* raises the MALLOC_TRIM_THRESHOLD to twice the adjusted MALLOC_MMAP_THRESHOLD. See e.g. http://sources.redhat.com/ml/libc-alpha/2006-03/msg00033.html As an experiment, it might be useful to disable this dynamic behavior by setting the environment variable MALLOC_TRIM_THRESHOLD_, e.g. to the static default of 128*1024, in the samba process environment.
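For a quick experiment from inside a test program rather than via the environment, the same two parameters can be pinned with glibc's mallopt(). This is only a sketch: the helper name pin_malloc_thresholds() is made up for illustration, and nothing here is taken from the Samba tree. According to mallopt(3), setting either parameter explicitly turns off the dynamic adjustment.

```c
#include <malloc.h>   /* mallopt(), M_TRIM_THRESHOLD, M_MMAP_THRESHOLD (glibc-specific) */

/*
 * Hypothetical helper: pin both thresholds to a fixed value so that glibc
 * stops adjusting them at runtime.
 * Returns 1 if both mallopt() calls succeeded, 0 otherwise.
 */
int pin_malloc_thresholds(int bytes)
{
    /* mallopt() returns 1 on success and 0 on error. */
    int ok_trim = mallopt(M_TRIM_THRESHOLD, bytes);
    int ok_mmap = mallopt(M_MMAP_THRESHOLD, bytes);

    return ok_trim && ok_mmap;
}
```

Calling pin_malloc_thresholds(128 * 1024) early in main() should have roughly the same effect as exporting MALLOC_TRIM_THRESHOLD_=131072 and MALLOC_MMAP_THRESHOLD_=131072 for that process.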
Tests with Metze's malloc-reclaim.c example code show that adjusting the malloc TRIM and/or MMAP thresholds does not reduce the problem.
shell# gcc -o malloc-reclaim malloc-reclaim.c
shell# MALLOC_TRIM_THRESHOLD_=1 MALLOC_MMAP_THRESHOLD_=1 ./malloc-reclaim
On the contrary: while using mmap more or less exclusively makes free() actually return the freed memory immediately, malloc then allocates memory only in 4k page granularity, which isn't efficient in the first place and results in a much larger memory footprint. Fixing only the TRIM threshold might help in some cases, but not for the malloc-reclaim.c example, because it is explicitly designed to create a large and lasting hole in the heap, while TRIM only acts on unused chunks at the top of the heap. So I guess unfortunately Tridge might be right in advising us to reorder the allocation strategy. In this case drepl and rpc_server would need some scrutiny.
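To illustrate the "lasting hole" effect, here is a minimal sketch. It is not Metze's malloc-reclaim.c, which I am not reproducing here; the names demo_heap_hole(), struct hole_result, NCHUNKS and CHUNK_SIZE are invented for this example. It assumes glibc on Linux with a brk-based main heap, and it uses sbrk(0) to observe the program break.

```c
#define _DEFAULT_SOURCE   /* for sbrk() */
#include <malloc.h>       /* mallopt(), M_TRIM_THRESHOLD (glibc-specific) */
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Illustrative sizes: 4096 chunks of 4 KiB, about 16 MiB in total. */
enum { NCHUNKS = 4096, CHUNK_SIZE = 4096 };

/* Growth of the program break relative to the start of the demo. */
struct hole_result {
    size_t held_at_peak;     /* all chunks allocated */
    size_t held_with_hole;   /* everything freed except the last chunk */
    size_t held_when_empty;  /* last chunk freed as well */
};

static size_t heap_bytes_since(const char *start)
{
    return (size_t)((const char *)sbrk(0) - start);
}

struct hole_result demo_heap_hole(void)
{
    static void *chunks[NCHUNKS];
    struct hole_result r;
    const char *start;
    int i;

    /* Use the static default so the trim threshold does not drift. */
    mallopt(M_TRIM_THRESHOLD, 128 * 1024);
    start = sbrk(0);

    /* Chunks of this size are below the mmap threshold, so they come
     * from the brk heap and are laid out one after another. */
    for (i = 0; i < NCHUNKS; i++) {
        chunks[i] = malloc(CHUNK_SIZE);
        if (chunks[i] == NULL)
            abort();
    }
    r.held_at_peak = heap_bytes_since(start);

    /* Free everything except the chunk nearest the top of the heap.
     * The freed space is a hole below a live chunk, so free() cannot
     * lower the break. */
    for (i = 0; i < NCHUNKS - 1; i++)
        free(chunks[i]);
    r.held_with_hole = heap_bytes_since(start);

    /* Once the last chunk is gone, the free space reaches the top of
     * the heap and free() can trim it. */
    free(chunks[NCHUNKS - 1]);
    r.held_when_empty = heap_bytes_since(start);

    return r;
}
```

On a glibc system I would expect held_with_hole to stay close to held_at_peak, and only held_when_empty to drop. That is the reason a single long-lived allocation made late can keep the whole heap pinned, regardless of the TRIM threshold.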
Can we get that with --leak-report-full rather than just --leak-report? Thanks.
Created attachment 7765 [details] My proposed fix (currently under test)
Based on a clear log provided by Ricky Nace, I was able to debug this easily this afternoon.
Comment on attachment 7765 [details] My proposed fix (currently under test)
Hi Andrew, could we use ldb_dn_copy() instead of talloc_reference() please?
A reasonable suggestion. I'll include this in my next autobuild.
With additional patches now in master, this is fixed. See:
2e1ab13f6ebb2c2cf746457d4783fe9bc5e86de0
a7b8e9f562780dc6a3487644623decd1cff226e2
3c8d8f206b79280604cb79f263e74aa2b681726e