Bug 14388 - winbindd memory leak on wbinfo -u with security=ADS
Summary: winbindd memory leak on wbinfo -u with security=ADS
Status: ASSIGNED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Winbind (show other bugs)
Version: 4.7.6
Hardware: All All
: P5 regression (vote)
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-19 18:28 UTC by Laurent Menase
Modified: 2020-09-25 11:16 UTC (History)
4 users (show)

See Also:


Attachments
valgrind leak of winbindd while reproducing (2.22 MB, text/plain)
2020-05-19 18:28 UTC, Laurent Menase
no flags Details
Patch (733 bytes, patch)
2020-05-20 10:35 UTC, Volker Lendecke
no flags Details
Patch for 4.11-4.13 (1.05 KB, patch)
2020-09-15 12:40 UTC, Volker Lendecke
slow: review+
vl: review? (cs)
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Laurent Menase 2020-05-19 18:28:50 UTC
Created attachment 15995 [details]
valgrind leak  of winbindd while reproducing

This winbindd memory leak applies to any architecture since 4.6, and is not reproduced with 4.5.11
reproduced on ubuntu with samba 4.7.6 and HP-UX with samba 4.9.0
ubuntu# winbindd -V
Version 4.7.6-Ubuntu


It seems to have been introduced by the following change:
https://github.com/samba-team/samba/commit/480c9581a13afc08b20e80d2ff8a45ac8d7f18d3
winbindd_ads.c  in function query_user_list()
line 365 -       ads_msgfree(ads, res);


on user enumeration with wbinfo -u when winbind cache expired in query_user_list() 'res' is not freed before returning, causing the leak.

Since this code is not change on 4.12, this should apply from 4.6 to 4.12

To reproduce:
smb.conf:
[global]
server string = data.test.local
workgroup = NETERT10
realm = NETERT10.LOCAL

encrypt passwords = yes
security = ADS

idmap uid = 10000-20000
idmap gid = 10000-20000
winbind separator = +
winbind enum users = yes
winbind enum groups = yes
winbind use default domain = yes
template shell = /bin/sh
template homedir= /home/TEST/%u
ldap ssl= start tls
dedicated keytab file = /etc/krb5.keytab
kerberos method = default
winbind cache time =  1
   map to guest = bad user
[Homes]
comment = User Homes Directories
browseable = yes
writable = yes
read only = no

the leak is quicker when the number of ADS users is huge for instance with 16000 ADS user

running
# while : ; do echo nbusers $( wbinfo -u  | wc -l ) 2>/dev/null ; ps -aef | grep winbind | grep -v grep |while read a b c; do echo $b  $( pmap $b | grep total ) ; sleep 2; done; done
nbusers 16674 
26318 total 331868K 
27921 total 501528K 
nbusers 16674 
26318 total 331868K 
27921 total 506744K 
nbusers 16674 
26318 total 331868K 
27921 total 511968K 
nbusers 16674 



after 3min gcore 27921    
# strings -a  core.27921 | grep CN=Users,DC= |wc -l 
321215

Valgrind after some time gives:
==21059== 112,112 bytes in 2,002 blocks are possibly lost in loss record 751 of 770
==21059==    at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21059==    by 0xECE64A4: ber_memcalloc_x (in /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2.10.8)
==21059==    by 0xC9FC74A: ldap_result (in /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2.10.8)
==21059==    by 0xC9FF00E: ldap_pvt_search_s (in /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2.10.8)
==21059==    by 0xC9FF07F: ldap_search_ext_s (in /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2.10.8)
==21059==    by 0x568235C: ??? (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x568307C: ??? (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x5684FC4: ads_do_search_all_args (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x568F320: ??? (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x568F84D: ads_do_search_retry (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x1566A8: ??? (in /usr/sbin/winbindd)
==21059==    by 0x15390B: ??? (in /usr/sbin/winbindd)
....
==21059== 127,352,467 (1,736 direct, 127,350,731 indirect) bytes in 31 blocks are definitely lost in loss record 770 of 770
==21059==    at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21059==    by 0xECE64A4: ber_memcalloc_x (in /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2.10.8)
==21059==    by 0xC9FC74A: ldap_result (in /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2.10.8)
==21059==    by 0xC9FF00E: ldap_pvt_search_s (in /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2.10.8)
==21059==    by 0xC9FF07F: ldap_search_ext_s (in /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2.10.8)
==21059==    by 0x568235C: ??? (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x568307C: ??? (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x5685036: ads_do_search_all_args (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x568F320: ??? (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x568F84D: ads_do_search_retry (in /usr/lib/x86_64-linux-gnu/samba/libads.so.0)
==21059==    by 0x1566A8: ??? (in /usr/sbin/winbindd)
==21059==    by 0x15390B: ??? (in /usr/sbin/winbindd)

attached Valgrind for leak analysis

Probably fix on 4.12 may be like in other functions calling ads_search_retry():
405	DBG_NOTICE("ads query_user_list gave %d entries\n", count);
406
407  done:
408 + 	if (res)
409 + 	 	ads_msgfree(ads, res);
410 	return status;
}
Comment 1 Volker Lendecke 2020-05-20 10:35:17 UTC
Created attachment 15997 [details]
Patch

Entierly correct, thanks a *LOT*. Attached find a git patch.
Comment 2 Volker Lendecke 2020-09-06 08:55:43 UTC
Are you okay with the git patch in https://bugzilla.samba.org/attachment.cgi?id=15997? The "if (res)" is not required, ads_msgfree() already takes care of the NULL case.

Also, we have a policy that the patch author puts in a Signed-Off-By: tag to the git commit message, see

https://wiki.samba.org/index.php/CodeReview#commit_message_tags

Are you okay with that?

Thanks, Volker
Comment 3 Laurent Menase 2020-09-07 07:39:59 UTC
(In reply to Volker Lendecke from comment #2)
Hi Volker,
Yes I completely agree with this version of the patch, and completely agree with your change and the policy.
I have set the if() since it was used in the old code.
Thanks and best regards, Laurent
Comment 4 Samba QA Contact 2020-09-14 13:34:05 UTC
This bug was referenced in samba master:

8f868b0ea0b4795668f7bc0b028cd85686b249fb
Comment 5 Volker Lendecke 2020-09-15 12:40:03 UTC
Created attachment 16227 [details]
Patch for 4.11-4.13

Not going through CI. winbindd_ads.c has not changed since Samba 4.10.
Comment 6 Ralph Böhme 2020-09-18 06:14:30 UTC
Reassigning to Karolin for inclusion in 4.11, 4.12 and 4.13.
Comment 7 Karolin Seeger 2020-09-25 08:42:58 UTC
(In reply to Ralph Böhme from comment #6)
Pushed to autobuild-v4-{13,12,11}-test.
Comment 8 Samba QA Contact 2020-09-25 11:16:10 UTC
This bug was referenced in samba v4-12-test:

3af0ca2ee90bf1d22148bbca7fa624af24a442a7