Bug 1030 - winbind, fail to get domain groups with a lot of users (more than 400)
Status: CLOSED FIXED
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: winbind
Version: 3.0.2
Hardware: All Solaris
Importance: P3 critical
Target Milestone: none
Assignee: Gerald (Jerry) Carter (dead mail address)
Blocks: 807
Reported: 2004-02-02 06:14 UTC by Super Icc
Modified: 2005-08-24 10:18 UTC

Description Super Icc 2004-02-02 06:14:43 UTC
Note: I have this problem ONLY on 3.0.2rc1 and 3.0.2rc2; it works perfectly on 
3.0.1.

Description:
When I list groups with "getent group", the groups that have a lot of users 
(I have a "Domain Users" group with 400 users) are missing from the list; all 
the others are present.
With "wbinfo -g" I do see the group names.

Details:
- the domain is a NT 4 domain
- compiled Samba with --with-winbind and --with-acl-support on Solaris 8
Comment 1 Andy Smith 2004-02-26 12:09:53 UTC
I don't know how important this is, but I have found this issue with Samba 3.0.0, 
3.0.1 and 3.0.2. This is on Solaris 8 with 3973 user accounts and the most recent 
Sun recommended patch cluster as of 26/2/2004.

thanks Andy. "pubsyssamba @ bbc.co.uk"
Comment 2 Hyde Wu 2004-03-09 21:16:38 UTC
I am using Samba 3.0.2a on Solaris 9, with the latest patch applied on 
03/09/2004.

I also tried out Mr. John H. Terpstra's notes on Samba & Solaris 9 at 
http://samba.org/~jht/Notes/Samba-Install-Solaris9.txt, but they did not help. I 
still have the same problem.

This is a domain member server. I tried joining it to a Win2K Active Directory 
domain and to an NT domain controller, but the problem is the same in both 
cases. The following is some output from my system. I've been trying to get 
this problem solved for several months. I tried every new version of Samba 3, 
but I still see the same thing ;-(

bash-2.05$ ../bin/wbinfo -g | wc -l
    3392
bash-2.05$ getent group | wc -l
      44
bash-2.05$ ../bin/wbinfo -u | wc -l
   24310
Comment 3 John Klinger 2004-03-31 10:13:11 UTC
We haven't seen a problem with "getent group", but we do have a problem 
where "id -a <user>" does not return all supplementary groups if one of those 
has a large number of users. Judging from the other comments, this appears 
likely to be related.

The problem is caused by a Solaris limitation. The max size for a group entry 
buffer in Solaris 8 is 7296 bytes [limits.h and sysconf(_SC_GETGR_R_SIZE_MAX)]. 
Function fill_grent in winbindd_nss_linux.c uses the 7296 byte buffer passed in 
by Solaris (via _nss_winbind_getgrent_solwrap in the case I looked at). For 
groups with large numbers of users, this structure fills up, causing fill_grent 
to return the members found thus far with an NSS_STATUS_TRYAGAIN return code. 
Solaris does not try again, but stops the group lookup. This also causes all 
remaining groups that have yet to be sent to getgrent to be ignored.
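
For illustration, the failure mode described above can be modelled with a small C sketch. It is not the Samba source: pack_members and the SKETCH_* status codes are hypothetical stand-ins for fill_grent and the NSS status values, and the real function presumably has to fit other group fields into the same buffer as well, so fewer members would fit in practice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this sketch; not the real NSS enum. */
enum sketch_status { SKETCH_SUCCESS, SKETCH_TRYAGAIN };

/*
 * Pack member names into buf as a comma-separated, NUL-terminated list.
 * Stops at the first name that does not fit, leaves the names packed so
 * far in buf, and reports how many were packed through *packed.
 */
enum sketch_status pack_members(const char *const *names, size_t count,
                                char *buf, size_t buflen, size_t *packed)
{
    size_t used = 0; /* bytes written so far, excluding the NUL */

    assert(buf != NULL && buflen > 0 && packed != NULL);
    buf[0] = '\0';
    *packed = 0;

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(names[i]);
        size_t sep = (i > 0) ? 1 : 0;

        /* Need room for the separator, the name and the trailing NUL. */
        if (used + sep + len + 1 > buflen)
            return SKETCH_TRYAGAIN; /* caller is expected to retry */

        if (sep)
            buf[used++] = ',';
        memcpy(buf + used, names[i], len);
        used += len;
        buf[used] = '\0';
        *packed = i + 1;
    }
    return SKETCH_SUCCESS;
}
```

With a 10-byte buffer and the names alice, bob and carol, this packs "alice,bob", reports 2 members, and returns SKETCH_TRYAGAIN. A caller that never retries with a larger buffer gets nothing further, which matches the behaviour reported for Solaris.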

I looked into this a bit and found no way to increase this buffer size or 
otherwise return all groups (not to say there isn't a way). It is possible to 
intercept the NSS_STATUS_TRYAGAIN in the Solaris wrapper and return 
NSS_STATUS_SUCCESS instead. The problematic group will then be returned, albeit 
with only the members that would fit in the buffer. It would also allow the 
follow-on groups to be returned, instead of stopping with the problematic group.
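
A rough sketch of that workaround, under stated assumptions: the WRAP_* codes, the inner_getgrent_fn signature and the stub functions are all hypothetical and exist only for this example. The real Solaris wrapper and NSS status enum look different, so this shows the idea rather than a patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch; not the real NSS enum. */
enum wrap_status { WRAP_SUCCESS, WRAP_NOTFOUND, WRAP_TRYAGAIN };

/* Hypothetical signature standing in for the inner group lookup call. */
typedef enum wrap_status (*inner_getgrent_fn)(char *buf, size_t buflen);

/*
 * Call the inner function and report a buffer-full result as success,
 * so the caller keeps the partially filled entry and moves on to the
 * next group instead of aborting the enumeration.
 */
enum wrap_status getgrent_truncating(inner_getgrent_fn inner,
                                     char *buf, size_t buflen)
{
    enum wrap_status ret;

    assert(inner != NULL && buf != NULL && buflen > 0);
    ret = inner(buf, buflen);
    if (ret == WRAP_TRYAGAIN)
        return WRAP_SUCCESS; /* accept the truncated member list */
    return ret;              /* pass every other status through */
}

/* Stand-in inner functions, present only to exercise the wrapper. */
enum wrap_status stub_buffer_full(char *buf, size_t buflen)
{
    (void)buflen;
    buf[0] = '\0';
    return WRAP_TRYAGAIN;
}

enum wrap_status stub_no_more_groups(char *buf, size_t buflen)
{
    (void)buf;
    (void)buflen;
    return WRAP_NOTFOUND;
}
```

The trade-off is the one already noted: the caller cannot tell a truncated member list from a complete one, so membership checks against a large group may give wrong answers.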

The only other [weak] idea I have is to somehow return the remaining groups in 
a subsequent query, similar to entering multiple lines in /etc/group. I don't 
see how to reliably do this, though, since Solaris is looping using a group 
list it previously obtained via enum_dom_groups. 
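
To make the multiple-entry idea concrete, here is a small sketch that only counts how many separate entries a member list would need if each entry's comma-separated list had to fit in a fixed buffer. count_group_chunks is a hypothetical helper, and the sketch does not address the enumeration problem mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count how many entries are needed so that each entry's comma-separated,
 * NUL-terminated member list fits in buflen bytes.
 * Returns 0 if there are no names or if a single name cannot fit at all.
 */
size_t count_group_chunks(const char *const *names, size_t count,
                          size_t buflen)
{
    size_t chunks = 0;
    size_t used = 0; /* bytes used in the current chunk, excluding NUL */

    assert(names != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(names[i]);

        if (len + 1 > buflen)
            return 0; /* this name can never fit in any chunk */

        if (chunks == 0) {
            /* First name opens the first chunk. */
            chunks = 1;
            used = len;
        } else if (used + 1 + len + 1 > buflen) {
            /* Separator + name + NUL would overflow: open a new chunk. */
            chunks++;
            used = len;
        } else {
            used += 1 + len;
        }
    }
    return chunks;
}
```

For the names alice, bob and carol, a 10-byte limit needs two entries ("alice,bob" and "carol"), while a 6-byte limit needs three.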
Comment 4 Gerald (Jerry) Carter (dead mail address) 2005-02-08 21:52:41 UTC
I'd retest against 3.0.11.  Lots of winbind changes.  reopen if you 
find the bug still exists.  Thanks.
Comment 5 Michael Agard 2005-04-25 14:17:40 UTC
(In reply to comment #4)
> I'd retest against 3.0.11.  Lots of winbind changes.  reopen if you 
> find the bug still exists.  Thanks.

This still happens under Solaris 9 (Solaris 5.9 Generic_118558-06) on SPARC,
with Samba 3.0.14a. It seems to be referenced in a few other bugs, apparently
related to very large numbers of users in groups on Solaris.
Comment 6 Gerald (Jerry) Carter (dead mail address) 2005-08-24 10:18:22 UTC
Sorry for the spam; cleaning up the database to prevent unnecessary reopens of bugs.