Bug 10365 - nss_winbind causes hangs with large groups on solaris
Summary: nss_winbind causes hangs with large groups on solaris
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Winbind
Version: 4.1.3
Hardware: All Solaris
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
 
Reported: 2014-01-09 16:00 UTC by Björn Jacke
Modified: 2015-10-05 07:08 UTC
CC List: 4 users

See Also:
slow: review+


Attachments
Patch for large groups in solaris nss (1.08 KB, patch)
2015-01-23 21:14 UTC, Nathan Huff
no flags

Description Björn Jacke 2014-01-09 16:00:43 UTC
On Solaris, getent group DOMAIN\\somegroup

hangs when the group is too large, i.e. when it has too many members. Solaris seems to have a limit on the members of each group (it can be determined with the _SC_GETGR_R_SIZE_MAX sysconf(3C) parameter).

When this limit is exceeded, the getent group call hangs. The question is whether the hang is a Solaris bug or whether the nss_winbind module needs to handle oversized groups differently.
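
For reference, a minimal sketch of how a caller can cope with oversized groups: start from the sysconf(3C) hint and grow the buffer while getgrnam_r() reports ERANGE. This is an illustration only, not code from Samba or Solaris. The function name lookup_group_growing, the 1024-byte fallback and the 16 MiB cap are assumptions made for this example, and it assumes the POSIX five-argument form of getgrnam_r() (older Solaris builds may expose a different draft signature unless POSIX semantics are requested at compile time).

```c
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: look up a group by name, doubling the buffer
 * while getgrnam_r() reports ERANGE.
 * Returns 0 when the lookup completed (then *found is 1 or 0),
 * or an errno value on failure.
 * When *found == 1 the caller owns *buf_out and must free() it;
 * the strings in *grp point into that buffer. */
int lookup_group_growing(const char *name, struct group *grp,
                         char **buf_out, int *found)
{
    /* The hint may be -1 ("no suggestion"); 1024 is an arbitrary fallback. */
    long hint = sysconf(_SC_GETGR_R_SIZE_MAX);
    size_t len = (hint > 0) ? (size_t)hint : 1024;
    const size_t max_len = 16 * 1024 * 1024; /* arbitrary safety cap */
    char *buf = NULL;

    *buf_out = NULL;
    *found = 0;

    for (;;) {
        struct group *result = NULL;
        char *tmp = realloc(buf, len);
        int rc;

        if (tmp == NULL) {
            free(buf);
            return ENOMEM;
        }
        buf = tmp;

        rc = getgrnam_r(name, grp, buf, len, &result);
        if (rc == ERANGE && len < max_len) {
            len *= 2;           /* buffer too small: grow and retry */
            continue;
        }
        if (rc != 0 || result == NULL) {
            free(buf);
            return rc;          /* rc == 0 here means "no such group" */
        }
        *buf_out = buf;
        *found = 1;
        return 0;
    }
}
```

The cap keeps a misbehaving backend from driving the loop forever, which is the failure mode reported here.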
Comment 1 Ira Cooper 2014-01-09 21:10:01 UTC
Alas, I don't have access to a Solaris box (or Illumos box!) right now.

My feeling is: Hanging is broken.

Now, which side do we fix it on... Probably both.
Comment 2 Nathan Huff 2015-01-23 21:14:28 UTC
Created attachment 10659
Patch for large groups in solaris nss

The problem with large groups on Solaris/Illumos in the NSS winbind module is that Solaris wants the return value to be NSS_UNAVAIL if the buffer given to getgrnam_r is too small. The current code returns NSS_TRYAGAIN, which causes Solaris/Illumos to loop without trying to resize the buffer.

This patch checks for the combination of ERANGE and NSS_STATUS_TRYAGAIN and changes the return value to NSS_STATUS_UNAVAIL. It only covers the group database, which is where I, and I expect most other people, see this issue. It could also be extended to the other databases.

I have tested this on OmniOS 151006, which has the same NSS code as Illumos-gate, and I assume this is still the same in Solaris proper.
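
The status change described in this comment can be sketched roughly as follows. This is not the attached patch, whose contents are not reproduced here. The enum and the function name sketch_map_group_status are stand-ins invented for this illustration, and the real NSS status constants come from the platform and winbind headers.

```c
#include <errno.h>

/* Stand-ins for the NSS status codes. These names and values are
 * assumptions for this sketch, not the real header definitions. */
enum sketch_nss_status {
    SKETCH_NSS_SUCCESS,
    SKETCH_NSS_NOTFOUND,
    SKETCH_NSS_UNAVAIL,
    SKETCH_NSS_TRYAGAIN
};

/* Translate "try again, buffer too small" into "unavailable" so the
 * caller stops retrying with the same undersized buffer.
 * Every other status/errno combination is passed through unchanged. */
enum sketch_nss_status
sketch_map_group_status(enum sketch_nss_status status, int err)
{
    if (status == SKETCH_NSS_TRYAGAIN && err == ERANGE) {
        return SKETCH_NSS_UNAVAIL;
    }
    return status;
}
```

Only the ERANGE case is rewritten, so a TRYAGAIN caused by a different error still reaches the caller as a retryable status.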
Comment 3 Björn Jacke 2015-09-11 10:35:10 UTC
This is fixed in master with commit d3e51b9cfe3d56530253571e020af72da1877044.

Ralf: can you review+ for the cherry-pick to the release branches?
Comment 4 Karolin Seeger 2015-09-15 09:09:24 UTC
Pushed to autobuild-v4-[2|3]-test.
Comment 5 Karolin Seeger 2015-10-05 07:08:14 UTC
(In reply to Karolin Seeger from comment #4)
Pushed to both branches.
Closing out bug report.

Thanks!