3660 – winbindd getpwent does not handle long and sparse user lists well

Summary: winbindd getpwent does not handle long and sparse user lists well

Status:	RESOLVED DUPLICATE of bug 3024

Alias:	None

Product:	Samba 3.0
Classification:	Unclassified
Component:	winbind
Version:	3.0.21c
Hardware:	x86 Linux

Importance:	P3 normal
Target Milestone:	none
Assignee:	Samba Bugzilla Account
QA Contact:	Samba QA Contact

URL:
Keywords:

Depends on:	3024
Blocks:

Reported:	2006-04-05 08:40 UTC by Bob Gautier (550 Unknown Recipient)
Modified:	2006-04-21 07:30 UTC
CC List:	0 users

See Also:

Description Bob Gautier (550 Unknown Recipient) 2006-04-05 08:40:33 UTC

In an Active Directory configuration using idmap_ad, I have about 8500 users in four realms, of which only one user has uidNumber, gidNumber, etc. set.  Running 'getent passwd <username>' for that user lists the entry correctly.  However, 'getent passwd' (i.e. listing all users) does not list that user.

It seems to me that two problems are causing this:

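1) The loop in winbindd_getpwent() (winbindd_user.c) does not deal well with the case where winbindd_fill_pwent() returns an error: the loop counter is incremented even when the lookup fails, so failed lookups use up the requested number of entries and valid users can be left out of the response.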

2) libnss_winbind imposes an overall 30-second timeout on any transaction, which in my case is far too low (and, at the very least, defeats the use of long 'ldap timeout' values).

The two patches below (produced by Red Hat's 'gendiff') have fixed my problem.  I have also taken the liberty of raising the debug level at which 'could not lookup domain user' messages are logged (from 1 to 3), since my setup produces many thousands of them very frequently.

--- samba-3.0.21c/source/nsswitch/winbindd_user.c.sparse_users	2006-04-05 10:22:20.000000000 +0100
+++ samba-3.0.21c/source/nsswitch/winbindd_user.c	2006-04-05 13:20:17.000000000 +0100
@@ -75,7 +75,7 @@
 {
 	fstring output_username;
 	fstring sid_string;
-	
+
 	if (!pw || !dom_name || !user_name)
 		return False;
 	
@@ -656,7 +656,7 @@
 
 	/* Start sending back users */
 
-	for (i = 0; i < num_users; i++) {
+	for (i = 0; i < num_users;) {
 		struct getpwent_user *name_list = NULL;
 		uint32 result;
 
@@ -710,8 +710,10 @@
 			state->response.length += 
 				sizeof(struct winbindd_pw);
 
+			i++;
+
 		} else
-			DEBUG(1, ("could not lookup domain user %s\n",
+			DEBUG(3, ("could not lookup domain user %s\n",
 				  name_list[ent->sam_entry_index].name));
 	}
 
--- samba-3.0.21c/source/nsswitch/wb_common.c.indefinite_wait	2006-04-05 12:05:08.000000000 +0100
+++ samba-3.0.21c/source/nsswitch/wb_common.c	2006-04-05 12:05:49.000000000 +0100
@@ -409,7 +409,7 @@
 static int read_sock(void *buffer, int count)
 {
 	int result = 0, nread = 0;
-	int total_time = 0, selret;
+	int selret;
 
 	/* Read data from socket */
 	while(nread < count) {
@@ -432,12 +432,6 @@
 		
 		if (selret == 0) {
 			/* Not ready for read yet... */
-			if (total_time >= 30) {
-				/* Timeout */
-				close_sock();
-				return -1;
-			}
-			total_time += 5;
 			continue;
 		}
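
The effect of the first patch can be illustrated outside winbindd with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in and not Samba code: struct entry models a directory user, fill_entry() plays the role of winbindd_fill_pwent(), and fill_batch() plays the role of the patched loop. The point it shows is only that the output count advances on success while the cursor into the user list advances on every attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory user; has_uid models whether
 * uidNumber/gidNumber are set for that user. */
struct entry {
	const char *name;
	bool has_uid;
};

/* Hypothetical stand-in for winbindd_fill_pwent(): the lookup fails
 * for users without POSIX attributes. */
static bool fill_entry(const struct entry *e, const char **out)
{
	if (!e->has_uid)
		return false;
	*out = e->name;
	return true;
}

/* Fill up to max_out slots of out[]. *cursor advances for every entry
 * examined, but the filled count advances only on success, so failed
 * lookups do not use up slots in the batch. */
static size_t fill_batch(const struct entry *list, size_t list_len,
			 size_t *cursor, const char **out, size_t max_out)
{
	size_t filled = 0;

	assert(list != NULL && cursor != NULL && out != NULL);

	while (filled < max_out && *cursor < list_len) {
		const struct entry *e = &list[(*cursor)++];

		if (fill_entry(e, &out[filled]))
			filled++;
	}
	return filled;
}
```

With a sparse list of five users in which only the third and fifth have has_uid set, a call with max_out set to 2 skips the three failing entries and returns both valid users. A loop that counted every attempt against the batch would have stopped after the first two failures and returned nothing.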

Comment 1 Bob Gautier (550 Unknown Recipient) 2006-04-10 04:08:20 UTC

This looks like the same problem as the one reported in BZ#3024.

Comment 2 Gerald (Jerry) Carter (dead mail address) 2006-04-20 07:51:19 UTC


*** This bug has been marked as a duplicate of 3024 ***

Comment 3 Bob Gautier (550 Unknown Recipient) 2006-04-21 07:30:17 UTC

(In reply to comment #2)
> *** This bug has been marked as a duplicate of 3024 ***

The first part is certainly a duplicate, and the fix proposed in 3024 is more complete than the one here.  However, the hard-coded 30-second timeout that 'getent' runs into won't be addressed simply by fixing the loop.  Maybe we should create a new BZ for that problem alone?
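
As a starting point for such a report, here is a rough sketch of a middle ground between the hard-coded limit and the indefinite wait in the patch above: the caller passes the limit in. This is an assumption about one possible shape for a fix, not Samba's API. The name read_with_limit(), the max_wait_secs parameter and the POLL_INTERVAL_SECS constant are all hypothetical; only select() and read() are real POSIX calls.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define POLL_INTERVAL_SECS 5	/* hypothetical polling interval */

/* Read exactly count bytes from fd into buffer.
 * max_wait_secs is the total time the call may spend waiting for data;
 * a negative value means wait indefinitely.
 * Returns the number of bytes read, or -1 on error, EOF or timeout. */
static ssize_t read_with_limit(int fd, void *buffer, size_t count,
			       int max_wait_secs)
{
	size_t nread = 0;
	int waited = 0;

	assert(buffer != NULL);

	while (nread < count) {
		fd_set r_fds;
		struct timeval tv;
		ssize_t result;
		int selret;

		FD_ZERO(&r_fds);
		FD_SET(fd, &r_fds);
		tv.tv_sec = POLL_INTERVAL_SECS;
		tv.tv_usec = 0;

		selret = select(fd + 1, &r_fds, NULL, NULL, &tv);
		if (selret == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (selret == 0) {
			/* Not ready for read yet */
			waited += POLL_INTERVAL_SECS;
			if (max_wait_secs >= 0 && waited >= max_wait_secs)
				return -1;	/* caller-supplied limit reached */
			continue;
		}

		result = read(fd, (char *)buffer + nread, count - nread);
		if (result == -1 && errno == EINTR)
			continue;
		if (result <= 0)
			return -1;	/* error or peer closed the socket */
		nread += (size_t)result;
	}
	return (ssize_t)nread;
}
```

Where the limit would come from (a configuration parameter, an environment variable, or a value derived from 'ldap timeout') is left open here; the sketch only shows that the limit does not have to be a compile-time constant.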