In an Active Directory configuration using idmap_ad I have about 8500 users in four realms, of which only one has uidNumber, gidNumber, etc. set. If I do a 'getent passwd <username>' for that user it lists fine. However 'getent passwd' (i.e. listing all users) does not list that user.

It seems to me that two problems are causing this:

1) The loop in winbindd_getpwent() (winbindd_user.c) does not deal well with the case where winbindd_fill_pwent() returns an error

2) libnss_winbindd imposes an overall 30-second timeout on any transaction, which in my case is far too low (and at very least, defeats use of long 'ldap timeout' values)

The two patches below (produced by Red Hat's 'gendiff') have fixed my problem. I have taken the liberty of raising the debug level at which 'could not lookup domain user' messages are logged since in my application many thousands are produced, very frequently.

--- samba-3.0.21c/source/nsswitch/winbindd_user.c.sparse_users 2006-04-05 10:22:20.000000000 +0100
+++ samba-3.0.21c/source/nsswitch/winbindd_user.c 2006-04-05 13:20:17.000000000 +0100
@@ -75,7 +75,7 @@
 {
 	fstring output_username;
 	fstring sid_string;
-
+
 	if (!pw || !dom_name || !user_name)
 		return False;

@@ -656,7 +656,7 @@

 	/* Start sending back users */

-	for (i = 0; i < num_users; i++) {
+	for (i = 0; i < num_users;) {
 		struct getpwent_user *name_list = NULL;
 		uint32 result;

@@ -710,8 +710,10 @@
 			state->response.length +=
 				sizeof(struct winbindd_pw);

+			i++;
+
 		} else
-			DEBUG(1, ("could not lookup domain user %s\n",
+			DEBUG(3, ("could not lookup domain user %s\n",
 				  name_list[ent->sam_entry_index].name));
 	}

--- samba-3.0.21c/source/nsswitch/wb_common.c.indefinite_wait 2006-04-05 12:05:08.000000000 +0100
+++ samba-3.0.21c/source/nsswitch/wb_common.c 2006-04-05 12:05:49.000000000 +0100
@@ -409,7 +409,7 @@
 static int read_sock(void *buffer, int count)
 {
 	int result = 0, nread = 0;
-	int total_time = 0, selret;
+	int selret;

 	/* Read data from socket */
 	while(nread < count) {
@@ -432,12 +432,6 @@

 		if (selret == 0) {
 			/* Not ready for read yet... */
-			if (total_time >= 30) {
-				/* Timeout */
-				close_sock();
-				return -1;
-			}
-			total_time += 5;
 			continue;
 		}
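
To illustrate the idea behind the first patch outside of the Samba tree, here is a minimal sketch of a batch-fill loop that counts only successful lookups, so that failed entries do not use up slots in the reply. Everything in it is hypothetical: struct entry, fill_entry() (standing in for winbindd_fill_pwent()) and fill_batch() are made-up names, and it assumes the source cursor advances on every attempt whether or not the lookup succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directory entry; not a Samba type. */
struct entry {
	const char *name;
	bool has_uid;	/* false models a user without uidNumber set */
};

/* Hypothetical stand-in for winbindd_fill_pwent(): fails for users
 * that have no POSIX attributes. */
static bool fill_entry(const struct entry *e, const char **out)
{
	if (!e->has_uid)
		return false;
	*out = e->name;
	return true;
}

/* Copy up to max_out resolvable names into out[]. *cursor is the
 * position in src[] and is kept by the caller between calls. Only
 * successful lookups count towards max_out; the cursor moves on
 * after every attempt, so the loop always terminates. */
static size_t fill_batch(const struct entry *src, size_t n_src,
			 size_t *cursor, const char **out, size_t max_out)
{
	size_t filled = 0;

	assert(src != NULL && cursor != NULL && out != NULL);

	while (filled < max_out && *cursor < n_src) {
		if (fill_entry(&src[*cursor], &out[filled]))
			filled++;
		(*cursor)++;
	}
	return filled;
}
```

With a list where most entries fail, a request for two users still returns two users if that many resolvable ones remain, and the next call carries on from the saved cursor.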
This looks like the same problem reported in BZ#3024
*** This bug has been marked as a duplicate of 3024 ***
(In reply to comment #2)
> *** This bug has been marked as a duplicate of 3024 ***

The first part is certainly a duplicate, and the fix proposed in 3024 is more complete than the one here, but the fixed 30-second timeout in 'getent' won't be addressed simply by fixing the loop. Maybe we should create a new BZ for that problem alone?
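
As a starting point for that discussion, here is a rough sketch of a middle ground between the fixed 30-second limit and the indefinite wait in my patch: the caller passes the limit in, and a negative value means wait forever. This is not the wb_common.c code. read_with_timeout() and its timeout_secs parameter are made-up names, where the value would come from is left open, and unlike read_sock() the sketch does not close the socket on timeout.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly count bytes from fd, waiting in slices of at most
 * 5 seconds. timeout_secs < 0 waits indefinitely; otherwise the call
 * gives up once timeout_secs seconds have passed with no data.
 * Returns the number of bytes read, or -1 on error, EOF or timeout. */
static ssize_t read_with_timeout(int fd, void *buffer, size_t count,
				 int timeout_secs)
{
	size_t nread = 0;
	int waited = 0;

	assert(fd >= 0 && fd < FD_SETSIZE);

	while (nread < count) {
		fd_set r_fds;
		struct timeval tv;
		int slice = 5;
		int selret;
		ssize_t result;

		/* Never wait past the caller's limit. */
		if (timeout_secs >= 0 && timeout_secs - waited < slice)
			slice = timeout_secs - waited;

		FD_ZERO(&r_fds);
		FD_SET(fd, &r_fds);
		tv.tv_sec = slice;
		tv.tv_usec = 0;

		selret = select(fd + 1, &r_fds, NULL, NULL, &tv);
		if (selret == -1) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (selret == 0) {
			/* Not ready for read yet... */
			waited += slice;
			if (timeout_secs >= 0 && waited >= timeout_secs)
				return -1;	/* Timeout */
			continue;
		}

		result = read(fd, (char *)buffer + nread, count - nread);
		if (result == -1 && errno == EINTR)
			continue;
		if (result <= 0)
			return -1;	/* Error or EOF */
		nread += (size_t)result;
	}
	return (ssize_t)nread;
}
```

Passing 30 roughly matches the current behaviour, and passing -1 matches the patch above. Note that the elapsed time is counted across the whole call, as in the existing code, not per chunk of data received.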