Bug 1428 - Winbind and rpc_client timeout on large domains using LDAP
Status: RESOLVED FIXED
Product: Samba 3.0
Classification: Unclassified
Component: winbind
Version: 3.0.4
Hardware: All Linux
Importance: P3 critical
Target Milestone: none
Assigned To: Samba Bugzilla Account
QA Contact: Samba QA Contact
Reported: 2004-06-04 07:33 UTC by Nuno Beirão
Modified: 2005-09-29 09:07 UTC (History)

Description Nuno Beirão 2004-06-04 07:33:31 UTC
Hi,

I have a Samba domain with about 8000 users running with an LDAP backend. I
recently migrated from 2.2.8a to 3.0.4 and hadn't had any problems until now.
I'm trying to join a Samba server (3.0.4) to my domain as a member server and
use winbind for RID-to-UID mapping.
I joined the domain, started winbind, and tried to test winbind using wbinfo.

# wbinfo -t
checking the trust secret via RPC calls succeeded

# wbinfo -g
BUILTIN\System Operators
BUILTIN\Replicators
BUILTIN\Guests
BUILTIN\Power Users
BUILTIN\Print Operators
BUILTIN\Administrators
BUILTIN\Account Operators
BUILTIN\Backup Operators
BUILTIN\Users

# wbinfo -u
Error looking up domain users

****************************************
smb.conf on member server:

[global]
   workgroup = LOTR
   server string = Samba Server
   netbios name = ANCORA
   security = domain

   idmap uid = 10000-20000
   idmap gid = 10000-20000
   winbind enum users = yes
   winbind enum groups = yes

   log level = 1 winbind:10
   log file = /var/log/samba/%m.log
   max log size = 50

   socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192
   wins server = 193.136.185.8

# Share Definitions
[homes]
   comment = Home Directories
   browseable = no
   writable = yes
****************************************

When I run "net rpc info", I get the following output:

# net rpc info
[2004/06/04 15:22:04, 0] rpc_client/cli_pipe.c:rpc_api_pipe(435)
  cli_pipe: return critical error. Error was Call timed out: server did not respond after 10000 milliseconds
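
For anyone who wants to pull these timeouts out of a longer log, here is a minimal Python sketch. It assumes the header layout seen in the logs in this report (`[date time, level] file.c:function(line)` followed by indented message lines); the names `HEADER`, `TIMEOUT` and `find_timeouts` are made up for illustration and are not part of Samba. Message lines are re-joined before matching, so a message wrapped at a word boundary is still found; a message split in the middle of a word would not be.

```python
import re

# Header layout as it appears in the logs quoted in this report:
# [YYYY/MM/DD HH:MM:SS, LEVEL] path/file.c:function(LINE)
HEADER = re.compile(
    r"^\[(?P<stamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}),\s*(?P<level>\d+)\]\s+"
    r"(?P<where>\S+?):(?P<func>\w+)\((?P<line>\d+)\)\s*$"
)
TIMEOUT = re.compile(r"did not respond after (\d+) milliseconds")


def find_timeouts(log_text):
    """Return (timestamp, function, timeout_ms) for each timed-out call."""
    results = []
    current = None  # parsed header of the entry being collected
    body = []       # message lines that belong to that entry

    def flush():
        if current is None:
            return
        # Collapse whitespace so a message wrapped across lines still matches.
        message = " ".join(" ".join(body).split())
        hit = TIMEOUT.search(message)
        if hit:
            results.append((current["stamp"], current["func"], int(hit.group(1))))

    for raw in log_text.splitlines():
        match = HEADER.match(raw)
        if match:
            flush()  # finish the previous entry before starting a new one
            current, body = match.groupdict(), []
        else:
            body.append(raw)
    flush()
    return results


sample = (
    "[2004/06/04 15:22:04, 0] rpc_client/cli_pipe.c:rpc_api_pipe(435)\n"
    "  cli_pipe: return critical error. Error was Call timed out: server did not \n"
    "respond after 10000 milliseconds\n"
)
print(find_timeouts(sample))
```

Run against the "net rpc info" output above, this reports a single 10000 ms timeout raised from rpc_api_pipe.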

After much trial and error, I tried limiting the number of users to 6000 using
the "ldap filter" parameter in smb.conf on the PDC server, and it now works
perfectly:

# wbinfo -g
BUILTIN\System Operators
BUILTIN\Replicators
BUILTIN\Guests
BUILTIN\Power Users
BUILTIN\Print Operators
BUILTIN\Administrators
BUILTIN\Account Operators
BUILTIN\Backup Operators
BUILTIN\Users
LOTR\Staff
LOTR\Domain Admins
LOTR\Domain Users
LOTR\Domain Guests
LOTR\Administrators
LOTR\Users
LOTR\Guests
LOTR\Power Users
LOTR\Account Operators
LOTR\Server Operators
LOTR\Print Operators
LOTR\Backup Operators
LOTR\Replicator
LOTR\Domain Computers

and "net rpc info" works as expected.

**********************************************
This is a level 10 winbind log taken after running wbinfo -g:

[2004/06/04 15:17:03, 3] nsswitch/winbindd_util.c:add_trusted_domain(173)
  add_trusted_domain: LOTR is an NT4  domain
[2004/06/04 15:17:03, 1] nsswitch/winbindd_util.c:add_trusted_domain(180)
  Added domain LOTR  S-0-0
[2004/06/04 15:17:03, 3] nsswitch/winbindd_cm.c:cm_get_ipc_userpass(107)
  IPC$ connections done by user LOTR\root
[2004/06/04 15:17:03, 5] nsswitch/winbindd_cm.c:cm_open_connection(277)
  connecting to CAVADO from ANCORA with username [LOTR]\[root]
[2004/06/04 15:17:03, 5] nsswitch/winbindd_cache.c:get_cache(131)
  get_cache: Setting MS-RPC methods for domain LOTR
[2004/06/04 15:17:03, 10] nsswitch/winbindd_cache.c:wcache_flush_cache(66)
  wcache_flush_cache success
[2004/06/04 15:17:03, 10] nsswitch/winbindd_cache.c:alternate_name(1326)
  alternate_name: [Cached] - doing backend query for info for domain LOTR
[2004/06/04 15:17:03, 5] nsswitch/winbindd_util.c:add_trusted_domains(207)
  scanning trusted domain list
[2004/06/04 15:17:03, 10] nsswitch/winbindd_cache.c:trusted_domains(1301)
  trusted_domains: [Cached] - doing backend query for info for domain LOTR
[2004/06/04 15:17:03, 3] nsswitch/winbindd_rpc.c:trusted_domains(925)
  rpc: trusted_domains
[2004/06/04 15:17:03, 3] nsswitch/winbindd_cm.c:cm_get_ipc_userpass(107)
  IPC$ connections done by user LOTR\root
[2004/06/04 15:17:03, 5] nsswitch/winbindd_cm.c:cm_open_connection(277)
  connecting to CAVADO from ANCORA with username [LOTR]\[root]
[2004/06/04 15:17:04, 3] nsswitch/winbindd_util.c:add_trusted_domain(173)
  add_trusted_domain: BUILTIN is an NT4  domain
[2004/06/04 15:17:04, 1] nsswitch/winbindd_util.c:add_trusted_domain(180)
  Added domain BUILTIN  S-1-5-32
[2004/06/04 15:17:04, 3] nsswitch/winbindd_util.c:add_trusted_domain(173)
  add_trusted_domain: ANCORA is an NT4  domain
[2004/06/04 15:17:04, 1] nsswitch/winbindd_util.c:add_trusted_domain(180)
  Added domain ANCORA  S-1-5-21-1150505906-3358595082-1800563953
[2004/06/04 15:17:04, 5] nsswitch/winbindd_util.c:add_trusted_domains(207)
  scanning trusted domain list
[2004/06/04 15:17:04, 10] nsswitch/winbindd_cache.c:trusted_domains(1301)
  trusted_domains: [Cached] - doing backend query for info for domain LOTR
[2004/06/04 15:17:04, 3] nsswitch/winbindd_rpc.c:trusted_domains(925)
  rpc: trusted_domains
[2004/06/04 15:17:04, 10] nsswitch/winbindd_util.c:open_winbindd_socket(673)
  open_winbindd_socket: opened socket fd 16
[2004/06/04 15:17:04, 10] nsswitch/winbindd_util.c:open_winbindd_priv_socket(685)
  open_winbindd_priv_socket: opened socket fd 17
[2004/06/04 15:17:33, 6] nsswitch/winbindd.c:new_connection(343)
  accepted socket 18
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:winbind_client_read(458)
  client_read: read 1824 bytes. Need 0 more for a full request.
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:process_request(308)
  process_request: request fn INTERFACE_VERSION
[2004/06/04 15:17:33, 3] nsswitch/winbindd_misc.c:winbindd_interface_version(261)
  [ 5265]: request interface version
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:client_write(512)
  client_write: wrote 1300 bytes.
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:winbind_client_read(458)
  client_read: read 1824 bytes. Need 0 more for a full request.
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:process_request(308)
  process_request: request fn WINBINDD_PRIV_PIPE_DIR
[2004/06/04 15:17:33, 3] nsswitch/winbindd_misc.c:winbindd_priv_pipe_dir(297)
  [ 5265]: request location of privileged pipe
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:client_write(512)
  client_write: wrote 1300 bytes.
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:client_write(557)
  client_write: need to write 37 extra data bytes.
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:client_write(512)
  client_write: wrote 37 bytes.
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:client_write(546)
  client_write: client_write: complete response written.
[2004/06/04 15:17:33, 6] nsswitch/winbindd.c:new_connection(343)
  accepted socket 19
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:winbind_client_read(458)
  client_read: read 0 bytes. Need 1824 more for a full request.
[2004/06/04 15:17:33, 5] nsswitch/winbindd.c:winbind_client_read(465)
  read failed on sock 18, pid 5265: EOF
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:winbind_client_read(458)
  client_read: read 1824 bytes. Need 0 more for a full request.
[2004/06/04 15:17:33, 10] nsswitch/winbindd.c:process_request(308)
  process_request: request fn LIST_GROUPS
[2004/06/04 15:17:33, 3] nsswitch/winbindd_group.c:winbindd_list_groups(848)
  [ 5265]: list groups
[2004/06/04 15:17:33, 4] nsswitch/winbindd_group.c:get_sam_group_entries(564)
  get_sam_group_entries: Native Mode 2k domain; enumerating local groups as well
[2004/06/04 15:17:33, 4] nsswitch/winbindd_group.c:get_sam_group_entries(573)
  get_sam_group_entries: Returned 0 local groups
[2004/06/04 15:17:33, 4] nsswitch/winbindd_group.c:get_sam_group_entries(564)
  get_sam_group_entries: Native Mode 2k domain; enumerating local groups as well
[2004/06/04 15:17:33, 4] nsswitch/winbindd_group.c:get_sam_group_entries(573)
  get_sam_group_entries: Returned 9 local groups
[2004/06/04 15:17:33, 10] nsswitch/winbindd_cache.c:fetch_cache_seqnum(272)
  fetch_cache_seqnum: invalid data size key [SEQNUM/LOTR]
[2004/06/04 15:17:33, 10] nsswitch/winbindd_rpc.c:sequence_number(850)
  rpc: fetch sequence_number for LOTR
[2004/06/04 15:17:33, 3] nsswitch/winbindd_cm.c:cm_get_ipc_userpass(107)
  IPC$ connections done by user LOTR\root
[2004/06/04 15:17:33, 5] nsswitch/winbindd_cm.c:cm_open_connection(277)
  connecting to CAVADO from ANCORA with username [LOTR]\[root]
[2004/06/04 15:17:43, 0] rpc_client/cli_pipe.c:rpc_api_pipe(435)
  cli_pipe: return critical error. Error was Call timed out: server did not respond after 10000 milliseconds
[2004/06/04 15:17:43, 10] nsswitch/winbindd_rpc.c:sequence_number(899)
  domain_sequence_number: failed to get sequence number (4294967295) for domain LOTR
[2004/06/04 15:17:43, 0] rpc_client/cli_pipe.c:rpc_api_pipe(435)
  cli_pipe: return critical error. Error was Call timed out: server did not respond after 10000 milliseconds
[2004/06/04 15:17:43, 10] nsswitch/winbindd_cache.c:store_cache_seqnum(325)
  store_cache_seqnum: success [LOTR][4294967295 @ 1086358663]
[2004/06/04 15:17:43, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(380)
  refresh_sequence_number: LOTR seq number is now -1
[2004/06/04 15:17:43, 3] nsswitch/winbindd_group.c:get_sam_group_entries(539)
  get_sam_group_entries: could not enumerate domain groups! Error: NT_STATUS_UNSUCCESSFUL
[2004/06/04 15:17:43, 10] nsswitch/winbindd.c:client_write(512)
  client_write: wrote 1300 bytes.
[2004/06/04 15:17:43, 10] nsswitch/winbindd.c:client_write(557)
  client_write: need to write 192 extra data bytes.
[2004/06/04 15:17:43, 10] nsswitch/winbindd.c:client_write(512)
  client_write: wrote 192 bytes.
[2004/06/04 15:17:43, 10] nsswitch/winbindd.c:client_write(546)
  client_write: client_write: complete response written.
[2004/06/04 15:17:43, 10] nsswitch/winbindd.c:winbind_client_read(458)
  client_read: read 0 bytes. Need 1824 more for a full request.
[2004/06/04 15:17:43, 5] nsswitch/winbindd.c:winbind_client_read(465)
  read failed on sock 19, pid 5265: EOF
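
A note on the numbers in the log above: the sequence number 4294967295 and the "-1" printed by refresh_sequence_number are the same 32-bit value, read once as unsigned and once as signed. The short Python sketch below shows that arithmetic. It also decodes the number after the "@" in the store_cache_seqnum line, on the assumption (mine, not confirmed from the Samba source) that it is a Unix timestamp recording when the entry was stored. The helper name `as_signed32` is made up for illustration.

```python
import struct
from datetime import datetime, timezone


def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as signed two's complement."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


failed_seqnum = 4294967295  # value logged by sequence_number() after the timeout
stored_at = 1086358663      # number after "@" in the store_cache_seqnum line

# All 32 bits set: the unsigned reading of what the next log line prints as -1.
print(hex(failed_seqnum), as_signed32(failed_seqnum))

# Assumption: the "@" value is seconds since the Unix epoch.
print(datetime.fromtimestamp(stored_at, tz=timezone.utc).isoformat())
```

Under that assumption the stored time is 2004-06-04 14:17:43 UTC, which lines up with the 15:17:43 log entries if the server clock was one hour ahead of UTC.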

I have also tried Samba 3.0.5pre1 on both the PDC and the member server and
got the same results.
Comment 1 Gerald (Jerry) Carter 2005-09-29 09:07:00 UTC
Please retest against 3.0.20a (the current SAMBA_3_0_RELEASE branch), which will
be publicly available next week.