Bug 1237 - Performance issue with samr_query_dom_info level 2 breaks winbind
Summary: Performance issue with samr_query_dom_info level 2 breaks winbind
Status: CLOSED FIXED
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: File Services
Version: 3.0.2a
Hardware: All All
Importance: P3 normal
Target Milestone: none
Assignee: Samba Bugzilla Account
QA Contact:
URL:
Keywords:
Duplicates: 1243
Depends on:
Blocks:
 
Reported: 2004-04-02 10:04 UTC by John Janosik
Modified: 2005-08-24 10:16 UTC
CC List: 0 users

See Also:


Description John Janosik 2004-04-02 10:04:21 UTC
We are testing the migration of our NT4 domain to a Samba 3 domain.  Our member
servers are currently running Samba 2.2.8a on Red Hat Linux 9.  After migrating
~800 groups and ~8000 user/computer accounts to the Samba 3 DC with an OpenLDAP
backend, we cannot successfully start winbindd on a Samba 2.2.8a member server
joined to the migrated domain.

In a network trace between the member server and the DC I see a query_dom_info
level 2 call that never completes.  Looking in rpc_server/srv_samr_nt.c I see
that Samba is loading each sam entry and iterating through them to get a count
of users and groups.

I worked around the problem by adding two new functions to the ldap passdb
backend, pdb_getusercount and pdb_getgroupcount.  These just do one ldap query
to get the number of groups or users.
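
The difference between the two approaches can be sketched as follows.  This is a
minimal, self-contained illustration and not the actual patch: struct
sam_backend, load_entry, count_by_iteration and count_by_query are hypothetical
names, and an in-memory array stands in for the LDAP directory.  It only shows
that counting by iteration decodes every entry, while a dedicated count call
does not.

```c
#include <stddef.h>

/* Hypothetical stand-in for one SAM entry; the real Samba structures differ. */
struct sam_entry {
    const char *name;
    int is_group;            /* 1 = group, 0 = user or computer account */
};

/* Hypothetical backend: a flat array plays the role of the directory. */
struct sam_backend {
    const struct sam_entry *entries;
    size_t num_entries;
    size_t entries_decoded;  /* how many full entries were loaded and decoded */
};

/* Simulates loading and decoding one full entry, the expensive step. */
static const struct sam_entry *load_entry(struct sam_backend *be, size_t i)
{
    be->entries_decoded++;
    return &be->entries[i];
}

/* Slow path: load every entry just to count the ones of one kind. */
size_t count_by_iteration(struct sam_backend *be, int want_groups)
{
    size_t count = 0;
    for (size_t i = 0; i < be->num_entries; i++) {
        const struct sam_entry *e = load_entry(be, i);
        if (!!e->is_group == !!want_groups)
            count++;
    }
    return count;
}

/* Fast path: stands in for a single filtered search whose matches are
 * counted without decoding them into full account structures. */
size_t count_by_query(const struct sam_backend *be, int want_groups)
{
    size_t count = 0;
    for (size_t i = 0; i < be->num_entries; i++) {
        if (!!be->entries[i].is_group == !!want_groups)
            count++;
    }
    return count;
}
```

Both functions return the same number; only count_by_iteration increments
entries_decoded, which is the per-entry cost that grows with the size of the
domain.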

Is there a better way to approach this problem?  I could probably tune my
OpenLDAP server better to avoid the timeout, but I worry that won't work if we
want to scale to hundreds of thousands of users.
Comment 1 John Janosik 2004-04-08 09:47:15 UTC
I ran some tests comparing the response times of tdbsam and ldapsam when using NT
usermgr.  I used the same machine, a dual CPU 500MHz Pentium II with 512MB of
memory, for both tests.  I added some temporary debug output in
load_sampwd_entries because that is where most of the time seemed to be spent.
The log level was set to 0.

With tdbsam:

[2004/04/08 10:51:04.814008, 0] rpc_server/srv_samr_nt.c:load_sampwd_entries(231)
  load_sampwd_entries: Calling pdb_setsampwent
[2004/04/08 10:51:04.892743, 0] rpc_server/srv_samr_nt.c:load_sampwd_entries(236)
  load_sampwd_entries: Finished pdb_setsampwent
[2004/04/08 10:51:05.743398, 0] rpc_server/srv_samr_nt.c:load_sampwd_entries(279)
  load_sampwd_entries: pdb_endsampwent done


With ldapsam:

[2004/04/08 11:06:31.180032, 0] rpc_server/srv_samr_nt.c:load_sampwd_entries(231)
  load_sampwd_entries: Calling pdb_setsampwent
[2004/04/08 11:06:41.604726, 0] rpc_server/srv_samr_nt.c:load_sampwd_entries(236)
  load_sampwd_entries: Finished pdb_setsampwent
[2004/04/08 11:06:53.767643, 0] rpc_server/srv_samr_nt.c:load_sampwd_entries(279)
  load_sampwd_entries: pdb_endsampwent done
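
Subtracting the timestamps above gives roughly 0.08 s for pdb_setsampwent and
0.85 s for the iteration with tdbsam, against roughly 10.4 s and 12.2 s with
ldapsam.  The small helper below is one way to reproduce that arithmetic.  It is
only a sketch: log_time_us and elapsed_us are hypothetical names, and the code
assumes both lines carry the "[YYYY/MM/DD HH:MM:SS.uuuuuu" prefix shown above,
with a six-digit fraction and the same calendar day.

```c
#include <stdio.h>

/* Parse the "[YYYY/MM/DD HH:MM:SS.uuuuuu" prefix of a log header line.
 * Returns microseconds since midnight, or -1 if the prefix does not match.
 * Assumes the fractional part has exactly six digits. */
long long log_time_us(const char *line)
{
    int year, month, day, hour, min, sec, usec;

    if (sscanf(line, "[%d/%d/%d %d:%d:%d.%6d",
               &year, &month, &day, &hour, &min, &sec, &usec) != 7)
        return -1;

    return (hour * 3600LL + min * 60LL + sec) * 1000000LL + usec;
}

/* Elapsed microseconds between two log lines written on the same day.
 * Returns -1 if either line cannot be parsed. */
long long elapsed_us(const char *start_line, const char *end_line)
{
    long long start = log_time_us(start_line);
    long long end = log_time_us(end_line);

    if (start < 0 || end < 0)
        return -1;
    return end - start;
}
```

For example, passing the first two ldapsam header lines to elapsed_us yields
10424694 microseconds, the time spent in pdb_setsampwent.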

I think I can improve the speed of calling pdb_setsampwent by tuning my OpenLDAP
server, but not the speed of iterating through the entries.  It looks to me like
the reason for the longer time to iterate through all the sampwd entries is the
difference between the functions called by pdb_getsampwent in each backend.  My
guess is that init_sam_from_ldap in passdb/pdb_ldap.c is not as efficient as
init_sam_from_buffer_v1 in passdb/passdb.c.
Comment 2 Gerald (Jerry) Carter (dead mail address) 2004-04-22 20:31:32 UTC
*** Bug 1243 has been marked as a duplicate of this bug. ***
Comment 3 John Janosik 2004-12-30 09:05:02 UTC
This problem was fixed in svn by Guenther Deschner.  He found there is a level 8
query_dom_info RPC that just returns the domain sequence number.
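
Why the level matters can be sketched as below.  This is a hypothetical model,
not Samba's server code: struct domain_state, struct dom_info_reply and
query_dom_info_sketch are invented for illustration, and the assumption is that
a caller which only needs the sequence number can ask for level 8 and skip the
account enumeration that made level 2 slow.

```c
#include <stddef.h>

/* Hypothetical server-side state for one domain. */
struct domain_state {
    unsigned long long seq_num;  /* domain sequence number */
    size_t num_accounts;         /* accounts stored in the backend */
    size_t entries_loaded;       /* work counter: entries walked so far */
};

/* Hypothetical reply; has_counts tells whether num_accounts is valid. */
struct dom_info_reply {
    unsigned long long seq_num;
    size_t num_accounts;
    int has_counts;
};

/* Returns 0 on success, -1 for a level this sketch does not model. */
int query_dom_info_sketch(struct domain_state *dom, int level,
                          struct dom_info_reply *out)
{
    switch (level) {
    case 2:
        /* Models the slow path from this report: every entry is walked
         * to produce the counts. */
        dom->entries_loaded += dom->num_accounts;
        out->num_accounts = dom->num_accounts;
        out->seq_num = dom->seq_num;
        out->has_counts = 1;
        return 0;
    case 8:
        /* Only the sequence number is returned; no enumeration. */
        out->seq_num = dom->seq_num;
        out->num_accounts = 0;
        out->has_counts = 0;
        return 0;
    default:
        return -1;
    }
}
```

In this model a level 8 call leaves entries_loaded unchanged, so its cost does
not depend on how many accounts the domain holds.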
Comment 4 Gerald (Jerry) Carter (dead mail address) 2005-02-07 07:49:40 UTC
Marking as fixed.  Originally reported against 3.0.pre1.  Fixed in 3.0.3.
Comment 5 Gerald (Jerry) Carter (dead mail address) 2005-08-24 10:16:22 UTC
Sorry for the spam; cleaning up the database to prevent unnecessary reopens of bugs.