Bug 1186 - Repeated Failed Attempts to create user when AD down
Summary: Repeated Failed Attempts to create user when AD down
Status: RESOLVED LATER
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: winbind (show other bugs)
Version: 3.0.1
Hardware: All Solaris
: P2 major
Target Milestone: none
Assignee: Gerald (Jerry) Carter (dead mail address)
QA Contact:
URL:
Keywords:
Depends on: 1185
Blocks:
  Show dependency treegraph
 
Reported: 2004-03-12 12:47 UTC by John Klinger
Modified: 2005-11-14 09:25 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description John Klinger 2004-03-12 12:47:32 UTC
I've stumbled on a problem using Samba "security=ads" when the AD is down.
Basically, winbindd attempt to create users on it own, wasting reams of uids in
the process. The following is a reliable way to produce the problem.

1) Start a Samba server with security=ads, an OpenLDAP idmap backend, and a
"globalshare" share defined.
2) With everything running as it should, shut down your AD machine.
3) On a PC, attempt to access the globalshare: (eg: Start / Run / \\globalshare
)

A level 10 winbindd log will show an attempt to access the share by the PC user,
several failures to authenticate the user, then an attempt to create the user in
Samba. Since we are using AD for user accounts, the creation attempt fails.
However, the creation attempt still uses up the next available UID. This happens
*quickly*, eating about 30 UIDs a second.

Following is the text from my initial Samba newsgroup post:

First I'll give an abbreviated description of what I'm about to describe.
I hope this will make the next paragraph easier to follow:

with user testuser1 and domain TST:

0) client tries to access a global share
1) client GETPWNAM    request for tst+testuser1 to the Samba server
2) client GETPWNAM    request for TST+testuser1 to the Samba server
3) client GETPWNAM    request for TST+TESTUSER1 to the Samba server
4) client GETPWNAM    request for testuser1     to the Samba server
5) client GETPWNAM    request for TESTUSER1     to the Samba server
6) client CREATE_USER request for testuser1, which returns success and
   uses up a uid.
7) Process repeats since the winbindd server still can't authenticate
   using the AD -- including the CREATE_USER.

----

Now to elaborate:

There have been situations where a PC tries to access a Samba share.  In this
case, the user is an AD global user that is not yet in the openLDAP idmap
backend. I expect that even if it was in the idmap, the problem would still
exist.

At the time of these client requests, winbindd is up and running on the Samba
server, but received that Kerberos error above (we are at step 2 of the startup
loop, so winbindd can't talk with AD yet). The client tries five variations of
the user name, attempting to authenticate using the winbindd server. This
failing, it sends a CREATE_USER request to the winbindd server. The winbindd
server pulls the next available id from the openLDAP backend, incrementing the
openLDAP's next-available stored in uidNumber (<= issue!!), then calls
wb_storepwnam.  wb_storepwnam returns success (hmmmm, recall that we
authenticate to AD and don't have a working connection to AD). The client then
repeats this cycle, requesting authentication, then doing another CREATE_USER. 
This cycle is repeated roughly 30 times per second.

The very bad part of this is that while winbindd is up and can't connect to AD,
30 uids are lost every second since the CREATE_USER request increments the
openLDAP uidNumber every time it is called. We have a limited uid range, so this
is very painful for us. If we leave winbindd running 30 seconds only to find out
it didn't connect to the AD, we've lost 900 uids in the interim.

================================

Here is the section of the log.winbindd that is repeated 30 times a second, each
time with a new, consecutive uid in the wb_storepwnam output. The timestamp has
been removed for brevity and were all "2004/02/12 18:00:20".


[10] nsswitch/winbindd.c:client_write(511)
  client_write: wrote 1300 bytes.
[10] nsswitch/winbindd.c:client_write(556)
  client_write: need to write 512 extra data bytes.
[10] nsswitch/winbindd.c:client_write(511)
  client_write: wrote 512 bytes.
[10] nsswitch/winbindd.c:client_write(545)
  client_write: client_write: complete response written.
[10] nsswitch/winbindd.c:winbind_client_read(457)
  client_read: read 1568 bytes. Need 0 more for a full request.
[10] nsswitch/winbindd.c:process_request(307)
  process_request: request fn GETPWNAM
[3] nsswitch/winbindd_user.c:winbindd_getpwnam(113)
  [ 5263]: getpwnam tst+testuser1
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(343)
  refresh_sequence_number: TST time ok
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(368)
  refresh_sequence_number: TST seq number is now -1
[1] nsswitch/winbindd_user.c:winbindd_getpwnam(148)
  user 'testuser1' does not exist

[10] nsswitch/winbindd.c:client_write(511)
  client_write: wrote 1300 bytes.
[10] nsswitch/winbindd.c:winbind_client_read(457)
  client_read: read 1568 bytes. Need 0 more for a full request.
[10] nsswitch/winbindd.c:process_request(307)
  process_request: request fn GETPWNAM
[3] nsswitch/winbindd_user.c:winbindd_getpwnam(113)
  [ 5263]: getpwnam TST+testuser1
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(343)
  refresh_sequence_number: TST time ok
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(368)
  refresh_sequence_number: TST seq number is now -1
[1] nsswitch/winbindd_user.c:winbindd_getpwnam(148)
  user 'testuser1' does not exist

[10] nsswitch/winbindd.c:client_write(511)
  client_write: wrote 1300 bytes.
[10] nsswitch/winbindd.c:winbind_client_read(457)
  client_read: read 1568 bytes. Need 0 more for a full request.
[10] nsswitch/winbindd.c:process_request(307)
  process_request: request fn GETPWNAM
[3] nsswitch/winbindd_user.c:winbindd_getpwnam(113)
  [ 5263]: getpwnam TST+TESTUSER1
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(343)
  refresh_sequence_number: TST time ok
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(368)
  refresh_sequence_number: TST seq number is now -1
[1] nsswitch/winbindd_user.c:winbindd_getpwnam(148)
  user 'TESTUSER1' does not exist

[10] nsswitch/winbindd.c:client_write(511)
  client_write: wrote 1300 bytes.
[10] nsswitch/winbindd.c:winbind_client_read(457)
  client_read: read 1568 bytes. Need 0 more for a full request.
[10] nsswitch/winbindd.c:process_request(307)
  process_request: request fn GETPWNAM
[3] nsswitch/winbindd_user.c:winbindd_getpwnam(113)
  [ 5263]: getpwnam testuser1
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(343)
  refresh_sequence_number: TST time ok
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(368)
  refresh_sequence_number: TST seq number is now -1
[1] nsswitch/winbindd_user.c:winbindd_getpwnam(148)
  user 'testuser1' does not exist

[10] nsswitch/winbindd.c:client_write(511)
  client_write: wrote 1300 bytes.
[10] nsswitch/winbindd.c:winbind_client_read(457)
  client_read: read 1568 bytes. Need 0 more for a full request.
[10] nsswitch/winbindd.c:process_request(307)
  process_request: request fn GETPWNAM
[3] nsswitch/winbindd_user.c:winbindd_getpwnam(113)
  [ 5263]: getpwnam TESTUSER1
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(343)
  refresh_sequence_number: TST time ok
[10] nsswitch/winbindd_cache.c:refresh_sequence_number(368)
  refresh_sequence_number: TST seq number is now -1
[1] nsswitch/winbindd_user.c:winbindd_getpwnam(148)
  user 'TESTUSER1' does not exist

[10] nsswitch/winbindd.c:client_write(511)
  client_write: wrote 1300 bytes.
[10] nsswitch/winbindd.c:winbind_client_read(457)
  client_read: read 1568 bytes. Need 0 more for a full request.
[10] nsswitch/winbindd.c:process_request(307)
  process_request: request fn CREATE_USER
[3] nsswitch/winbindd_acct.c:winbindd_create_user(876)
  [ 5263]: create_user: user=>(testuser1), group=>()
[5] nsswitch/winbindd_acct.c:wb_getgrnam(522)
  wb_getgrnam: Did not find group (nobody)
[10] nsswitch/winbindd_acct.c:passwd2string(158)
  passwd2string: converting passwd struct for testuser1
[10] nsswitch/winbindd_acct.c:wb_storepwnam(487)
  wb_storepwnam: Success ->
"testuser1:x:11144:60001:testuser1:/home1/testuser1:/bin/csh"

=========

Although the last line shows success, the user is never created, so the entire
error loop complete with authentications and user creation attempt is repeated
until Windows gives up its attempt to access the share.
Comment 1 John Klinger 2004-03-16 10:32:05 UTC
Tracing through the code, it appears this is due to the "winbind enable local 
accounts" smb.conf option defaulting to True regardless of using AD or not. 
Turning this option off will prevent the winbind client from requesting the 
user creation (winbind_create_user). I'll run tests today or tomorrow to 
confirm.

A patch could be created to disable this option when "security=ads", or it 
could be a manual process of setting the option to "no" in the smb.conf. I 
prefer the former, since even when the user is created in the tdb, the 
authentication is still be performed against the AD, making the tdb user 
creation useless. I would also consider this a security issue if any function 
uses that newly-created tdb user, since all authentication should be through 
the AD.
Comment 2 Gerald (Jerry) Carter (dead mail address) 2004-03-16 11:10:36 UTC
This is not a security issue.  'winbind local accounts'
are really just a type of uid mapping method.  They's 
just local users and/or groups.  Think of them as 
portable 'add user scripts'

While the parameter should probably default to 'no', the 
solution for 3.0 will be to manual disable this parameter.
Comment 3 John Klinger 2004-03-23 08:11:30 UTC
Just for reference: Based on the last comment, the idmap backend setting, not 
the security setting, drives whether the "winbind enable local accounts" option 
should be on or off. If using a non-tdb idmap backend (e.g. ldap), this option 
should be set to off in the smb.conf. The "winbind enable local accounts" 
option only writes into the tdb.
Comment 4 Gerald (Jerry) Carter (dead mail address) 2005-11-14 09:25:00 UTC
database cleanup