Bug 7040 - Provisioning fails with directory backends
Status: RESOLVED FIXED
Product: Samba 4.0
Classification: Unclassified
Component: AD: LDB/DSDB/SAMDB
Version: unspecified
Hardware: x86 Linux
Importance: P3 major
Target Milestone: ---
Assigned To: Andrew Bartlett
QA Contact: samba4-qa@samba.org
Reported: 2010-01-13 13:16 UTC by Dirk Pauli
Modified: 2010-03-19 17:59 UTC


Attachments
Patch #1 (1.37 KB, patch)
2010-01-25 10:22 UTC, Endi Sukma Dewata
Patch #2 (2.21 KB, patch)
2010-01-25 10:23 UTC, Endi Sukma Dewata
Patch #3 (858 bytes, patch)
2010-01-25 10:23 UTC, Endi Sukma Dewata
Patch #4 (748 bytes, patch)
2010-01-25 13:20 UTC, Endi Sukma Dewata
Patch #5 (4.71 KB, patch)
2010-01-26 16:11 UTC, Endi Sukma Dewata
A fix proposal (at least for the SYSTEM control) (2.75 KB, patch)
2010-02-04 02:15 UTC, Matthias Dieter Wallnöfer

Description Dirk Pauli 2010-01-13 13:16:17 UTC
Using Ubuntu 9.10 (x86), I set up a samba4 alpha11 server (sources from git.samba.org using the release-4-0-0alpha11 tag) with an OpenLDAP backend (v2.4.21). When I try to provision, it fails every time with the following error message:

> sudo ./setup/provision --realm=TEST.LOCAL --domain=TEST --server-role='domain controller' --ldap-backend-type=openldap --username=samba-admin --password=PasswordChanged --adminpass=PasswordChanged --ldapadminpass=PasswordChanged --slapd-path='/usr/local/libexec/slapd' 

Failed to bind - LDAP client internal error: NT_STATUS_UNEXPECTED_NETWORK_ERROR
Failed to connect to 'ldapi://%2Fusr%2Flocal%2Fsamba%2Fprivate%2Fldap%2Fldapi'
Setting up secrets.ldb
Setting up the registry
Setting up the privileges database
Setting up idmap db
Setting up SAM db
Setting up sam.ldb partitions and settings
Setting up sam.ldb rootDSE
Pre-loading the Samba 4 and AD schema
Adding DomainDN: DC=test,DC=local
pdc_fsmo_init: no domain object present: (skip loading of domain details)

Traceback (most recent call last):
  File "./setup/provision", line 222, in <module>
    nosync=opts.nosync,ldap_dryrun_mode=opts.ldap_dryrun_mode)
  File "bin/python/samba/provision.py", line 1240, in provision
    dom_for_fun_level=dom_for_fun_level)
  File "bin/python/samba/provision.py", line 927, in setup_samdb
    "SAMBA_VERSION_STRING": version
  File "bin/python/samba/provision.py", line 265, in setup_modify_ldif
    ldb.modify_ldif(data)
  File "bin/python/samba/__init__.py", line 261, in modify_ldif
    self.modify(msg, controls)
_ldb.LdbError: (1, 'LDAP client internal error: NT_STATUS_INTERNAL_ERROR')
A transaction is still active in ldb context [0xb3ff0d8] on /usr/local/samba/private/secrets.ldb
Comment 1 Matthias Dieter Wallnöfer 2010-01-14 04:53:22 UTC
Endi, since you are an LDAP backend expert of s4, do you have some time to investigate this bug? We think that it's only related to backends (maybe also 389 directory server).
Comment 2 Endi Sukma Dewata 2010-01-15 14:38:32 UTC
Hi, I noticed this happens with 389 DS too using a recent Samba build. I will investigate this.
Comment 3 Matthias Dieter Wallnöfer 2010-01-24 04:36:27 UTC
I just pushed three fixes regarding this kindly provided by Endi. I hope he is willing to help us also with the last issue pointed out on the mailing list.
Comment 4 Endi Sukma Dewata 2010-01-25 10:22:36 UTC
Created attachment 5216
Patch #1
Comment 5 Endi Sukma Dewata 2010-01-25 10:23:26 UTC
Created attachment 5217
Patch #2
Comment 6 Endi Sukma Dewata 2010-01-25 10:23:53 UTC
Created attachment 5218
Patch #3
Comment 7 Endi Sukma Dewata 2010-01-25 10:28:01 UTC
Hi, I added the 3 patches into this bug for reference. I will address the additional issues mentioned by Andrew in a separate patch. Thanks.
Comment 8 Dirk Pauli 2010-01-25 11:56:09 UTC
Hi Endi,
thanks. After applying the patches, I can provision.

Regards,
  Dirk
Comment 9 Endi Sukma Dewata 2010-01-25 13:20:41 UTC
Created attachment 5219
Patch #4
Comment 10 Endi Sukma Dewata 2010-01-25 13:23:10 UTC
Hi Dirk,

It turns out the patches that have been accepted were patches #2-4. Patch #1 did fix the problem, but I'm currently working to replace it with a different mechanism.
Comment 11 Matthias Dieter Wallnöfer 2010-01-25 15:10:24 UTC
Comment on attachment 5217
Patch #2

Applied
Comment 12 Matthias Dieter Wallnöfer 2010-01-25 15:10:41 UTC
Comment on attachment 5218
Patch #3

Applied
Comment 13 Matthias Dieter Wallnöfer 2010-01-25 15:10:57 UTC
Comment on attachment 5219
Patch #4

Applied
Comment 14 Matthias Dieter Wallnöfer 2010-01-25 15:14:54 UTC
As the QA person, I wouldn't call this fully fixed, since you (Dirk) probably also made use of patch 1. As Endi pointed out, that patch wasn't accepted, and an improved version will make it into our source tree. Only then can this issue be closed as "FIXED".
Comment 15 Endi Sukma Dewata 2010-01-26 16:11:17 UTC
Created attachment 5223
Patch #5

Patch #5 replaces patch #1 by removing the LDB_CONTROL_AS_SYSTEM_OID and DSDB_CONTROL_DN_STORAGE_FORMAT_OID controls after they have been used by the LDB modules.
Comment 16 Dirk Pauli 2010-01-29 12:49:43 UTC
With only patches 2-5 installed, the provisioning process works for me.
I see strange effects after joining with a Win XP SP3 client and browsing the AD, though: after a few clicks (opening / closing the structures), I get the error message "The Server is not responding". This can be temporarily overcome by restarting the samba daemon.
I'm not sure whether this is related to this bug or is a separate one.
Comment 17 Matthias Dieter Wallnöfer 2010-01-31 04:11:19 UTC
Ah, I didn't see that both bugs (7040, 7042) were reported by you :) . Let us see what we can do. I have changed the title of this bug to "Provisioning fails with directory backends", which is more descriptive.

It would be nice if tridge could comment on the remaining patch.
Comment 18 Matthias Dieter Wallnöfer 2010-02-04 02:14:36 UTC
Comment on attachment 5223
Patch #5

Endi, tridge and I discussed your patch. Basically, it is right to look for all places where we perform a lookup for the SYSTEM control.
But then you shouldn't try to remove the control itself; simply set <control object>->critical = 0; (where <control object> is the one returned by "ldb_request_get_control" or obtained after creation).
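
For reference, the suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: the sketch_* types and functions are hypothetical stand-ins rather than the real ldb structures, and the lookup merely imitates what ldb_request_get_control is described as doing in this comment.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the ldb control and request types. */
struct sketch_control {
    const char *oid;
    int critical;
};

struct sketch_request {
    struct sketch_control **controls; /* NULL-terminated array, or NULL */
};

/* Look up a control by OID (illustrative equivalent of a control lookup). */
struct sketch_control *sketch_get_control(struct sketch_request *req,
                                          const char *oid)
{
    size_t i;

    if (req->controls == NULL) {
        return NULL;
    }
    for (i = 0; req->controls[i] != NULL; i++) {
        if (strcmp(req->controls[i]->oid, oid) == 0) {
            return req->controls[i];
        }
    }
    return NULL;
}

/*
 * Once a module has acted on the control, mark it non-critical instead of
 * removing it from the request. Returns 1 if the control was present.
 */
int sketch_mark_consumed(struct sketch_request *req, const char *oid)
{
    struct sketch_control *ctrl = sketch_get_control(req, oid);

    if (ctrl == NULL) {
        return 0;
    }
    ctrl->critical = 0;
    return 1;
}
```

The control stays attached to the request, so later modules can still see it; only its criticality changes.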
Comment 19 Matthias Dieter Wallnöfer 2010-02-04 02:15:42 UTC
Created attachment 5270
A fix proposal (at least for the SYSTEM control)
Comment 20 Matthias Dieter Wallnöfer 2010-02-04 04:11:18 UTC
I pushed my fix proposal after some testing to "master".
Dirk, could you retest the provision using a recent "master" source tree without Endi's patches?
If that works we are done. Otherwise we will have to fix more.
Comment 21 Dirk Pauli 2010-02-04 12:47:09 UTC
Hi Matthias,
checked out heads/master (which was 204e4b2 at that time) and recompiled as requested.
Unfortunately, I cannot provision then:

sudo ./setup/provision \
> --realm=test.LOCAL --domain=test \
> --server-role='domain controller' \
> --ldap-backend-type=openldap \
> --username=samba-admin \
> --password=PWChanged \
> --adminpass=PWChanged \
> --ldapadminpass=PWChanged \
> --slapd-path='/usr/local/libexec/slapd' 
hdb_db_open: database "cn=Schema,cn=Configuration,dc=test,dc=local": db_open(/usr/local/samba/private/ldap/db/schema/id2entry.bdb) failed: No such file or directory (2).
backend_startup_one (type=hdb, suffix="cn=Schema,cn=Configuration,dc=test,dc=local"): bi_db_open failed! (2)
slap_startup failed (test would succeed using the -u switch)
Failed to bind - LDAP client internal error: NT_STATUS_UNEXPECTED_NETWORK_ERROR
Failed to connect to 'ldapi://%2Fusr%2Flocal%2Fsamba%2Fprivate%2Fldap%2Fldapi'
Setting up share.ldb
Setting up secrets.ldb
Setting up the registry
Setting up the privileges database
Setting up idmap db
Setting up SAM db
Setting up sam.ldb partitions and settings
Setting up sam.ldb rootDSE
Pre-loading the Samba 4 and AD schema
Adding DomainDN: DC=test,DC=local
pdc_fsmo_init: no domain object present: (skip loading of domain details)

Traceback (most recent call last):
  File "./setup/provision", line 222, in <module>
    nosync=opts.nosync,ldap_dryrun_mode=opts.ldap_dryrun_mode)
  File "bin/python/samba/provision.py", line 1240, in provision
    dom_for_fun_level=dom_for_fun_level)
  File "bin/python/samba/provision.py", line 927, in setup_samdb
    "SAMBA_VERSION_STRING": version
  File "bin/python/samba/provision.py", line 265, in setup_modify_ldif
    ldb.modify_ldif(data)
  File "bin/python/samba/__init__.py", line 261, in modify_ldif
    self.modify(msg, controls)
_ldb.LdbError: (1, 'LDAP client internal error: NT_STATUS_INTERNAL_ERROR')
A transaction is still active in ldb context [0xafa53d8] on /usr/local/samba/private/secrets.ldb
Comment 22 Endi Sukma Dewata 2010-02-04 14:01:39 UTC
Just for the record, this is Andrew Bartlett's message that was posted on
samba-technical on Jan 22, 2010:

> We should not have a network implementation of LDB_CONTROL_AS_SYSTEM_OID
> - for security this should never be accepted over LDAP.  

> On further reflection:  A patch would be accepted that ensures this
> remains true.  To fix the original bug, the ACL modules need to be
> modified to swallow the control, like I discuss here:

> We should also figure out what is causing
> DSDB_CONTROL_DN_STORAGE_FORMAT_OID to get to the backend, without being
> intercepted and interpreted by the extended_dn_out module. 
Comment 23 Endi Sukma Dewata 2010-02-04 14:19:27 UTC
I think setting the control to not critical is insufficient. Please take a look at ldap_encode_control() in libcli/ldap/ldap_message.c:

// check all known controls
for (i = 0; handlers[i].oid != NULL; i++) {

    // if the control is registered
    if (strcmp(handlers[i].oid, ctrl->oid) == 0) {
        
        // if no encoder function is defined for the control
        if (!handlers[i].encode) {

            // if the control is critical return false (error)
            if (ctrl->critical) {
                return false;

            } else { // if not critical don't encode the control (skip)
                return true;
            }
        }

        ... encode the control ...

        break;
    }
}

// if the control is not registered return false (error)
if (handlers[i].oid == NULL) {
    return false;
}

So for this to work, the control would have to be registered as in patch #1
which was rejected.
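
To make the three outcomes explicit, here is a small self-contained sketch of the decision logic summarised above. It is not the actual ldap_encode_control() code: the sketch_* names are hypothetical, the real encoding step is left out, and the "skipped" case (which the summary above reports as a plain true) is given its own result value purely for readability.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical encoder signature; the real encoding step is not modelled. */
typedef bool (*sketch_encode_fn)(const void *in, void *out);

struct sketch_handler {
    const char *oid;
    sketch_encode_fn encode; /* NULL: registered, but never sent on the wire */
};

struct sketch_ctrl {
    const char *oid;
    bool critical;
};

enum sketch_result { SKETCH_ERROR, SKETCH_SKIPPED, SKETCH_ENCODED };

/* handlers must be terminated by an entry whose oid is NULL. */
enum sketch_result sketch_encode_control(const struct sketch_handler *handlers,
                                         const struct sketch_ctrl *ctrl)
{
    size_t i;

    for (i = 0; handlers[i].oid != NULL; i++) {
        if (strcmp(handlers[i].oid, ctrl->oid) != 0) {
            continue;
        }
        if (handlers[i].encode == NULL) {
            /* Registered without an encoder: fail only if critical. */
            return ctrl->critical ? SKETCH_ERROR : SKETCH_SKIPPED;
        }
        return SKETCH_ENCODED; /* actual encoding elided in this sketch */
    }

    /* Not registered at all: an error regardless of criticality. */
    return SKETCH_ERROR;
}
```

With this shape, a non-critical control is skipped only when its OID appears in the table; an OID missing from the table fails even when the control is non-critical, which is the point being made in this comment.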
Comment 24 Andrew Bartlett 2010-02-09 16:10:35 UTC
So, where are we at with regard to swallowing the internal controls?  What internal-only controls are still being sent to the backend?

It seems reasonable to allow the LDAP client code to ignore non-critical controls it does not know how to parse, but it still does not fill me with delight.  (As I really don't like silent discards, particularly on the client which we control). 
Comment 25 Endi Sukma Dewata 2010-02-09 16:57:10 UTC
Hi Andrew, I believe we are still trying to remove LDB_CONTROL_AS_SYSTEM_OID and
DSDB_CONTROL_DN_STORAGE_FORMAT_OID controls. Could you take a look at patch #5? Is that what you meant by "swallowing the internal controls"?
Comment 26 Andrew Bartlett 2010-02-09 23:48:33 UTC
Yes, patch 5 is the approach I was expecting. 

Operating LDB in 'trace mode' (difficult in provision due to silly global variable issues) is a good way to figure out what is going wrong.  I suggest you hack the source to enable it unconditionally, and see why the control is not stripped. 
Comment 27 Endi Sukma Dewata 2010-02-10 15:53:59 UTC
Hi Andrew, I have traced and looked at the code. These controls are only used in the acl and extended_dn_out modules. Currently there is nothing in the code that will remove the controls, so in patch #5 I'm doing that by calling save_controls() in those modules. Do you mean that the controls should have been removed somewhere else? Please note that after applying patch #5 the problem no longer appears.
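
As a rough illustration of what "removing the control after use" amounts to, the following sketch builds a copy of a NULL-terminated control list without one entry. It is an assumption-laden stand-in, not the save_controls() implementation: the sketch_* names are hypothetical and plain calloc/free replace whatever allocator the real code uses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ldb control. */
struct sketch_control {
    const char *oid;
    int critical;
};

/*
 * Return a newly allocated, NULL-terminated array holding every entry of
 * `in` except `exclude`. The caller frees the result with free().
 * Returns NULL if `in` is NULL or if allocation fails.
 */
struct sketch_control **sketch_without_control(struct sketch_control **in,
                                               const struct sketch_control *exclude)
{
    size_t n = 0, i, j = 0;
    struct sketch_control **out;

    if (in == NULL) {
        return NULL;
    }
    while (in[n] != NULL) {
        n++;
    }

    out = calloc(n + 1, sizeof(*out));
    if (out == NULL) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        if (in[i] != exclude) {
            out[j++] = in[i];
        }
    }
    out[j] = NULL;
    return out;
}
```

A module would pass the filtered list down to the next module, so that the internal control never reaches the LDAP backend.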
Comment 28 Andrew Bartlett 2010-02-10 18:51:14 UTC
My view is that patch #5 is the correct approach.  I'll talk to tridge about his objections and the options. 
Comment 29 Endi Sukma Dewata 2010-02-21 22:26:46 UTC
Hi, do we have a solution for this bug? Thanks.
Comment 30 Andrew Bartlett 2010-02-22 06:03:31 UTC
OK, I've finally talked to tridge, who has good reasons not to like save_controls().  Therefore, the right way to deal with this is to make the ldap client code ignore non-critical controls.

We should, before we add parsers for as_system in particular, add code to the Samba4 LDAP server to ensure that all unadvertised controls are stripped from the incoming request (and the request denied if they are critical).
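
A minimal sketch of that server-side rule, assuming a NULL-terminated list of advertised OIDs: reject the request if any unadvertised control is critical, otherwise drop the unadvertised ones. All sketch_* names are hypothetical and nothing here is taken from the Samba4 LDAP server code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an incoming LDAP control. */
struct sketch_control {
    const char *oid;
    bool critical;
};

/* advertised is a NULL-terminated list of OIDs the server announces. */
bool sketch_is_advertised(const char *const *advertised, const char *oid)
{
    size_t i;

    for (i = 0; advertised[i] != NULL; i++) {
        if (strcmp(advertised[i], oid) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Returns false if the request must be denied (a critical control is not
 * advertised); in that case the array is left untouched. Otherwise strips
 * unadvertised controls in place and returns true.
 */
bool sketch_filter_incoming(struct sketch_control **controls,
                            const char *const *advertised)
{
    size_t i, kept = 0;

    if (controls == NULL) {
        return true;
    }

    /* Pass 1: decide whether the request has to be denied. */
    for (i = 0; controls[i] != NULL; i++) {
        if (controls[i]->critical &&
            !sketch_is_advertised(advertised, controls[i]->oid)) {
            return false;
        }
    }

    /* Pass 2: compact the array, keeping only advertised controls. */
    for (i = 0; controls[i] != NULL; i++) {
        if (sketch_is_advertised(advertised, controls[i]->oid)) {
            controls[kept++] = controls[i];
        }
    }
    controls[kept] = NULL;
    return true;
}
```

Checking for denial before modifying anything keeps the request intact for error reporting when it is rejected.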
Comment 31 Endi Sukma Dewata 2010-02-27 13:44:10 UTC
I'd like to resubmit Patch #1. I have applied this patch against revision 1933214108d1a71bc6473a696ce35020a427d8f4 and I was able to complete the quick tests with the default backend, FDS, and OpenLDAP. It also has been tested by Dirk in comment #8.

I think Patch #1 should be sufficient to close this bug because it fixed the reported problem. As mentioned in comment #23, the LDAP client code can already ignore non-critical controls as long as the controls are registered with NULL handlers, which is what Patch #1 is doing. Any other changes would be addressed separately from this bug.
Comment 32 Andrew Bartlett 2010-03-02 01:36:40 UTC
I've merged Endi's patches, including patch #1.

I do wish to apologise for the length of time it has taken to resolve this issue. 
Comment 33 Dirk Pauli 2010-03-02 13:47:33 UTC
Hi,
sorry to say that, BUT:
- synced to head (=release-4-0-0alpha11-66-g204e4b2), so Endi's patches all should be included here
- recompiled, reinstalled
- provision still / again (?) fails:
sudo ./setup/provision \
> --realm=TEST.LOCAL --domain=TEST \
> --server-role='domain controller' \
> --ldap-backend-type=openldap \
> --username=samba-admin \
> --password=PWChanged \
> --adminpass=PWChanged \
> --ldapadminpass=PWChanged \
> --slapd-path='/usr/local/libexec/slapd' 
hdb_db_open: database "cn=Schema,cn=Configuration,dc=TEST,dc=local": db_open(/usr/local/samba/private/ldap/db/schema/id2entry.bdb) failed: No such file or directory (2).
backend_startup_one (type=hdb, suffix="cn=Schema,cn=Configuration,dc=TEST,dc=local"): bi_db_open failed! (2)
slap_startup failed (test would succeed using the -u switch)
Failed to bind - LDAP client internal error: NT_STATUS_UNEXPECTED_NETWORK_ERROR
Failed to connect to 'ldapi://%2Fusr%2Flocal%2Fsamba%2Fprivate%2Fldap%2Fldapi'
Setting up share.ldb
Setting up secrets.ldb
Setting up the registry
Setting up the privileges database
Setting up idmap db
Setting up SAM db
Setting up sam.ldb partitions and settings
Setting up sam.ldb rootDSE
Pre-loading the Samba 4 and AD schema
Adding DomainDN: DC=TEST,DC=local
pdc_fsmo_init: no domain object present: (skip loading of domain details)

Traceback (most recent call last):
  File "./setup/provision", line 222, in <module>
    nosync=opts.nosync,ldap_dryrun_mode=opts.ldap_dryrun_mode)
  File "bin/python/samba/provision.py", line 1240, in provision
    dom_for_fun_level=dom_for_fun_level)
  File "bin/python/samba/provision.py", line 927, in setup_samdb
    "SAMBA_VERSION_STRING": version
  File "bin/python/samba/provision.py", line 265, in setup_modify_ldif
    ldb.modify_ldif(data)
  File "bin/python/samba/__init__.py", line 261, in modify_ldif
    self.modify(msg, controls)
_ldb.LdbError: (1, 'LDAP client internal error: NT_STATUS_INTERNAL_ERROR')
A transaction is still active in ldb context [0xa1a23d8] on /usr/local/samba/private/secrets.ldb

Regards,
  Dirk
Comment 34 Dirk Pauli 2010-03-11 12:03:06 UTC
Hey guys, is there anything I can do to help you find the difference between Endi's patch against alpha11, which worked, and the current state of the mainline branch, where the fix does not seem to take effect?
From my point of view, we are a tiny step away from the final solution, and I like things to be properly closed.

Regards,
  Dirk
Comment 35 Mike Lyon 2010-03-18 08:11:12 UTC
I ran into this using either alpha11 or tp2:

Administrator password will be set randomly!
Traceback (most recent call last):
  File "setup/provision", line 222, in <module>
    nosync=opts.nosync,ldap_dryrun_mode=opts.ldap_dryrun_mode)
  File "bin/python/samba/provision.py", line 1201, in provision
    provision_backend.init()
  File "bin/python/samba/provisionbackend.py", line 190, in init
    raise ProvisioningError("Warning: LDAP-Backend must be setup with path to slapd, e.g. --slapd-path=\"/usr/local/libexec/slapd\"!")
samba.provisionexceptions.ProvisioningError: Warning: LDAP-Backend must be setup with path to slapd, e.g. --slapd-path="/usr/local/libexec/slapd"!

The instructions in the wiki (http://wiki.samba.org/index.php/Samba4/LDAP_Backend/OpenLDAP) point to using:

setup/provision-backend --realm=ldap.samba.example.com --ldap-admin-pass=penguin --ldap-backend-type=openldap 

To provision the backend, but provision-backend is not compiled.

I was trying to compile Samba4 to use an OpenLDAP directory that is an instance of OpenDirectory running on Apple SnowLeopard Server.
Comment 36 Andrew Bartlett 2010-03-19 02:11:38 UTC
Mike: 

This bug isn't about your wanting to use Samba4's LDAP backend against an existing directory.  

Please raise this on samba-technical (where I will try to explain the limitations).  In short, running against an existing OpenDirectory just isn't possible at the moment. 
Comment 37 Andrew Bartlett 2010-03-19 17:59:39 UTC
As far as I can tell, this is fixed.

I've tested with 6de83ef6277d8506478ce5ff43d33e39541b310c and OpenLDAP CVS HEAD (from this week). 

Please open any new regressions or issues in a new bug.