Bug 15312 - samba-tool ldapcmp attribute compare should not be case sensitive and replication should not change attribute case
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: AD: LDB/DSDB/SAMDB
Version: 4.16.6
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
Depends on:
Reported: 2023-02-16 14:36 UTC by keesvanvloten
Modified: 2023-02-17 10:17 UTC
Description keesvanvloten 2023-02-16 14:36:44 UTC
On Bullseye with Samba 4.16.6 installed on both DCs.
When I run:

samba-tool ldapcmp --filter=whenChanged,dc,DC,cn,CN ldap://dc01.example.com ldap://dc02.example.com -d0

It returns a series of errors:


* Objects to be compared: 1774

    Difference in attribute values:
        mayContain => 
[b'inventorylist', b'inventoryurl']
[b'inventoryList', b'inventoryURL']


[...more errors, redacted...]

    Difference in attribute values:
        mayContain => 


* Result for [SCHEMA]: FAILURE

All of these are caused by the Sogo calendar schema additions, which probably do not have their attributes completely aligned with the naming convention. These changes were applied to dc01, then replicated to dc02.

Nevertheless, LDAP attributes are not case-sensitive and hence should not be compared case-sensitively.
This MS forum thread explains exactly that: https://social.technet.microsoft.com/Forums/lync/en-US/39305ac4-b855-4113-80ee-be7324e4fa97/is-ldap-attribute-are-case-sensitive-?forum=winserverDS

- Please fix the comparison in ldapcmp (and perhaps elsewhere in Samba?).
- Please check replication. It does not replicate the exact attributes; instead it applies the naming convention during replication, and that is the cause of the difference shown here.

Related question: how could I fix dc01 so that the attribute names that are reported as wrong are updated to the right (case) spelling?

- Kees.

(side note: my name is pronounced in Dutch as 'case' :-) )
Comment 1 Douglas Bagnall 2023-02-16 19:32:37 UTC
ldapcmp does most comparisons case-insensitively. 

I don't think Samba will be normalising the case of "inventoryList", "inventoryURL", and "admittanceURL", because it normally has no knowledge of these attributes, so can't know how they should look.

Is it possible that something else is changing them? Are these camel case versions somewhere in the schema that adds them?
Comment 2 keesvanvloten 2023-02-16 19:45:23 UTC
I did a quick grep through the schema files:

schema-sogo-calendar-resource-1.ldif.j2:dn: CN=inventoryURL,CN=Schema,CN=Configuration,DC=example,DC=com
schema-sogo-calendar-resource-1.ldif.j2:cn: inventoryURL
schema-sogo-calendar-resource-1.ldif.j2:name: inventoryURL
schema-sogo-calendar-resource-1.ldif.j2:lDAPDisplayName: inventoryURL
schema-sogo-calendar-resource-3.ldif.j2:mayContain: inventoryurl

At least the schema is not very consistent by itself.

Even then, the schema changes are applied (just once) to dc01. 
Perhaps the replication checks "mayContain" with "cn"?

There is no other way than replication to get this to dc02, or is there?
Comment 3 Douglas Bagnall 2023-02-16 22:50:53 UTC
The thing is, these are *not* attribute names in this context. 

The attribute name is "mayContain", which will be compared case-insensitively.

ldapcmp doesn't know that the values of this attribute should also be treated as attribute names. To it, they are just values.

I think there are some attributes (like "cn") where we do ignore case when comparing the values. But in the general case we can't.

So we could put "mayContain" in a list making it case-insensitive.

> Perhaps the replication checks "mayContain" with "cn"?

Yeah, or schema ingestion (and possibly "ldapdisplayname" rather than "cn"). I am not convinced this is really a bug.

We should probably add a heuristic to ldapcmp to ignore case in "mayContain". This will be a minor problem if people ever use the attribute in other contexts, but ldapcmp is already a complete mess of hacks and assumptions so it will blend in.
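
A minimal sketch of what such a heuristic could look like. This is not ldapcmp's actual code: the set name and both helper functions are hypothetical, and the values are the byte strings from the report above.

```python
# Hypothetical list of attributes whose values are themselves attribute
# names and can therefore be compared without regard to case.
CASE_INSENSITIVE_VALUE_ATTRS = {"maycontain"}


def normalise_values(attr_name, values):
    """Return the values of one attribute in a comparable, sorted form."""
    if attr_name.lower() in CASE_INSENSITIVE_VALUE_ATTRS:
        return sorted(v.lower() for v in values)
    return sorted(values)


def values_differ(attr_name, values_a, values_b):
    """True if the two value lists differ after normalisation."""
    return normalise_values(attr_name, values_a) != normalise_values(
        attr_name, values_b
    )


dc01 = [b"inventorylist", b"inventoryurl"]
dc02 = [b"inventoryList", b"inventoryURL"]

print(values_differ("mayContain", dc01, dc02))   # False
print(values_differ("description", dc01, dc02))  # True
```

Only attributes named in the set lose case sensitivity; every other attribute is still compared byte for byte.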
Comment 4 keesvanvloten 2023-02-16 23:16:35 UTC
> We should probably add a heuristic to ldapcmp to ignore case in "mayContain"

This is manually possible with the --filter option, or would that not work?

I did check the complete list of errors reported by ldapcmp and indeed all of them are case issues with "mayContain".

But I am still completely puzzled how I got to these differences.
I manage 3 domains at 2 customers + my own (i.e. not connected to each other). I use the same Ansible code to manage all three of them; manual changes are never made.
2 domains show these minor differences (on both exactly the same diffs). Since the same code runs everywhere, it is unlikely that that has caused the difference. 
The second DC in any of the environments is only touched by replication, but if that cannot be the cause, then the magic that has caused this is not yet explained :-(

Is there a way to get rid of these "mayContain" diffs? If it is a one-time action to set the case correctly I can do that. At least Louis' check-replication script (which does the ldapcmp) will stop sending me daily emails about this "replication failure".
Comment 5 Douglas Bagnall 2023-02-17 01:39:53 UTC
(In reply to keesvanvloten from comment #4)
Yes, filtering out mayContain with --filter should work.
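
For instance, the command from the description with mayContain appended to the filter list could be assembled as below. This sketch only builds and prints the command line; it assumes nothing beyond the options and host names already used in the original report.

```python
import shlex

# Attributes already ignored in the original report, plus mayContain.
ignored = ["whenChanged", "dc", "DC", "cn", "CN", "mayContain"]

argv = [
    "samba-tool", "ldapcmp",
    "--filter=" + ",".join(ignored),
    "ldap://dc01.example.com", "ldap://dc02.example.com",
    "-d0",
]

# Print the command for inspection instead of running it.
print(shlex.join(argv))
```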

Per https://learn.microsoft.com/en-us/windows/win32/adschema/a-maycontain mayContain is not actually a string attribute, it is an OID attribute. It is stored as a string in Samba's ldb, but is converted into an OID for replication. The recipient looks up the name belonging to the OID in the schema[1]. The ldapDisplayName in the schema has mixed case and this is what it uses.

dc01 doesn't go through this convert to OID, look up OID, convert to string process. It just stores the string it was given. Arguably it should, to ensure that the name is valid.

Also ldapcmp should ignore case in attributes like this, and the schema should really try to be consistent with itself.

[1] around here source4/dsdb/schema/schema_syntax.c:1351
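
The round trip described above can be illustrated with a toy model. This is only a sketch, not Samba's implementation: the OIDs are made-up placeholders, the dictionaries stand in for the real schema, and `replicate_value` is a hypothetical helper.

```python
# Toy schema: OID -> lDAPDisplayName. The OIDs are invented placeholders.
SCHEMA_BY_OID = {
    "1.3.6.1.4.1.99999.1.1": "inventoryList",
    "1.3.6.1.4.1.99999.1.2": "inventoryURL",
}

# Reverse index keyed by lower-cased name, because name lookups ignore case.
OID_BY_NAME = {name.lower(): oid for oid, name in SCHEMA_BY_OID.items()}


def replicate_value(stored_name):
    """Model one mayContain value travelling from sender to recipient."""
    oid = OID_BY_NAME[stored_name.lower()]  # sender: string -> OID
    return SCHEMA_BY_OID[oid]               # recipient: OID -> lDAPDisplayName


# dc01 keeps the strings exactly as the LDIF supplied them.
dc01_values = ["inventorylist", "inventoryurl"]

# dc02 only ever sees the OIDs, so it ends up with the schema's spelling.
dc02_values = [replicate_value(v) for v in dc01_values]

print(dc02_values)  # ['inventoryList', 'inventoryURL']
```

In this model the two DCs hold the same attributes but different spellings, which matches the mayContain differences reported by ldapcmp.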
Comment 6 keesvanvloten 2023-02-17 10:17:12 UTC
Thanks for looking into it, Douglas.

I can confirm the --filter helps to hide this diff. That solves my immediate issue (of false alarms in daily mail-reports).

> converted into an OID for replication.
That would mean that if a DC replicated to itself, all conversions would take place and the final spelling of the values of attributes like these would be reached.
This "self-replication" is only required after a schema change, I guess. If there were an option in samba-tool, I could trigger it when required.
After the one-time conversion, the attributes on both sides would be identical and ldapcmp would not find errors.