Bug 13616 - Error running "samba-tool dbcheck" after upgrading from 4.8.5 to 4.9.0
Summary: Error running "samba-tool dbcheck" after upgrading from 4.8.5 to 4.9.0
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Python (show other bugs)
Version: 4.9.4
Hardware: x64 Linux
: P5 major (vote)
Target Milestone: ---
Assignee: Noel Power
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-13 23:59 UTC by Miguel Medalha
Modified: 2019-03-14 00:13 UTC (History)
6 users (show)

See Also:


Attachments
Patch to remove utf-8 decode from dbchecker.py (1.68 KB, patch)
2018-09-19 12:48 UTC, Thomas Rosenstein
no flags Details
backport of patch to 4.9 (5.95 KB, patch)
2018-09-28 08:29 UTC, Noel Power
no flags Details
patch that implements fix (& test) (5.89 KB, patch)
2018-09-28 08:31 UTC, Noel Power
no flags Details
backport of patches to 4.9 (including version bump) (6.52 KB, patch)
2018-09-28 14:17 UTC, Noel Power
no flags Details
backport of patches to 4.9 with corrected version bump patch (29.70 KB, patch)
2018-11-04 23:41 UTC, Andrew Bartlett
gary: review+
Details
proposed fix for master (7.66 KB, patch)
2018-11-13 12:25 UTC, Noel Power
no flags Details
partial fix for 4.9 (missing ldb versioning related pieces) (6.95 KB, patch)
2018-11-13 12:26 UTC, Noel Power
no flags Details
proposed new fix for master (7.64 KB, patch)
2018-11-13 13:01 UTC, Noel Power
no flags Details
proposed new partial fix for 4.9 (6.38 KB, patch)
2018-11-13 13:02 UTC, Noel Power
no flags Details
follow patch (master version) (13.37 KB, patch)
2019-01-14 12:15 UTC, Noel Power
no flags Details
follow patch 4.9 version) (33.31 KB, application/mbox)
2019-01-14 12:16 UTC, Noel Power
no flags Details
follow patch 4.9 version (32.20 KB, patch)
2019-01-17 10:11 UTC, Noel Power
dbagnall: review+
Details
patch for comment 42 (885 bytes, patch)
2019-03-06 22:46 UTC, Douglas Bagnall
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Miguel Medalha 2018-09-13 23:59:11 UTC
Immediately after upgrading an AD DC from 4.8.5 to 4.9.0, running "samba-tool dbcheck" produced the following error message:


Checking 511 objects
ERROR(<type 'exceptions.UnicodeEncodeError'>): uncaught exception - 'ascii'
codec can't encode character u'\xe1' in position 133: ordinal not in
range(128)
  File
"/usr/local/samba/lib64/python2.7/site-packages/samba/netcmd/__init__.py",
line 177, in _run
    return self.run(*args, **kwargs)
  File
"/usr/local/samba/lib64/python2.7/site-packages/samba/netcmd/dbcheck.py",
line 157, in run
    controls=controls, attrs=attrs)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/dbchecker.py",
line 221, in check_database
    error_count += self.check_object(object.dn, attrs=attrs)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/dbchecker.py",
line 2230, in check_object
    error_count += self.check_dn(obj, attrname, syntax_oid)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/dbchecker.py",
line 1182, in check_dn
    dsdb_dn = dsdb_Dn(self.samdb, val.decode('utf8'), syntax_oid)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/common.py",
line 101, in __init__
    self.dn = ldb.Dn(samdb, self.dnstring)


The error seems related to handling of Unicode characters. After going back to Samba 4.8.5 the error disapears.

The underlying OS was CentOS 7.4.

I am not sure about the severity of this bug. I classified it as major only because (at least in my case) it prevented the use of samba 4.9.0 as AD DC.
Comment 1 Thomas Rosenstein 2018-09-19 12:47:11 UTC
Based on my testing this only happens if you have DNs which are saved as Base64 and contain therefore utf-8 data. (e.g. polish names)

I fixed it locally with the added patch.

The offending commit is this one: https://git.samba.org/?p=samba.git;a=commitdiff;h=13280d3db9fdbbd87c6ed2faa69b24bc7660674d
Comment 2 Thomas Rosenstein 2018-09-19 12:48:10 UTC
Created attachment 14493 [details]
Patch to remove utf-8 decode from dbchecker.py
Comment 3 Andrew Bartlett 2018-09-20 05:11:40 UTC
Noel,

Can you look into this, add tests etc?

Thanks,
Comment 4 Noel Power 2018-09-24 10:06:36 UTC
(In reply to Andrew Bartlett from comment #3)
Hmm, it seems what is happening here is that by default data in python2 passed to dsdb_Dn is now passed as unicode whereas previously it was passed as 'str'.

c python interface for ldb.Dn uses 's' format for the dn string passed to it, in this case python2 'str' is treated more like raw bytes and unicode is subject to handling by the default encoder e.g. ('ascii')

I am assuming that the previous change to ensure the second param to dsdb_Dn is correct, that it is assumed a string value is expected, that it is in utf8 format. https://msdn.microsoft.com/en-us/library/cc223180.aspx would seem to indicate it should be, Andrew ? do you agree ?

So it would appear that the c interface to ldb.Dn is the problem, it should accept not only byte string but also utf8 unicode. The alternative would be to adjust the code from the bulk commit below to be more forgiving (e.g. use the new python samba.compat get_string function) this would ensure that python2 code would be unaffected and only future python3 code should decode.
Comment 6 Noel Power 2018-09-28 08:29:14 UTC
Created attachment 14499 [details]
backport of patch to 4.9
Comment 7 Noel Power 2018-09-28 08:31:26 UTC
Created attachment 14500 [details]
patch that implements fix (& test)

Version of patch accepted into master (for comparison with backport)
Comment 8 Noel Power 2018-09-28 08:32:35 UTC
Comment on attachment 14493 [details]
Patch to remove utf-8 decode from dbchecker.py

superseded by newer version (only different in addition of tests)
Comment 9 Andrew Bartlett 2018-09-28 09:28:58 UTC
Comment on attachment 14499 [details]
backport of patch to 4.9

An ldb release needs to be made in the 4.9 series for this to be backported.

Just bump the version in the lib/ldb/wscript and then commit with the new ABI files (ensure you build with --extra-python).
Comment 10 Noel Power 2018-09-28 13:02:53 UTC
(In reply to Andrew Bartlett from comment #9)
thanks, just saw this, wasn't aware of this procedure, I keep forgetting things like ldb are sort of separate
Comment 11 Noel Power 2018-09-28 13:58:18 UTC
(In reply to Noel Power from comment #10)
hmm, I don't see any abi changes ? (not strict ones anyway) The python c interface hasn't changes (at least the build doesn't detect any change) sure, how the python c handles the input has changed (more forgiving, e.g. will accept ut8 encoded unicode)

Is there really a need to bump the ldb version for this ?
Comment 12 Noel Power 2018-09-28 14:17:35 UTC
Created attachment 14501 [details]
backport of patches to 4.9 (including version bump)

Not being sure of the procedure I also add a version here with ldb version bump (can obsolete whichever version when I know what is what)

Note: no abi change necessary
Comment 13 Andrew Bartlett 2018-11-04 23:41:55 UTC
Created attachment 14565 [details]
backport of patches to 4.9 with corrected version bump patch
Comment 14 Andrew Bartlett 2018-11-05 00:46:12 UTC
Karolin,

Please select for 4.9.next.

Thanks!

Andrew Bartlett
Comment 15 Karolin Seeger 2018-11-05 11:13:01 UTC
(In reply to Andrew Bartlett from comment #14)
Pushed to autobuild-v4-9-test.
Comment 16 Karolin Seeger 2018-11-06 08:12:35 UTC
(In reply to Karolin Seeger from comment #15)
Pushed to v4-9-test.
Closing out bug report.

Thanks!
Comment 17 Miguel Medalha 2018-11-08 21:33:50 UTC
Looks like the bug is still there with Samba 4.9.2:

Checking 511 objects
ERROR(<type 'exceptions.UnicodeDecodeError'>): uncaught exception - 'ascii' codec can't decode byte 0xc3 in position 25: ordinal not in range(128)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/netcmd/__init__.py", line 177, in _run
    return self.run(*args, **kwargs)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/netcmd/dbcheck.py", line 157, in run
    controls=controls, attrs=attrs)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/dbchecker.py", line 222, in check_database
    error_count += self.check_object(object.dn, attrs=attrs)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/dbchecker.py", line 2315, in check_object
    expected_dn = ldb.Dn(self.samdb, "RDN=RDN,%s" % (parent_dn))


The error disappears after reverting to Samba 4.8.6.
Comment 18 Noel Power 2018-11-13 12:23:50 UTC
I had a look at this, I replied with details also to a mail about the same (but not sure if that actually went anywhere as it at least bounced from some (or maybe all) recipients.

The gist of it is that the previous change to the ParseTuple format to 'es' seemed the correct thing to do. However there is a side-affect of doing this that wasn't immediately obvious to me from reading the documentation

with previous format 's' behaviour is as follows:

python2

(string) basically is passed through unencoded (this is probably bad but it is the default way python2 handles things)

(unicode) unicode is encoded to the default encoding ('ascii') (bad because we don't expect or want that)

python3

(unicode/string) is encoded to the default encoding ("utf8")

with new format we added as a 'es' (and specified encoding as 'utf8')

python2

(string) is re-encoded to 'utf8' (this is both unexpected and bad since effectively this operation will be string.decode('ascii').encode('utf8')) This first decode is the root cause of the current problem.

(unicode) is encoded to 'utf8' (this is good)

python3

(string/unicode) is encoded with the specified encoding ('utf8')

(bytes) will not be accepted

What can we do ?

a) we could ensure every call to Ldb.Dn passes unicode

   This would be alot of changes, and always the risk we miss some and also going forward everyone needs to remember when writing any code that they *need* to pass unicode. Seems unworkable

b) we could use the format 'et' which as far as I understand behaves as follows:

python2

(string) basically is passed through unencoded (same behaviour as old python2 code)

(unicode) encodes to specified format 'utf8' (this is good)

(bytes) will not be accepted

python3

(string/uncode) encodes to specified format ('utf8') in our case

(bytes) will be accepted (with the assumption the bytes contains bytes encoded in the specified encoding) This is might be problematic in that previously we would have gotten an error attempting to pass bytes (and also the bytes might not be encoded with the correct encoding)

Obviously b seems the saner option, but...  there is some risk with python3 (accepting the bytes)

I think the best option is a hybrid approach to use the 'et' format with python2 and the 'es' format with python3.

This is yet another attempt to fix the same problem again so I really would like another pair (or more) of eyes and/or opinions on this.

I'll attach master version of patch and also a 4.9 version (without the ldb version related bits), I'll add those if/when we are happy) And if there is consensus this is the correct approach we should change the other instances of use of 'es' with ParseTuple in the code too
Comment 19 Noel Power 2018-11-13 12:25:17 UTC
Created attachment 14650 [details]
proposed fix for master
Comment 20 Noel Power 2018-11-13 12:26:03 UTC
Created attachment 14651 [details]
partial fix for 4.9 (missing ldb versioning related pieces)
Comment 21 Noel Power 2018-11-13 12:50:32 UTC
Comment on attachment 14650 [details]
proposed fix for master

fails CI (due to some issues with test)
Comment 22 Noel Power 2018-11-13 12:51:14 UTC
Comment on attachment 14651 [details]
partial fix for 4.9 (missing ldb versioning related pieces)

issue with test
Comment 23 Noel Power 2018-11-13 13:01:51 UTC
Created attachment 14652 [details]
proposed new fix for master
Comment 24 Noel Power 2018-11-13 13:02:30 UTC
Created attachment 14653 [details]
proposed new partial fix for 4.9
Comment 25 Noel Power 2018-11-13 14:03:46 UTC
CI is running here https://gitlab.com/samba-team/devel/samba/pipelines/36415402
Comment 26 Krisztian Ivancso 2018-12-30 12:00:33 UTC
Hello,

I upgraded from 4.7.3 to 4.9.2 on Debian Buster and first I experienced internal DNS issue.

After some googling I found this bug.

I tried samba-tool dbcheck and it resulted in Python exception.

I fixed issue the same way as earlier patches.

After successful dbcheck and samba start, dns subsystem started and samba works as expected.

This is my patch:
--- /usr/lib/python2.7/dist-packages/samba/dbchecker.py 2018-12-29 22:12:17.058956118 +0100
+++ /tmp/dbchecker.py   2018-12-29 21:28:18.034954555 +0100
@@ -2312,7 +2312,7 @@
                     parent_dn = deleted_objects_dn
             if parent_dn is None:
                 parent_dn = obj.dn.parent()
-            expected_dn = ldb.Dn(self.samdb, "RDN=RDN,%s" % (parent_dn))
+            expected_dn = ldb.Dn(self.samdb, "RDN=RDN,%s" % (str(parent_dn).decode('utf8')))
             expected_dn.set_component(0, obj.dn.get_rdn_name(), name_val)
 
             if obj.dn == deleted_objects_dn:



Best regards,
ivan
Comment 27 Noel Power 2019-01-14 12:15:55 UTC
Created attachment 14768 [details]
follow patch (master version)
Comment 28 Noel Power 2019-01-14 12:16:40 UTC
Created attachment 14769 [details]
follow patch 4.9 version)

Patch for 4.9 included needed ldb version bump
Comment 29 Douglas Bagnall 2019-01-17 04:25:58 UTC
Comment on attachment 14769 [details]
follow patch 4.9 version)

Does the 4.9 patch need to bump the ldb version? None of the .sig files have changed from the 1.4.3 ones (and they shouldn't have). The Python-level API hasn't *really* changed has it? -- still taking str or unicode but failing in fewer cases.

Do we bump the version to trigger distro action?

In any case, the message is a bit lacking:

>  Python: Ensure ldb.Dn can doesn't rencoded str with py2 (bug 13616)


Otherwise RB+.
Comment 30 Noel Power 2019-01-17 09:33:53 UTC
(In reply to Douglas Bagnall from comment #29)
Hi Douglas,
Afaiu we need to trigger an ldb release in order to associate the ldb version with the samba version it needs to be released with (not sure if this actually ends up in the build dependencies somehow or is like you say it is a trigger for distros to make that dependency)

If we bump the version we need to provide the new sigs (even if the api hasn't *really* changed) see comment #9 and following for my similar questions to yours :-)

However I have to redo this patch, I just noticed that I include the patch to enable the test for py3 in the series. However, this does not work in 4.9. Well to be precise the test will work with py3 (if run manually) but the make test TEST=blah will not work properly due to some missing fixes to the some test related infrastructure code for py3. I will have to get you to rb again, sorry :-)
Comment 31 Noel Power 2019-01-17 10:11:40 UTC
Created attachment 14786 [details]
follow patch 4.9 version

Also added Bug tag to all the patches for completeness (bug tag was only on those patches that specifically addressed the bug and not the related tests)
Comment 32 Noel Power 2019-01-18 09:07:28 UTC
Re-assigning to Karolin for 4.9.next (and whatever else) p.s. just check that I needed to bump ldb version, I'm pretty sure this is needed but wanted to mention it anyway
Comment 33 Karolin Seeger 2019-01-18 10:57:10 UTC
(In reply to Noel Power from comment #32)
Thanks Noel!
Pushed both patches to autobuild-v4-9-test.
Comment 34 Karolin Seeger 2019-01-31 09:41:52 UTC
(In reply to Karolin Seeger from comment #33)
Pushed to v4-9-test.
Closing out bug report.

Thanks!
Comment 35 Miguel Medalha 2019-03-05 19:01:55 UTC
This bug is not fixed yet. I just installed samba 4.9.4 as AD DC and running "samba-tool db check" still produces the following error message:


ERROR(<type 'exceptions.UnicodeDecodeError'>): uncaught exception - 'ascii' codec can't decode byte 0xc3 in position 25: ordinal not in range(128)
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/__init__.py", line 177, in _run
    return self.run(*args, **kwargs)
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/dbcheck.py", line 157, in run
    controls=controls, attrs=attrs)
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/dbchecker.py", line 222, in check_database
    error_count += self.check_object(object.dn, attrs=attrs)
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/dbchecker.py", line 2315, in check_object
    expected_dn = ldb.Dn(self.samdb, "RDN=RDN,%s" % (parent_dn))


Reverting to Samba 4.8.9 makes the error disappear. This particular server is running CentOS 6.10.

Thank you.
Comment 36 Karolin Seeger 2019-03-06 08:43:06 UTC
Re-assigning to Noel as it still seems to be an issue.
Comment 37 Noel Power 2019-03-06 11:12:34 UTC
(In reply to Karolin Seeger from comment #36)
The code afaics isn't in 4.9.4

Author: Karolin Seeger <kseeger@samba.org>
Date:   Thu Dec 20 09:23:46 2018 +0100

    VERSION: Disable GIT_SNAPSHOT for the 4.9.4 release.


but this only hit 4.9-test on 2019-01-31 09:41:52 UTC

should I reassign back to you Karolin ? and change the status back to resolved

would be great if this could be verified too

@Miguel Medalha are you in a position to test from the v4-9-test branch from source?
Comment 38 Miguel Medalha 2019-03-06 12:24:34 UTC
The release notes for 4.9.2 contain the following:

o  Noel Power <noel.power@suse.com>
   * BUG 13616: ldb: Bump ldb version to 1.4.3, Python: Ensure ldb.Dn can accept
     utf8 encoded unicode.

I assumed that this bug had been considered as solved on that version. That's why I reopened the bug when I found the problem again in version 4.9.4.

I am going to build from 4.9-test and will report later.

Thank you
Comment 39 Miguel Medalha 2019-03-06 12:36:11 UTC
OK. It built as Version 4.9.5-GIT-47fb4ba. "samba-tool dbcheck" gives 0 errors, so it looks like the problem is gone.

I hope that the solution will be included in Release 4.9.5.

Thank you!
Comment 40 Noel Power 2019-03-06 14:04:41 UTC
(In reply to Miguel Medalha from comment #38)
yeah, there was an initial fix (which didn't quite go far enough), then there was a secondary fix (which is still in the pipeline, it will be in the next release)
Comment 41 Noel Power 2019-03-06 14:05:41 UTC
(In reply to Miguel Medalha from comment #39)
thanks for testing, I appreciate that. Good to hear it is working for you now
Comment 42 Miguel Medalha 2019-03-06 21:54:34 UTC
I don't know if this is somehow related to the work done to fix this bug: I just discovered that on the version I built from samba 4.9-test, if I run samba-tool without parameters, or with the -h or --help parameters, it produces the following output:

ERROR(<type 'exceptions.SyntaxError'>): uncaught exception - invalid syntax (graph.py, line 91)
  File "/usr/local/samba/bin/samba-tool", line 45, in <module>
    retval = cmd._run("samba-tool", subcommand, *args)
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/__init__.py", line 247, in _run
    cmd = self.subcommands[cmd_name]
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/main.py", line 35, in __getitem__
    fromlist=['cmd_%s' % attr]),
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/visualize.py", line 33, in <module>
    from samba.graph import dot_graph


Samba 4.8.9 produces the normal information about available commands.
Comment 43 Andrew Bartlett 2019-03-06 21:59:10 UTC
(In reply to Miguel Medalha from comment #42)
Thanks, this is a different issue and needs a new bug.  I would warn that Python 2.6 is essentially found to be broken after every new Samba release so it would be best to just avoid it.
Comment 44 Miguel Medalha 2019-03-06 22:10:53 UTC
My problem is that on a particular server I still need to use CentOS 6.10.

Is it possible to build Samba against a newer Python version instead of the CentOS 6.x default, without disrupting the OS?


Looking at the error message again, could it be related to the "visualize" command? "samba-tool visualize --help" gives a similar error. All the other samba-tool sub-commands behave correctly with --help.
Comment 45 Douglas Bagnall 2019-03-06 22:46:11 UTC
Created attachment 14906 [details]
patch for comment 42

Miguel: the attached patch should fix the problem, but if you want it backported to 4.9 you'll need to open another bug.

I'm afraid I don't know much about using modern Python on Centos 6, though I believe it has been done.
Comment 46 Miguel Medalha 2019-03-07 10:58:43 UTC
Thank you for the patch.

Unfortunately, it does not solve the problem and now "samba-tool dbcheck" is giving errors again. I will revert to Samba 4.8.9 for now until this situation is clarified. Thank you again.
Comment 47 Douglas Bagnall 2019-03-07 20:21:18 UTC
(In reply to Miguel Medalha from comment #46)

What are the errors?
Comment 48 Miguel Medalha 2019-03-07 21:51:29 UTC
(In reply to Douglas Bagnall from comment #47)

I am not at work now, so I don't have them here.

I am now concentrating on building Samba against Python 3.x on CentOS 6.10. Building right now...
Comment 49 Miguel Medalha 2019-03-13 20:03:44 UTC
(In reply to Douglas Bagnall from comment #47)

Both "samba-tool" and "samba-tool --help" produce teh following error:


ERROR(<type 'exceptions.SyntaxError'>): uncaught exception - invalid syntax (graph.py, line 91)
  File "/usr/local/samba/bin/samba-tool", line 45, in <module>
    retval = cmd._run("samba-tool", subcommand, *args)
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/__init__.py", line 247, in _run
    cmd = self.subcommands[cmd_name]
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/main.py", line 35, in __getitem__
    fromlist=['cmd_%s' % attr]),
  File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/visualize.py", line 33, in <module>
    from samba.graph import dot_graph
Comment 50 Andrew Bartlett 2019-03-13 20:42:35 UTC
(In reply to Miguel Medalha from comment #49)
Please discuss this on a new bug.
Comment 51 Miguel Medalha 2019-03-14 00:13:57 UTC
(In reply to Andrew Bartlett from comment #50)

Filed as Bug 13837.

https://bugzilla.samba.org/show_bug.cgi?id=13837