Bug 13846 - cross-compile will not take cross-answers or cross-execute
Summary: cross-compile will not take cross-answers or cross-execute
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Build
Version: 4.10.0
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Douglas Bagnall
QA Contact: Samba QA Contact
Reported: 2019-03-20 07:05 UTC by pinglin
Modified: 2019-12-18 08:25 UTC

Attachments
workaround for the issue (3.42 KB, patch)
2019-03-20 07:05 UTC, pinglin
no flags Details
fix rpath, related to the previous commit (1.08 KB, text/plain)
2019-04-03 02:23 UTC, pinglin
no flags Details
add exec_args in test_exec (1.95 KB, text/plain)
2019-08-25 14:55 UTC, pinglin
no flags Details
a possible fix (6.85 KB, patch)
2019-10-06 21:50 UTC, Uri Simchoni
no flags Details
proposed fix with added tests (35.71 KB, patch)
2019-10-10 11:57 UTC, Uri Simchoni
no flags Details
fix for Samba 4.11.next (36.42 KB, patch)
2019-10-20 18:10 UTC, Uri Simchoni
abartlet: review+
uri: ci-passed+
Details
fix for Samba 4.10.x (36.79 KB, patch)
2019-10-23 17:24 UTC, Uri Simchoni
uri: review? (abartlet)
metze: review+
uri: ci-passed+
Details
fix for 4.11.next with ldb version bump (59.79 KB, patch)
2019-11-28 21:00 UTC, Uri Simchoni
uri: review? (abartlet)
metze: review+
uri: ci-passed+
Details
pyembed fail log openwrt (4.02 MB, text/plain)
2019-12-04 19:44 UTC, andieq
no flags Details

Description pinglin 2019-03-20 07:05:20 UTC
Created attachment 14955 [details]
workaround for the issue

Overview:

Two problems appeared after the waf upgrade in 4.10:
1) run_prefork_process may not use cross_Popen (the wrapper around Utils.subprocess.Popen)
2) the cross-compile arguments are not taken into account (specifically exec_args, which we get from conf.SAMBA_CROSS_ARGS)

Steps to Reproduce:

    1) ./configure --cross-compile --cross-answers=XXX
    2) the cross-answers will not apply.

Build Date:
    after 4.10.0 rc
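
For readers unfamiliar with the option: a cross-answers file supplies the results of configure checks that would otherwise require running a target binary on the build host. The following is a minimal sketch of such a lookup, not Samba's actual parser. The "message: answer" layout, the OK/YES/NO/FAIL/UNKNOWN keywords, the return codes and the sample check messages are all assumptions made for illustration.

```python
# Simplified sketch of looking up one configure check in a cross-answers file.
# The "message: answer" layout and the answer keywords are assumptions.

def lookup_answer(lines, msg):
    """Return (retcode, output) for msg, or None if there is no usable answer."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, ans = (part.strip() for part in line.split(":", 1))
        if key != msg:
            continue
        if ans in ("OK", "YES"):
            return (0, "")
        if ans in ("FAIL", "NO"):
            return (1, "")
        if len(ans) >= 2 and ans[0] == ans[-1] and ans[0] in "\"'":
            return (0, ans[1:-1])  # quoted string: the program's output
        return None  # e.g. UNKNOWN: still has to be filled in by hand
    return None


# Hypothetical answers for an ARM Linux target.
answers = [
    "# comment lines are skipped",
    'Checking uname sysname type: "Linux"',
    "Checking getconf LFS_CFLAGS: NO",
    "Checking for large file support without additional flags: OK",
    "Checking simple C program: UNKNOWN",
]
print(lookup_answer(answers, "Checking uname sysname type"))
```

When the bug is present, answers like these are never consulted, which matches step 2 above.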
Comment 1 pinglin 2019-04-03 02:23:26 UTC
Created attachment 15040 [details]
fix rpath, related to the previous commit
Comment 2 Jeffery To 2019-06-08 07:31:25 UTC
A few of us (at OpenWrt) have been looking to upgrade the Samba package from 4.9.8 to 4.10.2 and we're seeing this same issue (https://github.com/openwrt/packages/pull/8944). I've tried 4.10.4 with the same results.

I've also tried the changes proposed in this report and they appear to work correctly.

Would be great if this can be addressed soon :-)
Comment 3 Stefan Metzmacher 2019-06-12 10:32:26 UTC
Comment on attachment 14955 [details]
workaround for the issue

>From a197e0cafb276a9b732f914b1f679ebb487b47f1 Mon Sep 17 00:00:00 2001
>From: pinglin <pinglin@synology.com>
>Date: Tue, 19 Mar 2019 20:46:27 +0800
>Subject: [PATCH] cross_compile argument doesn't apply
>
>reproduce:
>	./configure --cross-compile --cross-answers=XXX
>
>The output log now will show correct cross-answers.
>---
> third_party/waf/waflib/Context.py        | 20 ++++++++++++++++++--
> third_party/waf/waflib/Tools/c_config.py | 11 +++++++----
> 2 files changed, 25 insertions(+), 6 deletions(-)

For third_party/waf https://gitlab.com/ita1024/waf is the master repository,
could you submit the changes there, so we can pull them from there later?

Thanks!
Comment 4 Stefan Metzmacher 2019-06-12 10:40:37 UTC
Can you think of a way to extend the samba-xc in script/autobuild.py
in order to trigger the problem?

That would be very useful as a future regression test.
Comment 5 andieq 2019-07-09 11:58:56 UTC
OK, I can confirm that the two patches work for Samba 4.10.6 and it builds again for OpenWrt, though I also needed the new patches from Gentoo/Alpine (musl) for 4.10.

PS: Be aware that there is also a new bug: "waf install --destdir=.. --targets=.." no longer works and does not install the build files into destdir.
Comment 6 Stefan Metzmacher 2019-07-09 16:06:25 UTC
Thomas, would you be able to have a look at these waf-related problems?

Thanks!
Comment 7 Thomas Nagy 2019-07-09 16:44:41 UTC
Propagating that "exec_args" argument to Context.py is highly questionable.

1. Since cross-compilation with a Qemu wrapper is uncommon, can you set the environment variable "export WAF_NO_PREFORK=1" before running such builds?
2. If it is impractical to set such an environment variable before waf is launched, can you instead apply something similar to this to the Samba scripts?
https://gitlab.com/ita1024/waf/blob/master/waflib/Utils.py#L1032
3. Since the argument "args" is already added to the command, what is the point of adding it to exec_args again? Can you elaborate as to which rpath issues are fixed by the following and why?
-             self.generator.bld.cmd_and_log([self.inputs[0].abspath()] + args, env=env)
+             self.generator.bld.cmd_and_log([self.inputs[0].abspath()] + args, env=env, exec_args=args)
4. You may redefine `class test_exec(Task.Task):` in the Samba scripts without changing the Waf files. If the Waf files should be changed, then at least one example should be provided *and* the test should be re-run when the arguments are modified (prevent accidental caching). Would `test_args` make a better name for such parameter?
5. Note: waf contributions are under this license https://gitlab.com/ita1024/waf/blob/master/waf-light#L5 (not the GPL), so Waf patches should be submitted to the Waf issue tracker, not here.
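
Item 1 above can be sketched as a small wrapper that prepares the configure invocation with WAF_NO_PREFORK set in the child environment. This is only an illustration: the function name and the "answers.txt" path are hypothetical, and the command line is the one from the bug description.

```python
import os

def configure_invocation(cross_answers, base_env=None):
    """Return (argv, env) for a cross configure run with pre-forking disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["WAF_NO_PREFORK"] = "1"  # the variable named in item 1 above
    argv = ["./configure", "--cross-compile", "--cross-answers=" + cross_answers]
    return argv, env

# "answers.txt" is a placeholder file name.
argv, env = configure_invocation("answers.txt", base_env={"PATH": "/usr/bin"})
print(argv, env["WAF_NO_PREFORK"])
# To actually run it from a Samba source tree:
#   import subprocess; subprocess.check_call(argv, env=env)
```

Setting the variable in the environment of the child process keeps the change local to that one build, rather than exporting it for the whole shell session.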
Comment 8 pinglin 2019-07-10 05:01:06 UTC
(In reply to Thomas Nagy from comment #7)
Thanks for the suggestions!
I’ll work on this problem soon.
Comment 9 andieq 2019-08-23 10:49:39 UTC
Any update on this? Just retested 4.10.7 and 4.11.0rc2, which still fail to cross-compile.
Comment 10 Neil MacLeod 2019-08-23 13:59:32 UTC
Sorry for the "me too", but I've not had any luck cross-compiling 4.10.y due to this and another bug I've raised[1].

From discussions in #samba-technical I don't think anyone is actively developing/fixing cross-compilation in 4.10.y+, apparently most of the core developers don't use/test cross-compilation.

1. https://bugzilla.samba.org/show_bug.cgi?id=13856
Comment 11 Stefan Metzmacher 2019-08-23 14:53:37 UTC
(In reply to Neil MacLeod from comment #10)

We do care and have added a regression test for it,
but it seems the test wasn't able to detect the problem.

At least I was able to see that

4.9 gives:
cat bin-xe/cross-answers.txt | wc -l
39

and 4.10 gives
cat bin-xe/cross-answers.txt | wc -l
1

using the samba-xc autobuild task
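
The 39-versus-1 observation suggests a cheap regression check: after a cross configure run, verify that the generated answers file contains more than a single entry. The sketch below is one hypothetical way to express that. The function names and the threshold are assumptions, and unlike `wc -l` it skips blank and comment lines.

```python
import os
import tempfile

def count_answers(path):
    """Count non-empty, non-comment lines in a cross-answers file."""
    with open(path) as f:
        return sum(
            1 for line in f
            if line.strip() and not line.lstrip().startswith("#")
        )

def answers_look_populated(path, minimum=2):
    """Heuristic: a file with fewer than `minimum` entries is suspicious."""
    return count_answers(path) >= minimum

# Demonstration with a throwaway file holding a single entry.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cross-answers.txt")
    with open(path, "w") as f:
        f.write("Checking uname sysname type: UNKNOWN\n")
    print(count_answers(path), answers_look_populated(path))
```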
Comment 12 pinglin 2019-08-25 14:55:35 UTC
Created attachment 15424 [details]
add exec_args in test_exec
Comment 13 pinglin 2019-08-25 15:07:32 UTC
(In reply to Thomas Nagy from comment #7)

The Waf files don't need to be changed.
Propagating the "exec_args" argument to Context.py is not needed at all either.

As you mentioned, we just need to set the environment variable (WAF_NO_PREFORK=1) and change test_exec in the Samba scripts.
This lets cross-answers work as before.

BTW, I'm not sure which file is the proper place for the code, so I put it in samba_waf18.py.


Sorry for another question: why does run_regular_process work well in this case while run_prefork_process doesn't?
Comment 14 Thomas Nagy 2019-08-25 20:25:47 UTC
@pinglin:
1. I am glad that you found out about WAF_NO_PREFORK
2. "why run_regular_process can work well in this case but run_prefork_process can't" - the samba scripts override `subprocess.Popen` to execute processes through QEMU
3. Since no reply was posted in a long time, a parameter called "test_args" was added to waf 2.0.18: https://gitlab.com/ita1024/waf/blob/master/waflib/Tools/c_config.py#L662
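
Point 2 can be illustrated outside of waf. An override of `subprocess.Popen` only exists in the interpreter that installed it, so a helper interpreter started separately imports its own, unpatched module. The sketch below assumes that the pre-forked workers are such separate interpreters. The `cross_Popen` class here is an empty stand-in, not Samba's wrapper.

```python
import subprocess
import sys

real_Popen = subprocess.Popen

class cross_Popen(real_Popen):
    """Stand-in for a wrapper that would run commands through an emulator."""

# Install the override in *this* interpreter only, as a build script might.
subprocess.Popen = cross_Popen
try:
    in_parent = subprocess.Popen.__name__

    # A separately started interpreter sees the original class.
    child = real_Popen(
        [sys.executable, "-c",
         "import subprocess; print(subprocess.Popen.__name__)"],
        stdout=subprocess.PIPE,
    )
    in_child = child.communicate()[0].decode().strip()
finally:
    subprocess.Popen = real_Popen  # undo the override

print(in_parent, in_child)
```

Under that assumption, commands launched by a worker bypass the wrapper, while commands launched directly from the main process go through it.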
Comment 15 Lutz Justen 2019-09-12 15:15:33 UTC
I'd like to gently ping this thread. We're looking into uprev'ing to 4.10.7 for Chrome OS and we're also running into this issue. The patches work for us, so we were wondering what needs to be done until the issue can be fixed in Samba. Is upstreaming them to Gentoo for now an option? Is there any risk of breaking things?
Comment 16 andieq 2019-09-18 10:21:48 UTC
Just tested 4.11.0 and hit the same problem, so cross-compiling is still broken on the 4.10 and 4.11 branches.
Comment 17 andieq 2019-09-18 10:39:49 UTC
(In reply to andieq from comment #16)

With WAF_NO_PREFORK=1 set, I get this:

Traceback (most recent call last):
  File "/root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/third_party/waf/waflib/Scripting.py", line 159, in waf_entry_point
    run_commands()
  File "/root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/third_party/waf/waflib/Scripting.py", line 255, in run_commands
    ctx = run_command(cmd_name)
  File "/root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/third_party/waf/waflib/Scripting.py", line 239, in run_command
    ctx.execute()
  File "/root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/third_party/waf/waflib/Configure.py", line 159, in execute
    super(ConfigurationContext, self).execute()
  File "/root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/third_party/waf/waflib/Context.py", line 204, in execute
    self.recurse([os.path.dirname(g_module.root_path)])
  File "/root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/third_party/waf/waflib/Context.py", line 286, in recurse
    user_function(self)
  File "/root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/wscript", line 167, in configure
    conf.RECURSE('dynconfig')
  File "./buildtools/wafsamba/samba_utils.py", line 66, in fun
    return f(*k, **kw)
  File "./buildtools/wafsamba/samba_utils.py", line 481, in RECURSE
    return ctx.recurse(relpath)
  File "/root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/third_party/waf/waflib/Context.py", line 286, in recurse
    user_function(self)
  File "/root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/dynconfig/wscript", line 342, in configure
    value = EXPAND_VARIABLES(conf, dyn_vars[varname])
  File "./buildtools/wafsamba/samba_utils.py", line 356, in EXPAND_VARIABLES
    ret = SUBST_VARS_RECURSIVE(ret, ctx.env)
  File "./buildtools/wafsamba/samba_utils.py", line 324, in SUBST_VARS_RECURSIVE
    string = subst_vars_error(string, env)
  File "./buildtools/wafsamba/samba_utils.py", line 264, in subst_vars_error
    raise KeyError("Failed to find variable %s in %s in env %s <%s>" % (vname, string, env.__class__, str(env)))
KeyError: 'Failed to find variable PERL_LIB_INSTALL_DIR in ${PERL_LIB_INSTALL_DIR}
Comment 18 andieq 2019-09-18 12:59:21 UTC
OK, I got 4.11 to compile, using the patch provided here and setting WAF_NO_PREFORK=1.

However, there is still a problem with asn1_compile and compile_et in an embedded Heimdal build. The embedded version never checks for system copies of asn1_compile and compile_et, so it always builds them, but in a cross-compile it builds target-compatible binaries rather than host-compatible ones.

So the cross-compile build fails with:
[240/241] Linking bin/default/source4/heimdal_build/asn1_compile
[241/241] Linking bin/default/source4/heimdal_build/compile_et
[242/414] Processing source4/heimdal/lib/asn1/kx509.asn1
[243/414] Processing source4/heimdal/lib/asn1/digest.asn1
[244/414] Processing source4/heimdal/lib/hdb/hdb.asn1
[245/414] Processing source4/heimdal/lib/gssapi/mech/gssapi.asn1
/bin/sh: /root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/bin/asn1_compile: cannot execute binary file: Exec format error
/bin/sh: /root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/bin/asn1_compile: cannot execute binary file: Exec format error
/bin/sh: /root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/bin/asn1_compile: cannot execute binary file: Exec format error
/bin/sh: /root/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/samba-4.11.0/bin/asn1_compile: cannot execute binary file: Exec format error

To fix the issue I needed to add the following logic back to source4/heimdal_build/wscript_configure:

def check_system_heimdal_binary(name):
    if conf.LIB_MAY_BE_BUNDLED(name):
        return False
    if not conf.find_program(name, var=name.upper()):
        return False
    conf.define('USING_SYSTEM_%s' % name.upper(), 1)
    return True

check_system_heimdal_binary("compile_et")
check_system_heimdal_binary("asn1_compile")
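
An "Exec format error" like the one above usually means a helper binary was built for the target rather than for the build host. One way to confirm this is to read the machine field from the ELF header of the file. The sketch below is a generic illustration and not part of the Samba build. The helper names are hypothetical and only a few machine codes are listed.

```python
import struct

# A few e_machine values from the ELF specification.
EM_NAMES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}

def elf_machine(header):
    """Return e_machine from the first 20 bytes of an ELF file, else None."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    byte_order = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return machine

def describe(path):
    """Name the architecture a binary on disk was built for."""
    with open(path, "rb") as f:
        machine = elf_machine(f.read(20))
    return EM_NAMES.get(machine, str(machine))

# Fabricated header for illustration: little-endian ELF32, ET_EXEC, EM_ARM.
fake_arm = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 40)
print(EM_NAMES[elf_machine(fake_arm)])
```

Calling `describe()` on bin/asn1_compile in a failing build tree would be expected to report the target architecture.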
Comment 19 andieq 2019-09-20 11:09:27 UTC
OK, the next problem occurs if we try to build with AD-DC support:
--------------------------------------------------------
Checking for program 'python3'                                                    : /root/openwrt/staging_dir/hostpkg/bin/python3
Checking for program 'python'                                                     : /root/openwrt/staging_dir/hostpkg/bin/python3
Checking for program 'python3'                                                    : /root/openwrt/staging_dir/hostpkg/bin/python3
Checking for python version >= 3.4.0                                              : 3.7.4
python-config                                                                     : /root/openwrt/staging_dir/hostpkg/bin/python3-config
Asking python-config for pyembed '--cflags --libs --ldflags --embed' flags        : not found
Asking python-config for pyembed '--cflags --libs --ldflags' flags                : yes
Testing pyembed configuration                                                     : Could not build a python embedded interpreter
The configuration failed
--------------------------------------------------------

Looks like the test tries to link against the host version of libpython3.7.a, instead of the target version. 

--------------------------------------------------------
...
#define PYTHONDIR "/usr/lib/python3.7/site-packages"
#define PYTHONARCHDIR "/usr/lib/python3.7/site-packages"
#define HAVE_PYEMBED 1
...
arm-openwrt-linux-muslgnueabi/bin/ld: /root/openwrt/staging_dir/hostpkg/lib/python3.7/config-3.7/libpython3.7.a: error adding symbols: file format not recognized
--------------------------------------------------------

I'm not sure how or where to define the python3 paths correctly, so that waf uses the host version for the build itself while those tests and the target modules use the target version of python3.

PS: This all worked in samba-4.9 without problems.
Comment 20 Uri Simchoni 2019-10-06 21:50:35 UTC
Created attachment 15511 [details]
a possible fix

Attaching a fix following Thomas Nagy's guidelines - this backports test_args from waf 2.0.18 and disables pre-forking for cross-build without need to set WAF_NO_PREFORK (i.e. it should "just work" and survive waf upgrade).

I'd appreciate if folks can try this out before submitting to master. Submission to master also needs more work to make detection of a future degradation likelier.
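
Disabling pre-forking from a build script, instead of through the environment, could look roughly like the sketch below. It is not taken from the attachment. The attribute names (`run_process`, `run_regular_process`, `get_process`, `alloc_process_pool`, `nada`) are assumed from the waf 2.0.x `Utils.py` branch linked in comment #7, and a stand-in object replaces `waflib.Utils` so the example runs without waf.

```python
from types import SimpleNamespace

def disable_prefork(utils):
    """Route all process launches through the regular (non-prefork) path."""
    utils.run_process = utils.run_regular_process
    utils.get_process = utils.alloc_process_pool = utils.nada

# Stand-in for `waflib.Utils`; in a real script this would be
# `from waflib import Utils` (the attribute names are assumptions).
fake_utils = SimpleNamespace(
    run_regular_process=lambda cmd, kwargs, cargs=None: "regular",
    run_prefork_process=lambda cmd, kwargs, cargs=None: "prefork",
    nada=lambda *args, **kwargs: None,
)
fake_utils.run_process = fake_utils.run_prefork_process

disable_prefork(fake_utils)
print(fake_utils.run_process(["true"], {}))
```

Doing this only when cross-compiling would leave native builds on the default path.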
Comment 21 Stefan Metzmacher 2019-10-07 09:10:53 UTC
Comment on attachment 15511 [details]
a possible fix

Hi Uri,

thanks for looking into this!

Can you please import the latest waf release completely
instead of just one patch?
Comment 22 Thomas Nagy 2019-10-07 18:48:03 UTC
Comment on attachment 15511 [details]
a possible fix

I have not tried it, but this change looks good!
Comment 23 Uri Simchoni 2019-10-10 11:57:06 UTC
Created attachment 15526 [details]
proposed fix with added tests

v2 of the fix is essentially the same, except:
1. it upgrades waf to 2.0.18 instead of backporting the required feature
2. it enhances the test suite to better detect future degradation

If someone has already begun testing the previous patch, please go on; this version is essentially the same.
Comment 24 Neil MacLeod 2019-10-17 19:32:22 UTC
Not sure if this is related to this issue, but now when building with the patches from comment #23 and latest v4.11-test (7f5334a92c4a378f88c0ee8c5fde46dd087a9dc0) I'm getting this build failure:

http://ix.io/1YYW

I see the same failure with 4.11.0.

Without the patches from #23, 4.11.0 fails as follows: http://ix.io/1YYP - it's hard to say if it's any better or worse, as the order of the build is very different now. But the failure is at least different.

I'm cross-compiling on Ubuntu 16.04/x86_64 to arm/32-bit, with a toolchain based on gcc-9.2.0, glibc-2.30, Python 3.7.5, kernel 5.3.5. Heimdal-7.7.0 is built "externally" and is part of the toolchain.

Samba 4.9.13 builds successfully with this same cross-compile toolchain (only difference is that Python 2.7.16 is used instead of Python 3.7.5).

If not related to this issue, I'm happy to open another bug.
Comment 25 Uri Simchoni 2019-10-17 21:07:56 UTC
(In reply to Neil MacLeod from comment #24)

The "proposed fix" is strictly for the cross-answers issue, which is the subject of this bug, but it looks like there are some other issues.

The "previous" failure (http://ix.io/1YYP) is clearly not related, clearly a samba bug, and it's https://bugzilla.samba.org/show_bug.cgi?id=13856.

It's less clear what happens in http://ix.io/1YYW - from the build log it looks like asn1_compile, which is a binary program generated (IIUC) from external heimdal via the heimdal:host dependency, cannot compile the asn1 files of Samba's heimdal. One would think it's some incompatibility between the versions, but clearly this isn't the case because Samba 4.9.x builds just fine with the same asn1_compile (and those files probably haven't changed in years). I suppose a verbose build (make V=1) and a comparison of the asn1_compile commands between Samba 4.9.x and 4.11.x may shed more light on this.

Since buildroot seems to be successfully building Samba 4.10.8, I'll validate this fix by building buildroot without the relevant buildroot patches (the WAF_NO_PREFORK and 0004-cross_compile-fix.patch). I'll also tackle https://bugzilla.samba.org/show_bug.cgi?id=13856 which is clearly a Samba bug and has a patch in buildroot as well - 0006-heimdal_build-wscript_build-do-not-add-host-include-.patch.
Comment 26 Neil MacLeod 2019-10-17 21:34:18 UTC
Thanks Uri.

#13856 is a bug I opened, but with 4.11.0 and the patch in this thread I no longer have that exact failure (which is why I wondered if it is connected) although maybe it's just because the ASN1 syntax error failure is now occurring before the failure I documented in #13856 (ie. it's not fixed, it's just that the build isn't getting as far as it did before).

The only change now is that previously with 4.10.x and the Heimdal failure I had been using Python-2.7.16 to build Heimdal-7.7.0+Samba-4.10.x, but now I'm using Python-3.7.5 to build Heimdal-7.7.0+Samba-4.11.0+patches and I'm experiencing this ASN1 failure. Without the patches, I'm back to #13856 (heimdal "cross-compile badness").

I'll also test the Buildroot heimdal patch as that looks interesting - thanks for the heads up! :)
Comment 27 Neil MacLeod 2019-10-18 17:15:40 UTC
I'm still getting the ASN1 syntax error failures with the patches from #23 and also buildroot[1].

The most obvious issue is that despite configuring Samba 4.11.x with:

--bundled-libraries='ALL,!asn1_compile,!compile_et,!zlib' \

the Samba build is using the bundled asn1_compile, which is failing, and not the toolchain/bin/asn1_compile supplied by heimdal:

/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/samba-7f5334a92c4a378f88c0ee8c5fde46dd087a9dc0/bin/asn1_compile: 1: /home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/samba-7f5334a92c4a378f88c0ee8c5fde46dd087a9dc0/bin/asn1_compile: Syntax error: word unexpected (expecting ")")

It looks like samba-4.11.x is ignoring (or at least not processing correctly) the --bundled-libraries option.
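
The expectation behind that option can be stated as a small predicate: a name prefixed with "!" must never be bundled, even when "ALL" is present. The sketch below only encodes that expectation. It is not Samba's implementation, and the function name is hypothetical.

```python
def may_be_bundled(name, spec):
    """Simplified reading of a --bundled-libraries value such as 'ALL,!zlib'."""
    items = [item.strip() for item in spec.split(",") if item.strip()]
    if "!" + name in items:
        return False  # explicitly excluded: the system copy is expected
    if name in items or "ALL" in items:
        return True
    return "NONE" not in items

spec = "ALL,!asn1_compile,!compile_et,!zlib"
for lib in ("asn1_compile", "compile_et", "talloc"):
    print(lib, may_be_bundled(lib, spec))
```

With the configure line above, both asn1_compile and compile_et come out as "not bundled", which is what the 4.9.13 log reflects.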

In 4.9.13 I see this:

Checking for program compile_et                                                                 : /home/neil/projects/scratch/alternates/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/toolchain/bin/compile_et
Checking for program asn1_compile                                                               : /home/neil/projects/scratch/alternates/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/toolchain/bin/asn1_compile

but in 4.11.x it doesn't log anything for asn1_compile or compile_et.

I'll open a new bug for this ASN1 issue.

1. https://github.com/buildroot/buildroot/blob/8b11b96f41a6ffa76556c9bf03a863955871ee57/package/samba4/0006-heimdal_build-wscript_build-do-not-add-host-include-.patch
Comment 28 Ralph Böhme 2019-10-19 09:34:56 UTC
Björn, can you take a look? IIRC you've been successfully cross-compiling Debian ARM in the past.
Comment 29 Uri Simchoni 2019-10-20 18:10:44 UTC
Created attachment 15559 [details]
fix for Samba 4.11.next

(Taking the bug since the fix has landed in master)
One thing I contemplated regarding the backport was whether to apply the waf upgrade as-is or just cherry-pick the relevant fix. Looking at the content of the diff, I don't see a reason to expect a regression (there's one RPATH-related fix, but it seems to be doing the same thing, only more cautiously).
Comment 30 Andrew Bartlett 2019-10-20 18:19:25 UTC
G'Day Karolin,

Please select for the next available 4.10 and 4.11 releases!

Thanks!
Comment 31 Karolin Seeger 2019-10-23 09:00:47 UTC
(In reply to Andrew Bartlett from comment #30)
Pushed to autobuild-v4-11-test.
Does not apply on current v4-10-test.
Comment 32 Uri Simchoni 2019-10-23 09:09:52 UTC
(In reply to Karolin Seeger from comment #31)
Oops, sorry, I clearly didn't test this on 4.10 (did test on 4.11) - the autobuild extra-testing of the last hunk doesn't apply.
Comment 33 Uri Simchoni 2019-10-23 17:24:30 UTC
Created attachment 15571 [details]
fix for Samba 4.10.x

Attaching a fix for 4.10.next, only the test - patch 5/5 - has changed.

Passed CI - https://gitlab.com/samba-team/devel/samba/pipelines/90853139

Thanks and sorry for not properly testing this.
Comment 34 Andrew Bartlett 2019-10-23 17:58:35 UTC
Because this (and bug 13960) changes code that is common to all the subprojects, all the version numbers of the sub-projects need to be bumped, otherwise we will get complaints that (eg) the internal talloc and the talloc tarball don't build identically.

For the projects that share a version sequence with master, that bump should be in master and then backported, for projects with a forked version sequence (ldb) then a fresh commit in each branch is the right thing.
Comment 35 Uri Simchoni 2019-10-24 12:18:19 UTC
(In reply to Andrew Bartlett from comment #34)
If my comment #9 from bug #13960 is correct, then I'd say we only need to bump the libs with public Python bindings which are (if I'm not mistaken) talloc, tdb, tevent, and ldb - all to make release of those possible and fix bug 13960.

About other libs (e.g. smbclient, wbclient) - I don't think so. Can the mere fact that we've changed something in the build system, without indication that it changes anything in the ABI or behavior of the produced binaries (and indeed it shouldn't change - it's a minor bugfix waf upgrade), be grounds for version bump?
Comment 36 Andrew Bartlett 2019-10-24 17:23:29 UTC
(In reply to Uri Simchoni from comment #35)
Correct, only those libs where the ABI is directly linked to the package version need work.  Things within Samba are just fine as they are.
Comment 37 Stefan Metzmacher 2019-10-24 20:34:23 UTC
(In reply to Andrew Bartlett from comment #36)

Ah, we need the same for waf 2.0.18...

So we need new talloc, tdb, tevent versions in master.
And new ldb versions in 4.10 and 4.11.
I'll have a look at that tomorrow.
Comment 38 Stefan Metzmacher 2019-10-24 20:35:24 UTC
(In reply to Stefan Metzmacher from comment #37)

Do people really need to cross-compile the standalone libraries?
Comment 39 Uri Simchoni 2019-10-24 20:41:54 UTC
(In reply to Stefan Metzmacher from comment #38)
IMO it's unlikely. I'd say the bug report is on Samba, not on libraries (in contrast with bug #13960). Therefore we don't need to "fix" the stand-alone library "cross-compileness" and don't need another version for that.
Comment 40 Jeffery To 2019-10-24 23:10:28 UTC
Not sure if it's relevant, but there are some folks trying to get libtalloc to cross-compile with Python 3 for OpenWrt: https://github.com/openwrt/packages/pull/9686

I believe libtalloc is a dependency for freeradius (and possibly others, I'm not sure).
Comment 41 Uri Simchoni 2019-10-25 08:16:23 UTC
(In reply to Jeffery To from comment #40)

Yes, I stand corrected. OpenWRT uses libtalloc as an independent component, and they cross-compile it using a cross-answers file - https://github.com/openwrt/packages/blob/master/libs/libtalloc/Makefile

So if we want the latest Samba lib to use same build process as latest Samba, we need a new release.

In practical terms, libtalloc 2.3.0 cross-build (based on waf 2.0.17) is not broken because libtalloc is platform-neutral.
Comment 42 Uri Simchoni 2019-11-16 20:55:38 UTC
So, assuming people need cross-building libraries (there's one example, although it's for talloc and talloc doesn't care much about results of cross-answers), here's what we need to do:

master:
- talloc - bump to 2.3.1
- tdb - bump to 1.4.3
- tevent - bump to 0.10.2
- ldb - is at 2.1.0, which hasn't been released, we can have this in release notes, nothing to do right now

4.11 branch:
- talloc - bump to 2.2.1
- tdb - cherry-pick 1.4.3 bump from master
- tevent - cherry-pick both 0.10.1 and 0.10.2 version bumps from master (0.10.1 version bump wasn't backported)
- bump ldb to 2.0.8

4.10 branch:
- bump talloc to 2.1.17
- bump tdb to 1.3.19
- bump tevent to 0.9.40
- bump ldb to 1.5.7

New releases to be made:
talloc 2.3.1 (off master)
talloc 2.2.1 (off v4-11-test)
talloc 2.1.17 (off v4-10-test)
tdb 1.4.3 (off master)
tdb 1.3.19 (off v4-10-test)
tevent 0.10.2 (off master)
tevent 0.9.40 (off v4-10-test)
ldb 2.1.0 (off master, towards Samba 4.12 release)
ldb 2.0.8 (off v4-11-test)
ldb 1.5.7 (off v4-10-test)

What I need to do is:
1. prepare version bump patches for master
2. once landed, prepare an updated fix for 4.11.next, including both the actual fix and the version bumps (some cherry-picks, some new)
3. also prepare a new patch for 4.10.next including the actual fix and the version bumps (nothing is cherry-picked)

I'll get working on the master bit, but please shout if this is wrong.
Comment 43 Stefan Metzmacher 2019-11-18 12:02:06 UTC
(In reply to Uri Simchoni from comment #42)

Why a special talloc version from 4.11? Can't we cherry-pick from master?
Comment 44 Stefan Metzmacher 2019-11-18 12:03:43 UTC
(In reply to Stefan Metzmacher from comment #43)

The talloc, tevent, tdb versions from 4.10 are needed because of the python2 removal in 4.11, correct?
Comment 45 Uri Simchoni 2019-11-18 12:16:28 UTC
(In reply to Stefan Metzmacher from comment #43)

The fix itself (upgrading waf and adapting for the use of new waf) is done by cherry-picking from master.

Then there's the story of releasing a fix to talloc-2.2.0, which builds using waf 2.0.17 and hence uses a different waf than samba 4.11.next, and also doesn't handle cross-compilation right. To do that I can use the v4-11 branch (which will also have the cherry-picked fixes) and modify talloc version there from 2.2.0 to 2.2.1. Cherry-picking the version change from master would get me to 2.3.1 with a library that's different from master's 2.3.1.
Comment 46 Uri Simchoni 2019-11-18 12:19:39 UTC
(In reply to Stefan Metzmacher from comment #44)
The motive is to have libraries that are built with same waf version as the main Samba project (consider packagers which first build the libraries and then build samba with "system" libraries), and also to issue a bugfix version of the libraries to fix the cross-compilation issue (e.g. tdb 1.3.19 uses waf 2.0.17 so it has the cross-compilation issue)
Comment 47 Stefan Metzmacher 2019-11-18 12:40:55 UTC
On 18.11.19 at 13:16, samba-bugs@samba.org wrote:
> https://bugzilla.samba.org/show_bug.cgi?id=13846
> 
> --- Comment #45 from Uri Simchoni <uri@samba.org> ---
> (In reply to Stefan Metzmacher from comment #43)
> 
> The fix itself (upgrading waf and adapting for the use of new waf) is done by
> cherry-picking from master.
> 
> Then there's the story of releasing a fix to talloc-2.2.0, which builds using
> waf 2.0.17 and hence uses a different waf than samba 4.11.next, and also
> doesn't handle cross-compilation right. To do that I can use the v4-11 branch
> (which will also have the cherry-picked fixes) and modify talloc version there
> from 2.2.0 to 2.2.1. Cherry-picking the version change from master would get me
> to 2.3.1 with a library that's different from master's 2.3.1.

Why? Just backport all fixes from master, so that lib/talloc matches
between master and 4.11.
Comment 48 Uri Simchoni 2019-11-18 12:44:40 UTC
(In reply to Stefan Metzmacher from comment #47)
But the libraries are not the same - _pytalloc_get_name() was added in master and libtalloc advanced from 2.2.0 to 2.3.0.
Comment 49 Stefan Metzmacher 2019-11-18 13:12:19 UTC
On 18.11.19 at 13:44, samba-bugs@samba.org wrote:
> https://bugzilla.samba.org/show_bug.cgi?id=13846
> 
> --- Comment #48 from Uri Simchoni <uri@samba.org> ---
> (In reply to Stefan Metzmacher from comment #47)
> But the libraries are not the same - _pytalloc_get_name() was added in master
> and libtalloc advanced from 2.2.0 to 2.3.0.
> 

Something being added is not a problem!

However I found a possible problem in talloc-2.3.0.

commit ac23eeb41c3d27d710722f94e22dd84410d183d3
Author: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Date:   Sun Jul 7 12:34:37 2019 +1200

    talloc/py_util: remove tautologically dead code

    Being careful is good and all, but if we don't trust the

           static PyTypeObject *type = NULL;

    two lines up, we need to reconsider our entire software universe.

    Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
    Reviewed-by: Gary Lockyer <gary@catalyst.net.nz>
    Reviewed-by: Andrew Bartlett <abartlet@samba.org>

@@ -32,10 +32,6 @@ _PUBLIC_ PyTypeObject *pytalloc_GetObjectType(void)
        static PyTypeObject *type = NULL;
        PyObject *mod;

-       if (type != NULL) {
-               return type;
-       }
-
        mod = PyImport_ImportModule("talloc");

Looks strange. This wasn't dead code! I'm wondering if
Douglas really noticed the word 'static'.

I'd like to resolve that first and guess we need talloc-2.3.2 for master
and then backport it to v4-11-test.
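
The role of `static` in the removed check can be shown with a Python analogy (the original is C, so this is only an analogy). A module-level variable keeps its value between calls, so the early return is reached on every call after the first. The names and the call counter are hypothetical.

```python
import_calls = 0

def expensive_lookup():
    """Stand-in for PyImport_ImportModule("talloc") and the type lookup."""
    global import_calls
    import_calls += 1
    return object

_cached_type = None  # plays the role of `static PyTypeObject *type = NULL;`

def get_object_type():
    global _cached_type
    if _cached_type is not None:  # the check the commit removed
        return _cached_type
    _cached_type = expensive_lookup()
    return _cached_type

for _ in range(3):
    get_object_type()
print(import_calls)
```

With the check in place the lookup runs once. Without it, every call would repeat the lookup.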
Comment 50 Gary Lockyer 2019-11-18 20:51:42 UTC
Well I certainly did (missed the static).
I don't think it introduces an issue.
Comment 51 Douglas Bagnall 2019-11-18 20:56:13 UTC
(In reply to Stefan Metzmacher from comment #49)
> I'm wondering if Douglas really noticed the word 'static'.

Oh no! damn. yes. Sorry.
Comment 52 Uri Simchoni 2019-11-28 21:00:10 UTC
Created attachment 15640 [details]
fix for 4.11.next with ldb version bump

Posting an updated cherry-pick fix for 4.11 with ldb version bump (the rest is the same as before)
The other libs seem not to require a version bump - 4.11 can use the latest libs released off the master branch. It's only desirable to release those libs (talloc 2.3.1, tevent 0.10.2, tdb 1.4.3) before the Samba 4.11 release that has this fix.

The plan is not to fix this for 4.10.x because that would require 4 library releases for Python2 compatibility. The 4.10.x fix in this bug is good for anyone that wants to apply it manually.
Comment 53 andieq 2019-11-29 14:42:12 UTC
@Uri Simchoni, OK, I tested your latest patch and it works if I also apply the patch for 14164.
Unfortunately 14164 and 14132 are still preventing us from switching directly from 4.9 to 4.11.

https://bugzilla.samba.org/show_bug.cgi?id=14164
https://bugzilla.samba.org/show_bug.cgi?id=14132
Comment 54 Uri Simchoni 2019-12-01 05:32:15 UTC
Assigning to Karolin for 4.11.next.

Notice again that we're not releasing a 4.10.x version with this fixed due to library (ldb/tdb/talloc/tevent) hassles, but users can use the 4.10.x fix posted on this bug report.
Comment 55 Karolin Seeger 2019-12-03 11:36:35 UTC
(In reply to Uri Simchoni from comment #54)
Pushed to autobuild-v4-11-test.
Comment 56 andieq 2019-12-04 19:44:46 UTC
Created attachment 15664 [details]
pyembed fail log openwrt
Comment 57 andieq 2019-12-04 19:48:42 UTC
(In reply to andieq from comment #53)

I just retested with 4.11.2 + patch (15640) + python3 support (AD-DC) and it still fails, while 4.9.16 works fine.

see log: https://bugzilla.samba.org/attachment.cgi?id=15664
Comment 58 Neil MacLeod 2019-12-05 01:06:35 UTC
v4-11-test[1] builds successfully for x86_64 and ARM targets using these LibreELEC commits[2].

The "fix ASN1" patch is still required for a LibreELEC build, as this fixes the Heimdal bundled/system issue[3].

The manual install workaround for the waf install issue[4] is also still required, as is the workaround to avoid a symbolically linked LOCKDIR[5].

LibreELEC doesn't build with AD-DC support, so this isn't anything I've tested.

1. https://github.com/samba-team/samba/tree/91f39dbda151f6a2768b6e5eff59f931f303721f

2. https://github.com/LibreELEC/LibreELEC.tv/compare/master...MilhouseVH:le10_samba_4.11-test

3. https://bugzilla.samba.org/show_bug.cgi?id=14164

4. https://bugzilla.samba.org/show_bug.cgi?id=14132

5. https://bugzilla.samba.org/show_bug.cgi?id=14166
Comment 59 andieq 2019-12-05 11:33:42 UTC
OK, I got the AD-DC build working by mimicking what buildroot does:

>CONFIGURE_VARS += python_LDFLAGS="" python_LIBDIR=""
>CONFIGURE_VARS += \
>	PYTHON="$(HOST_PYTHON3_BIN)" \
>	PYTHON_CONFIG="$(STAGING_DIR_ROOT)/usr/bin/python3-config"

So we need to mix the host/bin and target/config.
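
The same mix of host interpreter and target python-config can be written as a small helper that builds the configure variables. This is a sketch of the idea rather than the OpenWrt or buildroot makefile logic. The function name and both example paths are hypothetical placeholders.

```python
import posixpath

def cross_python_vars(host_python, staging_root):
    """Mix a host interpreter with the target's python3-config."""
    return {
        "PYTHON": host_python,  # runs on the build machine
        "PYTHON_CONFIG": posixpath.join(
            staging_root, "usr", "bin", "python3-config"  # describes the target
        ),
        "python_LDFLAGS": "",
        "python_LIBDIR": "",
    }

# Both paths are placeholders, not real OpenWrt locations.
variables = cross_python_vars("/opt/host/bin/python3", "/opt/staging")
for key, value in variables.items():
    print("%s=%s" % (key, value))
```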
Comment 60 Karolin Seeger 2019-12-17 12:21:25 UTC
Re-assigning to Douglas as there seem to be some issues left.
Comment 61 Neil MacLeod 2019-12-17 16:20:55 UTC
Many thanks. With 4.11.4 I'm able to drop the patch from this issue, and also the Heimdal cross-compile fix discussed in 13856[1], so both 13846 and 13856 are now fixed for me.

I still need the ASN1 patch from 14164[2]. And installation using waf is still an issue with 4.11.4[4].

LibreELEC adaptations for Samba 4.11.4 are here[3].

1. https://bugzilla.samba.org/show_bug.cgi?id=13856#c11
2. https://bugzilla.samba.org/show_bug.cgi?id=14164#c6
3. https://github.com/LibreELEC/LibreELEC.tv/pull/4064
4. https://bugzilla.samba.org/show_bug.cgi?id=14132
Comment 62 Uri Simchoni 2019-12-18 08:25:14 UTC
The specific issue of cross-answers is fixed now in 4.11.4. There are other issues in this thread which relate to other bugs. Closing this bug.