Bug 13856 - Samba 4.10.0 cross-compile issue when compiling Heimdal
Summary: Samba 4.10.0 cross-compile issue when compiling Heimdal
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Build
Version: 4.10.0
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Andrew Bartlett
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-22 12:57 UTC by Neil MacLeod
Modified: 2019-12-19 09:22 UTC (History)
CC List: 5 users

See Also:


Attachments
4.9.5 configuration (5.92 KB, text/plain)
2019-04-03 00:29 UTC, Neil MacLeod
no flags
4.10.0 configuration (5.95 KB, text/plain)
2019-04-03 00:29 UTC, Neil MacLeod
no flags
4.9.5 build log (Successful) (551.38 KB, text/plain)
2019-04-03 00:31 UTC, Neil MacLeod
no flags
4.10.0 build log (Failure) (250.24 KB, text/plain)
2019-04-03 00:31 UTC, Neil MacLeod
no flags
Samba 4.10.6 build failure (248.92 KB, application/octet-stream)
2019-07-09 19:15 UTC, Neil MacLeod
no flags
git-am fix for 4.11.next and 4.10.next (1.76 KB, patch)
2019-12-01 18:53 UTC, Uri Simchoni
abartlet: review+
uri: ci-passed+

Description Neil MacLeod 2019-03-22 12:57:03 UTC
We build Samba for LibreELEC, a custom Linux distribution, cross-compiling with our own gcc-8.3.0 toolchain.

We patch gcc to detect incorrect build host include usage when building for the target[1].

We are able to build Samba 4.9.5 with Heimdal 7.5.0[2] without issue (successful build log[3]) - all is good.

However, we are seeing a cross-compile problem with Samba 4.10.0[4], which now triggers a cross-compilation failure when building the "embedded" Heimdal.

Specifically, /usr/include/heimdal is being used by the Samba 4.10.0 build process which is incorrect when cross-compiling, and this did not happen with Samba 4.9.5.
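
To illustrate the class of problem, here is a minimal shell sketch (not the LibreELEC gcc patch itself) that scans a compiler command line for -I paths pointing at host include directories instead of the target sysroot. The function name find_host_includes and the example sysroot path are hypothetical, and only the joined -Ipath form is handled.

```shell
# Hypothetical helper: print every -I path that points at a host include
# directory rather than inside the target sysroot.
# Usage: find_host_includes SYSROOT [compiler args...]
find_host_includes() {
    sysroot=$1
    shift
    for arg in "$@"; do
        case $arg in
            -I"$sysroot"/*)
                ;;  # inside the sysroot: acceptable when cross-compiling
            -I/usr/include|-I/usr/include/*|-I/usr/local/include|-I/usr/local/include/*)
                printf '%s\n' "${arg#-I}"  # host path: report it
                ;;
        esac
    done
}
```

For example, find_host_includes /opt/sysroot -I/opt/sysroot/usr/include -I/usr/include/heimdal would report only /usr/include/heimdal.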

This is the Samba 4.9.5 configuration: http://ix.io/1DXj

And this is the Samba 4.10.0 configuration: http://ix.io/1DXi (I'm including the fix for https://bugzilla.samba.org/show_bug.cgi?id=13844)

Any ideas why this cross-compile problem happens with Samba 4.10.0 but does NOT happen with Samba 4.9.5?

Note: The "link" error and truncation warnings may also be of interest - the truncation warning is also present in 4.9.5 (search for "_heim_time2generalizedtime").

1. https://github.com/LibreELEC/LibreELEC.tv/blob/master/packages/lang/gcc/patches/gcc-crosscompile-badness.patch
2. https://github.com/LibreELEC/LibreELEC.tv/blob/master/packages/devel/heimdal/package.mk
3. http://ix.io/1E26
4. http://ix.io/1E1W
Comment 1 David Disseldorp 2019-03-22 13:22:15 UTC
As discussed on IRC, the explicit include path added via source4/heimdal_build/wscript_build appears to be responsible for the failure, but I don't know why it wasn't causing issues with the 4.9 build.
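
One quick way to look for such an explicit path is to search the wscript files for hard-coded host include directories. This is only a sketch: the helper name is hypothetical, and it assumes a grep that supports -r and --include (GNU and BSD grep do).

```shell
# Hypothetical helper: list lines in wscript* files under a source tree
# that mention a hard-coded /usr/include path.
# Usage: find_hardcoded_host_includes SOURCE_DIR
find_hardcoded_host_includes() {
    grep -rn --include='wscript*' -e '/usr/include' "$1"
}
```

Run from the top of a Samba checkout, find_hardcoded_host_includes source4/heimdal_build prints file:line:text for each match, and returns non-zero if there are none.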

I'm also a little confused as to why heimdal is even built, given the --without-winbind --without-ads --without-ad-dc configure invocation.
Comment 2 Andrew Bartlett 2019-04-03 00:12:58 UTC
(In reply to David Disseldorp from comment #1)
We still use the internal heimdal unless --with-system-mitkrb5 or --with-system-heimdalkrb5

This specifically allowed us to upgrade our minimum kerberos requirement because we could fall back on the internal one (a bit like the way we use third_party).
Comment 3 Neil MacLeod 2019-04-03 00:29:19 UTC
Created attachment 15036 [details]
4.9.5 configuration
Comment 4 Neil MacLeod 2019-04-03 00:29:51 UTC
Created attachment 15037 [details]
4.10.0 configuration
Comment 5 Neil MacLeod 2019-04-03 00:31:21 UTC
Created attachment 15038 [details]
4.9.5 build log (Successful)
Comment 6 Neil MacLeod 2019-04-03 00:31:50 UTC
Created attachment 15039 [details]
4.10.0 build log (Failure)
Comment 7 Neil MacLeod 2019-07-09 19:15:08 UTC
Created attachment 15296 [details]
Samba 4.10.6 build failure

This is still a problem when building 4.10.6.

4.9.11 builds fine.
Comment 8 andieq 2019-07-10 13:20:32 UTC
(In reply to Neil MacLeod from comment #7)

You need this patch for musl as well: https://github.com/Andy2244/openwrt-extra/blob/master/samba4/patches/006-samba-4-10-musl_rm_unistd_incl.patch

Just got 4.10.6 working for openwrt, so maybe compare our version/patches with your LibreELEC version.
Comment 9 Neil MacLeod 2019-07-10 17:21:06 UTC
Hi @andieq, many thanks for the suggestion but unfortunately it had no effect - Samba 4.10.x continues to fail as before. To be honest I wasn't expecting it to fix this issue as we build with glibc.

This is my Samba 4.10 development branch:

https://github.com/LibreELEC/LibreELEC.tv/compare/master...MilhouseVH:le10_samba-4.10.0

There's not much change from 4.9.11, but Samba 4.10.x no longer cross-compiles because the WAF build system pulls in the host includes rather than those from the sysroot. Note that the failure is raised by a custom modification to gcc, so you may not see it in your build system:

https://github.com/LibreELEC/LibreELEC.tv/blob/master/packages/lang/gcc/patches/gcc-crosscompile-badness.patch

We are using an external Heimdal (heimdal-7.7), but that hasn't changed from 4.9.x.

I don't think additional cross-answers are required by 4.10, the cross-answers file I'm using is here:

https://github.com/MilhouseVH/LibreELEC.tv/blob/17eb8c954d1742e7b89e4bdb36e8714af7f34156/packages/network/samba/config/samba4-cache.txt

Looking at your repo it appears you didn't need to make any cross-answer modifications.
Comment 10 andieq 2019-07-11 08:15:22 UTC
(In reply to Neil MacLeod from comment #9)

Ah, I just noticed the "link" redefinition, which I also got on musl, so I had to grab the Alpine patch for it. I was assuming you guys also use musl; sorry this didn't work for you.

Yeah, no cross-answer changes. I needed the 2 waf cross-compile patches, the gnutls patch and the musl one. Other than that, "waf install --destdir" wasn't working anymore, so now I have to manually grab everything from /bin....

I also could not get the AD_DC version building, because it pulls in mixed libs (target/host) when it tries to build the "python embedded interpreter". Something is different from python2 to 3....
Comment 11 Neil MacLeod 2019-10-19 14:59:32 UTC
The Heimdal patch from buildroot[1] is an effective workaround for this bug when cross-compiling, and allows Samba 4.11.1 to cross-compile successfully (at least on x86_64 -> x86_64, due to bug #14164). This is only a workaround, however, and probably not an ideal fix as presumably it may cause issues when not cross-compiling.

1. https://github.com/buildroot/buildroot/blob/8b11b96f41a6ffa76556c9bf03a863955871ee57/package/samba4/0006-heimdal_build-wscript_build-do-not-add-host-include-.patch
Comment 12 Andrew Bartlett 2019-10-19 19:26:24 UTC
(In reply to Neil MacLeod from comment #11)
That patch looks correct.

I disagree with Jelmer and the patch 257e259a2603 that introduced this.

While we may wish to build against a system Heimdal, in that case we should not be building asn1_compile.  Building against a system roken but not a full Heimdal is not a supported use case. 

Can someone submit this as a Merge Request?

Thanks!

Andrew Bartlett
Comment 13 Neil MacLeod 2019-10-19 23:58:56 UTC
Thanks Andrew.

There's another buildroot patch which might be of interest as it relates to this bug as well:

https://github.com/buildroot/buildroot/blob/593a60f7f0fa1489175700c7b2eda0666347faba/package/samba4/0005-fix_unistd_incl.patch

It fixes the "'link' redeclared as different kind of symbol" error mentioned in the first comment.

source4/heimdal/lib/asn1/asn1_err.c:47:23: error: 'link' redeclared as different kind of symbol
   47 | static struct et_list link = { 0, 0 };
      |                       ^~~~
In file included from ../../lib/replace/../replace/replace.h:172,
                 from ../../source4/heimdal_build/config.h:10,
                 from source4/heimdal/lib/asn1/asn1_err.c:1:
/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/toolchain/armv7ve-libreelec-linux-gnueabi/sysroot/usr/include/unistd.h:789:12: note: previous declaration of 'link' was here
  789 | extern int link (const char *__from, const char *__to)
      |            ^~~~

Waf: Leaving directory `/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/samba-4.11.1/bin/default'
Build failed
 -> task in 'HEIMDAL_HEIM_ASN1' failed with exit status 1:
        {task 76480912: c asn1_err.c -> asn1_err.c.60.o}


^^^ above is from 4.11.1.
Comment 14 Andrew Bartlett 2019-10-20 00:48:24 UTC
(In reply to Neil MacLeod from comment #13)
I can't see that symbol in a current master build without cross-compile or using the host heimdal for compile_et. 

Are you sure this isn't introduced by fixing bug 14164 the wrong way?
Comment 15 Neil MacLeod 2019-10-20 01:15:32 UTC
I can't really tell you where the issue is coming from, only that it's present in 4.10.x, 4.11.x and also 4.12.0 (master), and the buildroot patch I mentioned in c#13 is essential. All these branches will fail when cross-compiling without the buildroot fix.

This is not an issue in 4.9.x.

Just to be clear: I'm cross-compiling to arm on x86_64, with heimdal-7.7.0 providing the x86_64 asn1_compile and compile_et binaries.
Comment 16 Uri Simchoni 2019-10-20 05:47:21 UTC
(In reply to Neil MacLeod from comment #13)
This patch doesn't look right, because it removes unistd.h inclusion on "some" systems and not on others. It also already creates issues with master (you don't see them yet as you're building 4.11 / 4.10).

Maybe to get to the bottom of this issue, we need to understand on which toolchain it appears. The patch originates as a musl patch - are you using musl? If so, maybe the correct thing would be to open a bug on building with musl. Given enough time I'll also get there (need to try my glibc-based buildroot build without this patch, and if successful, try musl, see if the build breaks, and figure it out then).
Comment 17 Neil MacLeod 2019-10-20 05:56:38 UTC
(In reply to Uri Simchoni from comment #16)

My target toolchain is based on glibc-2.30, gcc-9.2.0, kernel 5.3.5.

Musl is not used.

LibreELEC is the distribution I'm building (for x86_64 and arm).
Comment 18 Andrew Bartlett 2019-10-20 06:00:37 UTC
(In reply to Neil MacLeod from comment #17)
I'm quite sure the 'link' issue is from using a non-Samba compile_et.

I'm sceptical that it is from Heimdal, it isn't in Heimdal 7.7.0's compile_et, but is in that old MIT one.
Comment 19 Neil MacLeod 2019-10-20 07:17:16 UTC
We're only building Heimdal, there's no MIT (unless it's installed as standard on Ubuntu 16.04 and is leaking over from the host) - it's certainly not a part of the LibreELEC build.

It looks like e2fsprogs is installing a shell script named compile_et, and we're not actually installing the compile_et binary from Heimdal - which has surprised me (sorry about that)... but we are definitely installing the asn1_compile binary from Heimdal.

I'm not sure if the e2fsprogs compile_et shell script is the source of the "'link' redeclared" error, could it be...? Why wouldn't we get this with 4.9.13?

I can switch to using the COMPILE_ET / ASN1_COMPILE environment variables to specify the exact locations of the Heimdal compile_et/asn1_compile binaries (I'll call it heimdal_compile_et to avoid any confusion with e2fsprogs). I'll re-run my 4.10.x/4.11.x tests to see if that helps (won't be until later today).
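
A small check along these lines can show which kind of compile_et is being picked up. This is a sketch only: tool_kind is a hypothetical helper, it assumes a head that supports -c (GNU, BSD and BusyBox do), and it merely distinguishes "#!" scripts from everything else rather than identifying Heimdal's binary specifically.

```shell
# Hypothetical helper: classify a tool by its first two bytes.
# Prints "script" for a "#!" file, "binary" for anything else,
# and "missing" (with a non-zero status) if the file does not exist.
tool_kind() {
    if [ ! -f "$1" ]; then
        echo missing
        return 1
    fi
    if [ "$(head -c 2 "$1")" = '#!' ]; then
        echo script
    else
        echo binary
    fi
}
```

Calling tool_kind "$(command -v compile_et)" in the build environment reports whether the first compile_et on PATH is a shell script, such as the one e2fsprogs installs, or a compiled program.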
Comment 20 Andrew Bartlett 2019-10-20 07:29:00 UTC
(In reply to Neil MacLeod from comment #19)
So compile_et comes from com_err which is as old as the hills and it seems e2fsprogs uses it as well. 

That is likely where you get the link symbol, and the rest likely comes from the increasingly strict warnings Samba is now forcing on.
Comment 21 Neil MacLeod 2019-10-20 07:32:58 UTC
(In reply to Andrew Bartlett from comment #20)

Great, that sounds promising as I can work around that my end by ensuring we use the Heimdal compile_et binary that I thought we were already using (by specifying COMPILE_ET env to Samba etc.)

I'll hopefully have better news later today!
Comment 22 Neil MacLeod 2019-10-20 19:49:26 UTC
(In reply to Andrew Bartlett from comment #20)

Wow, so sorry about leading you all on this wild goose chase.

The link error does indeed go away if building 4.10.x/4.11.x with compile_et from Heimdal rather than compile_et from e2fsprogs. Arggghhh...! :(

Many thanks for putting up with all my nonsense in this bug, I should have spotted we were using the "wrong" compile_et much sooner.

So, now, by adding COMPILE_ET/ASN1_COMPILE and using the Heimdal binaries, I'm able to successfully build 4.10.9 and 4.11.1 (although neither installs anything - due to bug #14132).

When building 4.10.9 I'm using only these two patches:

1. heimdal cross-compile fix[1]
2. cross-answers fix[2]

And when building 4.11.1 I'm using only these three patches:

1. heimdal cross-compile fix[1]
2. waf 2.0.18 update[3]
3. ASN1 fix[4]

If it would be possible for someone to take a look at bug #14132 I might be able to run-time test what I've now built! :)

Thanks again...

1. https://github.com/buildroot/buildroot/blob/5e968678fd83c2247f1a1ddd7434a9ce3a2019aa/package/samba4/0006-heimdal_build-wscript_build-do-not-add-host-include-.patch
2. https://github.com/buildroot/buildroot/blob/5e968678fd83c2247f1a1ddd7434a9ce3a2019aa/package/samba4/0004-cross_compile-fix.patch
3. https://bugzilla.samba.org/show_bug.cgi?id=13846#c23
4. https://bugzilla.samba.org/show_bug.cgi?id=14164#c6
Comment 23 Uri Simchoni 2019-10-21 08:24:33 UTC
(In reply to Neil MacLeod from comment #22)
Thank you so much for your efforts in evaluating and testing!

The cross-answers fix has landed in master and will likely get into 4.11.next. The Heimdal /usr/include fix will also likely find its way upstream and get backported; I've contacted the original author of the patch. The asn1 patch may be here to stay for a while. The fact that we're now using the most recent version of waf makes it easier to fix, because we can use the documentation, but it does require some work.

Regarding bug #14132, I haven't looked closely at it yet, but did notice that make install seems to be working OK with buildroot, perhaps you can compare.

Thanks,
Uri.
Comment 24 Neil MacLeod 2019-10-21 20:46:55 UTC
(In reply to Uri Simchoni from comment #23)
Hi Uri 

Thanks for the update - I'll keep an eye on the merges and hopefully I can drop patches over time.

Regarding bug#14132, I've taken a quick look at Buildroot[1] (apparently unaffected by bug#14132) and OpenWrt[2] (affected by bug#14132 - their Samba maintainer opened the bug).

The obvious difference is that Buildroot appears to be using "make install" while OpenWrt (and also LibreELEC) are using "buildtools/bin/waf install".

It's possible that "make install" is working fine, but "buildtools/bin/waf install" is somehow broken, which would explain why only OpenWrt and LibreELEC are affected.

For now, I've added a (hopefully temporary) manual installation step to LibreELEC[3] which has allowed me to run-time test 4.11.1 - which has proven to be almost flawless (only bug#14166 so far!) :)

Probably best to continue the discussion in bug#14132 - if you have anything you'd like me to test please give me a shout.

1. https://github.com/buildroot/buildroot/blob/77ffd39c31aa06ce77dbe71420db626b5c2da2fd/package/samba4/samba4.mk#L145-L151
2. https://github.com/openwrt/packages/blob/299e5b0a9bce19d6e96cb9ff217028b36ee2dd36/net/samba4/Makefile#L382-L390
3. https://github.com/LibreELEC/LibreELEC.tv/commit/dca13bb5d869cbcca9b41231255d41b409c283bd
Comment 25 Neil MacLeod 2019-10-22 00:17:02 UTC
(In reply to Uri Simchoni from comment #23)

bug#14133 may also be relevant, as I see something similar, and it explains why OpenWrt doesn't perform a "buildtools/bin/waf build" step for target.
Comment 26 Uri Simchoni 2019-12-01 18:53:26 UTC
Created attachment 15656 [details]
git-am fix for 4.11.next and 4.10.next

Added patch for 4.10.x and 4.11.x.
Comment 27 Karolin Seeger 2019-12-03 11:38:40 UTC
(In reply to Uri Simchoni from comment #26)
Pushed to autobuild-v4-{11,10}-test.
Comment 28 andieq 2019-12-04 09:59:03 UTC
(In reply to Neil MacLeod from comment #25)

I also just added our version of the compile_et fix, since com_err and e2fsprogs also interfered with our krb5.

I just added a suffix and patched the 'check_system_heimdal_binary()' call, while statically adding the com_err lib, so it can't break stuff that links against our default com_err from e2fsprogs.
See: https://github.com/openwrt/packages/pull/10701/files

Regarding bug#14132, this seems related to 'waf install' while also using --targets. I may end up doing the same as LibreELEC and manually copying the symlinked libs/bins from the 'build_dir/bin', since it seems this works fine for LibreELEC?
Comment 29 Neil MacLeod 2019-12-04 10:16:08 UTC
(In reply to andieq from comment #28)

I haven't yet tested the latest patches now that they've landed in test (I will do, soon), but specifying the custom name in COMPILE_ET and ASN1_COMPILE vars is working just fine for LibreELEC[1,2] (and avoids patching Samba).

I've lashed together a temporary solution to work around the waf install issue[3], it's also working fine (but not ideal, obviously).

1. https://github.com/LibreELEC/LibreELEC.tv/blob/35878b552038aa4bcfd92105d1f270c4bd2b76c5/packages/devel/heimdal/package.mk#L33-L34
2. https://github.com/LibreELEC/LibreELEC.tv/blob/35878b552038aa4bcfd92105d1f270c4bd2b76c5/packages/network/samba/package.mk#L110-L111
3. https://github.com/LibreELEC/LibreELEC.tv/blob/35878b552038aa4bcfd92105d1f270c4bd2b76c5/packages/network/samba/package.mk#L127-L153
Comment 30 andieq 2019-12-04 11:55:36 UTC
(In reply to Neil MacLeod from comment #29)
export COMPILE_ET=$TOOLCHAIN/bin/heimdal_compile_et
export ASN1_COMPILE=$TOOLCHAIN/bin/heimdal_asn1_compile

Ah nice, that's cleaner than what i did, slipped my mind that this could work too.
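
As a guard against silently falling back to another compile_et, a build script could verify both variables before configuring Samba. The following is a sketch under assumptions: check_heimdal_tools is a hypothetical helper, and it only confirms that each variable names an executable file, not that the file is really the Heimdal tool.

```shell
# Hypothetical helper: fail early if COMPILE_ET or ASN1_COMPILE is unset
# or does not point at an executable file.
check_heimdal_tools() {
    for var in COMPILE_ET ASN1_COMPILE; do
        eval "path=\${$var:-}"  # indirect lookup of the variable named in $var
        if [ -z "$path" ] || [ ! -x "$path" ]; then
            echo "$var is unset or not executable: ${path:-<unset>}" >&2
            return 1
        fi
    done
}
```

Calling check_heimdal_tools after the two exports, and aborting the package build if it fails, would surface a missing or misnamed tool before the configure step rather than deep inside the Heimdal compile.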
Comment 31 Karolin Seeger 2019-12-17 12:24:20 UTC
(In reply to Karolin Seeger from comment #27)
Pushed to both branches.
Comment 32 Karolin Seeger 2019-12-17 12:27:18 UTC
Re-assigning to Andrew (I am not sure if it's really fixed now or not).
Comment 33 andieq 2019-12-17 13:36:39 UTC
(In reply to Karolin Seeger from comment #32)

It's not. I just updated my PR and I still needed the patch with 4.11.4:
https://github.com/openwrt/packages/pull/10781/files#diff-b4dbcf9880d43a6ea4957114a0eacfa1
Comment 34 Neil MacLeod 2019-12-17 16:21:01 UTC
(In reply to andieq from comment #33)

Isn't that patch for issue https://bugzilla.samba.org/show_bug.cgi?id=14164 (still required by 4.11.4)?

4.11.4 fixes:

https://bugzilla.samba.org/show_bug.cgi?id=13846 and
https://bugzilla.samba.org/show_bug.cgi?id=13856 (this issue)

for me.
Comment 35 Uri Simchoni 2019-12-17 18:17:04 UTC
(In reply to andieq from comment #33)

As I see things, this bug is not about a specific issue, namely the /usr/include/heimdal that pollutes cross-build. I think it's the bug reporter's intent, and I think it's also the right way to move forward - the question to answer when considering whether a build-related bug is fixed is whether it knocks down by at least one the number of patches the builder has to apply.

In the case of OpenWRT, I don't see this patch (removal of explicit /usr/include/heimdal) applied at all before or after your openwrt pull request so I don't see how this bug applies to openwrt.

I am aware there are other issues and I think all of them are tracked by Samba bugs (The other Heimdal-specific issue is tracked by bug #14164).

Thanks,
Uri.
Comment 36 Uri Simchoni 2019-12-17 18:18:28 UTC
(In reply to Uri Simchoni from comment #35)
Arrrrrggghhh, meant to say this bug *is* about a specific issue...
Comment 37 Uri Simchoni 2019-12-19 09:22:14 UTC
Closing this bug report after:
- reporter acks it is fixed
- answering the concern in comment #33 that it's not fixed for OpenWRT (the OpenWRT issues are tracked by other Samba bugs)