Bug 13856 - Samba 4.10.0 cross-compile issue when compiling Heimdal
Status: NEW
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Build
Version: 4.10.0
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assigned To: Samba QA Contact
QA Contact: Samba QA Contact
Reported: 2019-03-22 12:57 UTC by Neil MacLeod
Modified: 2019-10-22 00:17 UTC (History)
Attachments
4.9.5 configuration (5.92 KB, text/plain) - 2019-04-03 00:29 UTC, Neil MacLeod
4.10.0 configuration (5.95 KB, text/plain) - 2019-04-03 00:29 UTC, Neil MacLeod
4.9.5 build log (Successful) (551.38 KB, text/plain) - 2019-04-03 00:31 UTC, Neil MacLeod
4.10.0 build log (Failure) (250.24 KB, text/plain) - 2019-04-03 00:31 UTC, Neil MacLeod
Samba 4.10.6 build failure (248.92 KB, application/octet-stream) - 2019-07-09 19:15 UTC, Neil MacLeod

Description Neil MacLeod 2019-03-22 12:57:03 UTC
We build Samba for LibreELEC, a custom Linux distribution, cross-compiling with our own gcc-8.3.0 toolchain.

We patch gcc to detect incorrect build host include usage when building for the target[1].

We are able to build Samba 4.9.5 with Heimdal 7.5.0[2] without issue (successful build log[3]) - all is good.

However, we are seeing a cross-compile problem with Samba 4.10.0[4], which now triggers a cross-compilation failure when building the "embedded" Heimdal.

Specifically, /usr/include/heimdal is being used by the Samba 4.10.0 build process which is incorrect when cross-compiling, and this did not happen with Samba 4.9.5.
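
The kind of check that catches this can be sketched in a few lines of Python. This is only an illustration of the idea, not the logic of the LibreELEC gcc patch: the function name `find_host_includes`, the list of host prefixes, and the `/opt/sysroot` path used in the usage note are assumptions made for the example.

```python
import posixpath

# Assumed host-only header locations; adjust for the actual build host.
HOST_INCLUDE_PREFIXES = ("/usr/include", "/usr/local/include")


def _include_dirs(compiler_args):
    """Yield the directory given to each -I or -isystem option."""
    args = list(compiler_args)
    i = 0
    while i < len(args):
        arg = args[i]
        for flag in ("-isystem", "-I"):
            if arg == flag:
                # Separate form: "-I", "/some/dir"
                if i + 1 < len(args):
                    yield args[i + 1]
                    i += 1
                break
            if arg.startswith(flag):
                # Joined form: "-I/some/dir"
                yield arg[len(flag):]
                break
        i += 1


def _is_under(path, prefix):
    """Return True if path equals prefix or lies below it."""
    return path == prefix or path.startswith(prefix + "/")


def find_host_includes(compiler_args, sysroot):
    """Return include directories that point at the build host, not the sysroot."""
    sysroot = posixpath.normpath(sysroot)
    offenders = []
    for directory in _include_dirs(compiler_args):
        norm = posixpath.normpath(directory)
        if not posixpath.isabs(norm):
            continue  # relative paths are treated as part of the build tree
        if _is_under(norm, sysroot):
            continue  # headers inside the target sysroot are fine
        if any(_is_under(norm, prefix) for prefix in HOST_INCLUDE_PREFIXES):
            offenders.append(directory)
    return offenders
```

Called with a compile line containing `-I/usr/include/heimdal` and a sysroot of `/opt/sysroot`, the function would report `/usr/include/heimdal` and ignore anything below `/opt/sysroot/usr/include`.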

This is the Samba 4.9.5 configuration: http://ix.io/1DXj

And this is the Samba 4.10.0 configuration: http://ix.io/1DXi (I'm including the fix for https://bugzilla.samba.org/show_bug.cgi?id=13844)

Any ideas why this cross-compile problem happens with Samba 4.10.0 but does NOT happen with Samba 4.9.5?

Note: The "link" error and truncation warnings may also be of interest - the truncation warning is also present in 4.9.5 (search for "_heim_time2generalizedtime").

1. https://github.com/LibreELEC/LibreELEC.tv/blob/master/packages/lang/gcc/patches/gcc-crosscompile-badness.patch
2. https://github.com/LibreELEC/LibreELEC.tv/blob/master/packages/devel/heimdal/package.mk
3. http://ix.io/1E26
4. http://ix.io/1E1W
Comment 1 David Disseldorp 2019-03-22 13:22:15 UTC
As discussed on IRC, the explicit include path added via source4/heimdal_build/wscript_build appears to be responsible for the failure, but I don't know why it wasn't causing issues with the 4.9 build.

I'm also a little confused as to why heimdal is even built, given the --without-winbind --without-ads --without-ad-dc configure invocation.
Comment 2 Andrew Bartlett 2019-04-03 00:12:58 UTC
(In reply to David Disseldorp from comment #1)
We still use the internal heimdal unless --with-system-mitkrb5 or --with-system-heimdalkrb5 is specified.

This specifically allowed us to upgrade our minimum kerberos requirement because we could fall back on the internal one (a bit like the way we use third_party).
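
As a rough illustration of the rule described in this comment, the selection can be written as a small decision function. This is a sketch only and not Samba's configure code: the function name `kerberos_provider` and the returned labels are hypothetical, and only the two `--with-system-*` flags come from the comment above.

```python
def kerberos_provider(configure_args):
    """Return which Kerberos implementation the given configure flags imply."""
    for arg in configure_args:
        # Accept the bare flag and, as an assumption, a "flag=value" form too.
        name = arg.split("=", 1)[0]
        if name == "--with-system-mitkrb5":
            return "system MIT"
        if name == "--with-system-heimdalkrb5":
            return "system Heimdal"
    # Neither system flag was given, so the bundled Heimdal is built.
    return "internal Heimdal"
```

With the flags mentioned in comment #1 (`--without-winbind --without-ads --without-ad-dc`) this returns "internal Heimdal", which is consistent with Heimdal still being built in the reporter's configuration.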
Comment 3 Neil MacLeod 2019-04-03 00:29:19 UTC
Created attachment 15036 [details]
4.9.5 configuration
Comment 4 Neil MacLeod 2019-04-03 00:29:51 UTC
Created attachment 15037 [details]
4.10.0 configuration
Comment 5 Neil MacLeod 2019-04-03 00:31:21 UTC
Created attachment 15038 [details]
4.9.5 build log (Successful)
Comment 6 Neil MacLeod 2019-04-03 00:31:50 UTC
Created attachment 15039 [details]
4.10.0 build log (Failure)
Comment 7 Neil MacLeod 2019-07-09 19:15:08 UTC
Created attachment 15296 [details]
Samba 4.10.6 build failure

This is still a problem when building 4.10.6.

4.9.11 builds fine.
Comment 8 andieq 2019-07-10 13:20:32 UTC
(In reply to Neil MacLeod from comment #7)

You need this patch for musl as well: https://github.com/Andy2244/openwrt-extra/blob/master/samba4/patches/006-samba-4-10-musl_rm_unistd_incl.patch

Just got 4.10.6 working for openwrt, so maybe compare our version/patches with your LibreELEC version.
Comment 9 Neil MacLeod 2019-07-10 17:21:06 UTC
Hi @andieq, many thanks for the suggestion but unfortunately it had no effect - Samba 4.10.x continues to fail as before. To be honest I wasn't expecting it to fix this issue as we build with glibc.

This is my Samba 4.10 development branch:

https://github.com/LibreELEC/LibreELEC.tv/compare/master...MilhouseVH:le10_samba-4.10.0

There's not much change from 4.9.11, but Samba 4.10.x no longer cross-compiles due to the WAF build system pulling in the host includes rather than those from the sysroot. Note that this failure is raised by a custom modification to gcc, so you may not see it in your build system:

https://github.com/LibreELEC/LibreELEC.tv/blob/master/packages/lang/gcc/patches/gcc-crosscompile-badness.patch

We are using an external Heimdal (heimdal-7.7), but that hasn't changed from 4.9.x.

I don't think additional cross-answers are required by 4.10, the cross-answers file I'm using is here:

https://github.com/MilhouseVH/LibreELEC.tv/blob/17eb8c954d1742e7b89e4bdb36e8714af7f34156/packages/network/samba/config/samba4-cache.txt

Looking at your repo it appears you didn't need to make any cross-answer modifications.
Comment 10 andieq 2019-07-11 08:15:22 UTC
(In reply to Neil MacLeod from comment #9)

Ah, I just noticed the "link" redefinition, which I also got on musl, so I had to grab the Alpine patch for it. I was assuming you guys also use musl; sorry this didn't work for you.

Yeah, no cross-answer changes. I needed the 2 waf cross-compile patches, the gnutls patch and the musl one. Other than that, "waf install --destdir" wasn't working anymore, so I now have to manually grab everything from /bin....

I also could not get the AD_DC version building, because it pulls in mixed libs (target/host) when trying to build the "python embedded interpreter". Something is different from python2 to 3....
Comment 11 Neil MacLeod 2019-10-19 14:59:32 UTC
The Heimdal patch from buildroot[1] is an effective workaround for this bug when cross-compiling, and allows Samba 4.11.1 to cross-compile successfully (at least on x86_64 -> x86_64, due to bug #14164). This is only a workaround, however, and probably not an ideal fix as presumably it may cause issues when not cross-compiling.

1. https://github.com/buildroot/buildroot/blob/8b11b96f41a6ffa76556c9bf03a863955871ee57/package/samba4/0006-heimdal_build-wscript_build-do-not-add-host-include-.patch
Comment 12 Andrew Bartlett 2019-10-19 19:26:24 UTC
(In reply to Neil MacLeod from comment #11)
That patch looks correct.

I disagree with Jelmer and the patch 257e259a2603 that introduced this.

While we may wish to build against a system Heimdal, in that case we should not be building asn1_compile.  Building against a system roken but not a full Heimdal is not a supported use case. 

Can someone submit this as a Merge Request?

Thanks!

Andrew Bartlett
Comment 13 Neil MacLeod 2019-10-19 23:58:56 UTC
Thanks Andrew.

There's another buildroot patch which might be of interest as it relates to this bug as well:

https://github.com/buildroot/buildroot/blob/593a60f7f0fa1489175700c7b2eda0666347faba/package/samba4/0005-fix_unistd_incl.patch

It fixes the "'link' redeclared as different kind of symbol" error mentioned in the first comment.

source4/heimdal/lib/asn1/asn1_err.c:47:23: error: 'link' redeclared as different kind of symbol
   47 | static struct et_list link = { 0, 0 };
      |                       ^~~~
In file included from ../../lib/replace/../replace/replace.h:172,
                 from ../../source4/heimdal_build/config.h:10,
                 from source4/heimdal/lib/asn1/asn1_err.c:1:
/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/toolchain/armv7ve-libreelec-linux-gnueabi/sysroot/usr/include/unistd.h:789:12: note: previous declaration of 'link' was here
  789 | extern int link (const char *__from, const char *__to)
      |            ^~~~

Waf: Leaving directory `/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/samba-4.11.1/bin/default'
Build failed
 -> task in 'HEIMDAL_HEIM_ASN1' failed with exit status 1:
        {task 76480912: c asn1_err.c -> asn1_err.c.60.o}


^^^ above is from 4.11.1.
Comment 14 Andrew Bartlett 2019-10-20 00:48:24 UTC
(In reply to Neil MacLeod from comment #13)
I can't see that symbol in a current master build without cross-compile or using the host heimdal for compile_et. 

Are you sure this isn't introduced by fixing bug 14164 the wrong way?
Comment 15 Neil MacLeod 2019-10-20 01:15:32 UTC
I can't really tell you where the issue is coming from, only that it's present in 4.10.x, 4.11.x and also 4.12.0 (master), and the buildroot patch I mentioned in c#13 is essential. All these branches will fail when cross-compiling without the buildroot fix.

This is not an issue in 4.9.x.

Just to be clear: I'm cross-compiling to arm on x86_64, with heimdal-7.7.0 providing the x86_64 asn1_compile and compile_et binaries.
Comment 16 Uri Simchoni 2019-10-20 05:47:21 UTC
(In reply to Neil MacLeod from comment #13)
This patch doesn't look right, because it removes unistd.h inclusion on "some" systems and not in others. It also already creates issues with master (you don't see them yet as you're building 4.11 / 4.10).

Maybe to get to the bottom of this issue, we need to understand on which toolchain it appears. The patch originates as a musl patch - are you using musl? If so, maybe the correct thing would be to open a bug on building with musl. Given enough time I'll also get there (need to try my glibc-based buildroot build without this patch, and if successful, try musl, see if the build breaks, and figure it out then).
Comment 17 Neil MacLeod 2019-10-20 05:56:38 UTC
(In reply to Uri Simchoni from comment #16)

My target toolchain is based on glibc-2.30, gcc-9.2.0, kernel 5.3.5.

Musl is not used.

LibreELEC is the distribution I'm building (for x86_64 and arm).
Comment 18 Andrew Bartlett 2019-10-20 06:00:37 UTC
(In reply to Neil MacLeod from comment #17)
I'm quite sure the 'link' issue is from using a non-Samba compile_et.

I'm sceptical that it is from Heimdal, it isn't in Heimdal 7.7.0's compile_et, but is in that old MIT one.
Comment 19 Neil MacLeod 2019-10-20 07:17:16 UTC
We're only building Heimdal, there's no MIT (unless it's installed as standard on Ubuntu 16.04 and is leaking over from the host) - it's certainly not a part of the LibreELEC build.

It looks like e2fsprogs is installing a shell script named compile_et, and we're not actually installing the compile_et binary from Heimdal - which has surprised me (sorry about that)... but we are definitely installing the asn1_compile binary from Heimdal.

I'm not sure if the e2fsprogs compile_et shell script is the source of the "'link' redefined" error, could it be...? Why wouldn't we get this with 4.9.13?

I can switch to using the COMPILE_ET / ASN1_COMPILE environment variables to specify the exact locations of the Heimdal compile_et/asn1_compile binaries (I'll call it heimdal_compile_et to avoid any confusion with e2fsprogs). I'll re-run my 4.10.x/4.11.x tests to see if that helps (won't be until later today).
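
A minimal sketch of that approach, assuming the Heimdal host tools live in a single known directory. The helper names `is_shell_script` and `heimdal_tool_env` are hypothetical, and treating any `#!` script as "not the Heimdal binary" is only a heuristic based on the observation above that the e2fsprogs compile_et is a shell script. Only the COMPILE_ET and ASN1_COMPILE variable names come from this thread.

```python
import os


def is_shell_script(path):
    """Return True if the file starts with a '#!' interpreter line."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"#!"


def heimdal_tool_env(tool_dir, base_env=None,
                     compile_et="compile_et", asn1_compile="asn1_compile"):
    """Return an environment with COMPILE_ET and ASN1_COMPILE set to explicit paths.

    Raises FileNotFoundError if a tool is missing and ValueError if a tool
    turns out to be a script rather than a compiled binary.
    """
    env = dict(os.environ if base_env is None else base_env)
    tools = (("COMPILE_ET", compile_et), ("ASN1_COMPILE", asn1_compile))
    for variable, tool in tools:
        candidate = os.path.join(tool_dir, tool)
        if not os.path.isfile(candidate):
            raise FileNotFoundError(candidate)
        if is_shell_script(candidate):
            # Heuristic: a script here is probably not the Heimdal tool.
            raise ValueError(f"{candidate} is a script, not a compiled binary")
        env[variable] = candidate
    return env
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run` when invoking the Samba configure and build steps. The `compile_et` parameter allows a renamed binary such as heimdal_compile_et.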
Comment 20 Andrew Bartlett 2019-10-20 07:29:00 UTC
(In reply to Neil MacLeod from comment #19)
So compile_et comes from com_err which is as old as the hills and it seems e2fsprogs uses it as well. 

That is likely where you get the link symbol, and the rest likely comes from the increasingly strict warnings Samba is now forcing on.
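
One way to confirm which generator produced a given error-table source is to look for the declaration quoted in the compiler error in comment #13. The sketch below is an assumption-laden heuristic: the function name `declares_static_link` is hypothetical, and the pattern is taken only from the single error line shown in this report, so other compile_et versions may emit different code.

```python
import re

# Pattern based on the line from the quoted error:
#   static struct et_list link = { 0, 0 };
_LINK_DECL = re.compile(r"^\s*static\s+struct\s+et_list\s+link\b", re.MULTILINE)


def declares_static_link(c_source):
    """Return True if the C source defines a file-scope variable named 'link'
    of type 'struct et_list', which clashes with link() from unistd.h."""
    return _LINK_DECL.search(c_source) is not None
```

Running this over the text of a generated file such as asn1_err.c in the build directory would indicate whether the file contains the declaration that conflicts with unistd.h.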
Comment 21 Neil MacLeod 2019-10-20 07:32:58 UTC
(In reply to Andrew Bartlett from comment #20)

Great, that sounds promising as I can work around that on my end by ensuring we use the Heimdal compile_et binary that I thought we were already using (by specifying the COMPILE_ET env variable to Samba etc.)

I'll hopefully have better news later today!
Comment 22 Neil MacLeod 2019-10-20 19:49:26 UTC
(In reply to Andrew Bartlett from comment #20)

Wow, so sorry about leading you all on this wild goose chase.

The link error does indeed go away if building 4.10.x/4.11.x with compile_et from Heimdal rather than compile_et from e2fsprogs. Arggghhh...! :(

Many thanks for putting up with all my nonsense in this bug, I should have spotted we were using the "wrong" compile_et much sooner.

So, now, by adding COMPILE_ET/ASN1_COMPILE and using the Heimdal binaries, I'm able to successfully build 4.10.9 and 4.11.1 (although neither installs anything - due to bug #14132).

When building 4.10.9 I'm using only these two patches:

1. heimdal cross-compile fix[1]
2. cross-answers fix[2]

And when building 4.11.1 I'm using only these three patches:

1. heimdal cross-compile fix[1]
2. waf 2.0.18 update[3]
3. ASN1 fix[4]

If it would be possible for someone to take a look at bug #14132 I might be able to run-time test what I've now built! :)

Thanks again...

1. https://github.com/buildroot/buildroot/blob/5e968678fd83c2247f1a1ddd7434a9ce3a2019aa/package/samba4/0006-heimdal_build-wscript_build-do-not-add-host-include-.patch
2. https://github.com/buildroot/buildroot/blob/5e968678fd83c2247f1a1ddd7434a9ce3a2019aa/package/samba4/0004-cross_compile-fix.patch
3. https://bugzilla.samba.org/show_bug.cgi?id=13846#c23
4. https://bugzilla.samba.org/show_bug.cgi?id=14164#c6
Comment 23 Uri Simchoni 2019-10-21 08:24:33 UTC
(In reply to Neil MacLeod from comment #22)
Thank you so much for your efforts in evaluating and testing!

The cross-answers fix has landed in master and will likely get into 4.11.next. The heimdal /usr/include fix will also likely find its way upstream and get backported; I've contacted the original author of the patch. The asn1 patch may be here to stay for a while. The fact that we're using the most recent version of waf now makes it easier to fix it, because we can use the documentation, but it does require some work.

Regarding bug #14132, I haven't looked closely at it yet, but did notice that make install seems to be working OK with buildroot, perhaps you can compare.

Thanks,
Uri.
Comment 24 Neil MacLeod 2019-10-21 20:46:55 UTC
(In reply to Uri Simchoni from comment #23)
Hi Uri 

Thanks for the update - I'll keep an eye on the merges and hopefully I can drop patches over time.

Regarding bug#14132, I've taken a quick look at Buildroot[1] (apparently unaffected by bug#14132) and OpenWrt[2] (affected by bug#14132 - their Samba maintainer opened the bug).

The obvious difference is that Buildroot appears to be using "make install" while OpenWrt (and also LibreELEC) are using "buildtools/bin/waf install".

It's possible that "make install" is working fine, but "buildtools/bin/waf install" is somehow broken, which would explain why only OpenWrt and LibreELEC are affected.

For now, I've added a (hopefully temporary) manual installation step to LibreELEC[3] which has allowed me to run-time test 4.11.1 - which has proven to be almost flawless (only bug#14166 so far!) :)

Probably best to continue the discussion in bug#14132 - if you have anything you'd like me to test please give me a shout.

1. https://github.com/buildroot/buildroot/blob/77ffd39c31aa06ce77dbe71420db626b5c2da2fd/package/samba4/samba4.mk#L145-L151
2. https://github.com/openwrt/packages/blob/299e5b0a9bce19d6e96cb9ff217028b36ee2dd36/net/samba4/Makefile#L382-L390
3. https://github.com/LibreELEC/LibreELEC.tv/commit/dca13bb5d869cbcca9b41231255d41b409c283bd
Comment 25 Neil MacLeod 2019-10-22 00:17:02 UTC
(In reply to Uri Simchoni from comment #23)

bug#14133 may also be relevant, as I see something similar, and it explains why OpenWrt doesn't perform a "buildtools/bin/waf build" step for target.