Bug 10112 - Use -R linker flag on Solaris, not -rpath (waf needs better compiler AND linker recognition)
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: Build
Version: unspecified
Hardware: All All
Importance: P5 regression
Target Milestone: ---
Assignee: Björn Jacke
QA Contact: Samba QA Contact
Duplicates: 10456
Depends on: 11073

Reported: 2013-08-24 10:54 UTC by Ralph Böhme
Modified: 2015-03-13 08:30 UTC

Attachments
Use -R linker flag on Solaris (1.38 KB, patch)
2013-08-24 10:54 UTC, Ralph Böhme
bjacke: review-
User -Wl,-R instead of -R (1.36 KB, patch)
2013-09-14 15:55 UTC, Ralph Böhme
bjacke: review-
Patch that checks for correct linker flags for rpath (954 bytes, patch)
2014-12-18 17:26 UTC, Ralph Böhme
no flags
Patch for v4-2-test (part1) (2.97 KB, patch)
2014-12-23 22:28 UTC, Stefan Metzmacher
no flags
Patches for v4-2-test (part1) (4.26 KB, patch)
2014-12-24 09:41 UTC, Stefan Metzmacher
no flags

Description Ralph Böhme 2013-08-24 10:54:57 UTC
Created attachment 9160 [details]
Use -R linker flag on Solaris

I spent several days chasing a strange bug where wbinfo -i USER would fail to return the user attributes of Active Directory users coming from Win 2008r2 via idmap_ad on a Samba 4 member server.

wbinfo -u lists AD users just fine, but wbinfo -i fails to show the user account's attributes.

$ /opt/samba/bin/wbinfo -u
administrator
gast
krbtgt
aduser
$ /opt/samba/bin/wbinfo -i aduser
failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND
Could not get info for user aduser

Same setup on Fedora works:

[ralph@fedora ~]$ /opt/samba/bin/wbinfo -i aduser
aduser:*:10000:10001:AD User:/home/aduser:/bin/sh

After banging my head against several walls, when trussing winbindd I saw the following error:

1210:    4.1357  0.0002 write(2, 0xFEFFCF5C, 137)                       = 137
1210:      l d . s o . 1 :   w i n b i n d d :   f a t a l :   r e l o c a
1210:      t i o n   e r r o r :   f i l e   / o p t / s a m b a / s b i n
1210:      / w i n b i n d d :   s y m b o l   i d m a p _ f i n d _ d o m
1210:      a i n _ w i t h _ s i d :   r e f e r e n c e d   s y m b o l  
1210:      n o t   f o u n d

Huh?! idmap_find_domain_with_sid() should come from PREFIX/lib/private/libidmap.so, but instead of that library the system libidmap.so is linked in:

$ ldd /opt/samba/sbin/winbindd | grep idmap
        libidmap.so =>   /usr/lib/libidmap.so

Looking at the final link command (CC=gcc) I noticed the linker flag "-rpath" was being used. Switching that for -R fixes the link error and now wbinfo -i works too:

$ /opt/samba/bin/wbinfo -i aduser
aduser:*:10000:10001:AD User:/home/aduser:/bin/sh

Patch modifying RPATH_ST for gcc and suncc on Solaris attached and also available at:
<https://github.com/slowfranklin/samba/compare/master...rpath>
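
The mislinked library in the ldd output above can also be spotted mechanically. A minimal sketch in Python, assuming ldd prints lines in the "name => path" form shown above; the helper name libs_outside_prefix is hypothetical and not part of Samba or waf:

```python
def libs_outside_prefix(ldd_output, names, prefix):
    """Return {name: path} for libraries in `names` that were resolved outside `prefix`."""
    wrong = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue
        name, _, path = line.partition("=>")
        name = name.strip()
        # Linux ldd appends a load address such as "(0x...)"; keep only the first field.
        fields = path.split()
        if not fields:
            continue
        path = fields[0]
        if name in names and not path.startswith(prefix.rstrip("/") + "/"):
            wrong[name] = path
    return wrong


# Sample line copied from the ldd output in this report.
sample = "        libidmap.so =>   /usr/lib/libidmap.so\n"
print(libs_outside_prefix(sample, {"libidmap.so"}, "/opt/samba"))
```

Feeding it the real output of ldd /opt/samba/sbin/winbindd would flag any private library that the runtime linker picks up from a system directory instead of the install prefix.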
Comment 1 Björn Jacke 2013-08-29 07:40:55 UTC
the patch is not right imho. Have a look at our autoconf tests.

if $CC -Wl,-v /dev/null 2>&1 </dev/null | egrep '(GNU|with BFD)' 1>&5; then
  ac_cv_prog_gnu_ld=yes
else
  ac_cv_prog_gnu_ld=no
fi

We check which linker the compiler calls. GCC might use a non-GNU ld, and a non-GCC compiler might also call a native ld.
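
For readers more at home in Python than in shell, the same test might look roughly like this. It is only a sketch of the idea, not waf code: the function names are made up for illustration, and the pattern is the one from the egrep call above.

```python
import re
import subprocess

# Same pattern as the egrep in the autoconf test.
GNU_LD_RE = re.compile(r"GNU|with BFD")


def looks_like_gnu_ld(linker_output):
    """Classify linker version output the way the autoconf test does."""
    return GNU_LD_RE.search(linker_output) is not None


def compiler_uses_gnu_ld(cc="cc"):
    """Run `cc -Wl,-v /dev/null` and inspect the combined stdout/stderr."""
    try:
        proc = subprocess.run(
            [cc, "-Wl,-v", "/dev/null"],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            errors="replace",
        )
    except OSError:
        return False  # compiler not found: treat as "not GNU ld"
    return looks_like_gnu_ld(proc.stdout)
```

The exit status of the compiler is deliberately ignored, as in the shell version: linking /dev/null fails, but the linker still identifies itself in its output.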
Comment 2 Ralph Böhme 2013-08-29 08:49:09 UTC
(In reply to comment #1)
> the patch is not right imho. Have a look at our autoconf tests.

yes, that's originating from AC_PROG_LD_GNU. Afaict there's no corresponding check for the linker type in waf. Adding such a check is beyond my Python/waf capabilities.

Also, all versions of gcc actually in use on my Solaris development system use the Solaris linker:

slowfranklin@solaris:~$ which gcc
/opt/csw/bin/gcc
slowfranklin@solaris:~$ gcc --version | grep gcc
gcc (GCC) 4.8.0
slowfranklin@solaris:~$ `gcc -print-prog-name=ld` --version
ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.2324

slowfranklin@solaris:~$ /usr/ccs/bin/gcc --version | grep gcc
gcc (GCC) 4.5.2
slowfranklin@solaris:~$ `/usr/ccs/bin/gcc  -print-prog-name=ld` --version
ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.2324
slowfranklin@solaris:~$

So without the patch the build is broken on 100-x% of Solaris systems. With the patch it's broken on x%. Afaict x is <<50, so it's a win anyway. :)
Comment 3 Andrew Bartlett 2013-09-10 21:14:08 UTC
Thanks for this! I saw this when I was last working on Solaris and wondered what the issue was!
Comment 4 Ira Cooper 2013-09-13 17:18:39 UTC
I suspect this issue is more than listed at the moment.

I applied the patch and attempted a build on master, and I can't complete build, due to this issue.

./buildtools/bin/waf -v:

[3571/3706] Linking default/source4/cldap_server/libservice-cldap.so
16:38:09 runner gcc -m64 -O2 -g -L/opt/local/lib -R/opt/local/lib default/source4/cldap_server/netlogon_3.o default/source4/cldap_server/rootdse_3.o default/source4/cldap_server/cldap_server_1.o -o /root/samba/bin/default/source4/cldap_server/libservice-cldap.so -fstack-protector -lpthread -shared -L/home/pbulk/build/2013Q2-x86_64/lang/python27/work/Python-2.7.5 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3 -Wl,-R/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3 -L/opt/local/gcc47/lib -Wl,-R/opt/local/gcc47/lib -L/usr/lib/amd64 -Wl,-R/usr/lib/amd64 -L/opt/local/lib -Wl,-R/opt/local/lib -R/root/samba/bin/shared -R/root/samba/bin/shared/private -Ldefault/lib/ccan -Ldefault/lib/ntdb -Ldefault/auth -Ldefault/nsswitch -Ldefault/libcli/ldap -Ldefault/libds/common -Ldefault/libcli/smb -Ldefault/auth/gensec -Ldefault/source4/libcli -Ldefault/libcli/nbt -Ldefault/lib/addns -Ldefault/source4/lib/events -Ldefault/lib/dbwrap -Ldefault/lib/tdb_wrap -Ldefault/lib/krb5_wrap -Ldefault/source4/auth/kerberos -Ldefault/source4/auth -Ldefault/source4/libcli/wbclient -Ldefault/libcli/auth -Ldefault/nsswitch/libwbclient -Ldefault/lib/replace -Ldefault/lib/talloc -Ldefault/source4/libcli/ldap -Ldefault/source4/cluster -Ldefault/libcli/security -Ldefault/libcli/util -Ldefault/lib/tdb -Ldefault/source4/dsdb -Ldefault/lib/ldb -Ldefault/lib -Ldefault/lib/tevent -Ldefault/libcli/named_pipe_auth -Ldefault/source4/lib/messaging -Ldefault/librpc -Ldefault/auth/credentials -Ldefault/source4/auth/ntlm -Ldefault/source4/heimdal_build -Ldefault/source4/librpc -Ldefault/lib/socket -Ldefault/lib/param -Ldefault/lib/util -Ldefault/libcli/cldap -Ldefault/lib/ldb-samba -Ldefault/source4/smbd -Ldefault/source4/lib/socket -L/usr/local/lib -Wl,-Bdynamic -lnetif -lservice -lldbsamba -lprocess_model -lcli_cldap -lsamba-util -lsamba-hostconfig -linterfaces -lndr-samba4 -lgssapi-samba4 -ltevent-util -lauth4 -lsamba-credentials -lndr-samba -lMESSAGING -lnpa_tstream -ltevent -ldcerpc -lsamba-sockets 
-lldb -lsamdb-common -ltdb -lpyldb-util -lerrors -lndr -lsamba-security -lsamba-modules -lcluster -lcli-ldap -lndr-nbt -lutil_setid -ltalloc -lreplace -lserver-role -lndr-standard -lndr-krb5pac -lkrb5-samba4 -lroken-samba4 -lasn1-samba4 -lhcrypto-samba4 -lcom_err-samba4 -lwind-samba4 -lwbclient -lsamdb -lcliauth -lLIBWBCLIENT_OLD -lauth_unix_token -ldcerpc-samba4 -lauthkrb5 -lkrb5samba -ltdb-wrap -ldbwrap -lutil_tdb -levents -lasn1util -laddns -lcli-nbt -lsmbclient-raw -ldcerpc-binding -lgensec -lcli_smb_common -ldcerpc-samba -lflag_mapping -lcli-ldap-common -lheimbase-samba4 -lhx509-samba4 -lwinbind-client -lauth_sam_reply -lutil_ntdb -lsmb_transport -lntdb -lccan -lz -lresolv -lsasl2 -lm -lsocket -lnsl -ldl -lrt -lpython2.7 -lpam -liconv -lmd5
Undefined                       first referenced
 symbol                             in file
idmap_tdb_common_sids_to_unixids    default/source3/torture/test_idmap_tdb_common_154.o
idmap_tdb_common_get_new_id         default/source3/torture/test_idmap_tdb_common_154.o
idmap_tdb_common_unixid_to_sid      default/source3/torture/test_idmap_tdb_common_154.o
idmap_tdb_common_sid_to_unixid      default/source3/torture/test_idmap_tdb_common_154.o
idmap_tdb_common_unixids_to_sids    default/source3/torture/test_idmap_tdb_common_154.o
idmap_tdb_common_new_mapping        default/source3/torture/test_idmap_tdb_common_154.o
idmap_tdb_common_set_mapping        default/source3/torture/test_idmap_tdb_common_154.o
ld: fatal: symbol referencing errors. No output written to /root/samba/bin/default/source3/smbtorture3

In reading the output, we have a major issue.  /usr/lib can't come before our actual compile directories, and our prefix.  That will cause problems.

I haven't looked to see how deep this issue runs yet.  But -1 on the patch, it may be part of an overall answer, but alone it will not cut it.

Can I get uname -a off your machine Ralph?  I actually use illumos for my main platform, so if you are using "Oracle Solaris 11" I'd like to have you test whatever I come up with.

Thanks,
Comment 5 Ira Cooper 2013-09-13 17:20:16 UTC
And I meant /usr/lib/amd64, not /usr/lib here.
Comment 6 Ralph Böhme 2013-09-14 15:54:22 UTC
Ira,

thanks for looking into this!

(In reply to comment #4)
> I suspect this issue is more than listed at the moment.
> 
> I applied the patch and attempted a build on master, and I can't complete
> build, due to this issue.
> 
> ./buildtools/bin/waf -v:
> 
> [3571/3706] Linking default/source4/cldap_server/libservice-cldap.so
> ...
> Undefined                       first referenced
>  symbol                             in file
> idmap_tdb_common_sids_to_unixids   
> default/source3/torture/test_idmap_tdb_common_154.o
> ...

hm, can't reproduce with current git master:

$ uname -a
SunOS solaris 5.11 11.1 i86pc i386 i86pc
$ cat ../configure-samba
#!/bin/sh
./configure \
    --prefix=/opt/samba \
    --with-ads \
    --with-acl-support \
    --enable-selftest
$ make clean && ../configure-samba && make
$ rm bin/default/source4/cldap_server/libservice-cldap.so 
$ ./buildtools/bin/waf --target=default/source4/cldap_server/libservice-cldap.so -v
./buildtools/wafsamba/samba_utils.py:397: DeprecationWarning: the md5 module is deprecated; use hashlib instead
  import md5
Waf: Entering directory `/home/ralph/samba/bin'
    Selected embedded Heimdal build
[  43/3824] Generating smbd/build_options.c
[3594/3824] Linking default/source4/cldap_server/libservice-cldap.so
17:44:48 runner /usr/bin/gcc default/source4/cldap_server/netlogon_3.o default/source4/cldap_server/rootdse_3.o default/source4/cldap_server/cldap_server_1.o -o /home/ralph/samba/bin/default/source4/cldap_server/libservice-cldap.so -fstack-protector -lpthread -shared -Wl,-zignore -Wl,-zcombreloc -Wl,-Bdirect -L. -L/usr/gnu/lib -Wl,-R/home/ralph/samba/bin/shared -Wl,-R/home/ralph/samba/bin/shared/private -Ldefault/lib/ccan -Ldefault/lib/ntdb -Ldefault/nsswitch -Ldefault/auth -Ldefault/libcli/ldap -Ldefault/libds/common -Ldefault/source4/auth -Ldefault/source4/libcli/wbclient -Ldefault/nsswitch/libwbclient -Ldefault/lib/dbwrap -Ldefault/source4/auth/kerberos -Ldefault/lib/tdb_wrap -Ldefault/lib/krb5_wrap -Ldefault/libcli/smb -Ldefault/auth/gensec -Ldefault/source4/lib/events -Ldefault/libcli/auth -Ldefault/source4/libcli -Ldefault/libcli/nbt -Ldefault/lib/addns -Ldefault/source4/libcli/ldap -Ldefault/source4/cluster -Ldefault/libcli/util -Ldefault/libcli/security -Ldefault/lib/tdb -Ldefault/source4/dsdb -Ldefault/lib/ldb -Ldefault/lib/talloc -Ldefault/lib/replace -Ldefault/lib -Ldefault/libcli/named_pipe_auth -Ldefault/source4/auth/ntlm -Ldefault/lib/tevent -Ldefault/source4/lib/messaging -Ldefault/librpc -Ldefault/auth/credentials -Ldefault/source4/librpc -Ldefault/source4/heimdal_build -Ldefault/lib/param -Ldefault/lib/socket -Ldefault/lib/util -Ldefault/libcli/cldap -Ldefault/lib/nss_wrapper -Ldefault/lib/ldb-samba -Ldefault/lib/uid_wrapper -Ldefault/lib/socket_wrapper -Ldefault/source4/smbd -Ldefault/source4/lib/socket -L/usr/local/lib -L/usr/lib -Wl,-Bdynamic -lnetif -lservice -lsocket_wrapper -luid_wrapper -lldbsamba -lnss_wrapper -lprocess_model -lcli_cldap -lsamba-util -linterfaces -lsamba-hostconfig -lgssapi-samba4 -ltevent-util -ldcerpc -lsamba-credentials -lndr-samba -lMESSAGING -ltevent -lndr-samba4 -lauth4 -lnpa_tstream -lsamba-sockets -lreplace -lutil_setid -ltalloc -lldb -lsamdb-common -ltdb -lsamba-security -lerrors -lpyldb-util -lndr -lcluster 
-lsamba-modules -lcli-ldap -lndr-nbt -lserver-role -lkrb5-samba4 -lroken-samba4 -lcom_err-samba4 -lasn1-samba4 -lhcrypto-samba4 -lwind-samba4 -laddns -lcli-nbt -lsmbclient-raw -lcliauth -ldcerpc-binding -levents -lgensec -lcli_smb_common -ldcerpc-samba -lkrb5samba -ltdb-wrap -lauthkrb5 -ldbwrap -lutil_tdb -lasn1util -lndr-standard -lndr-krb5pac -lwbclient -lsamdb -lLIBWBCLIENT_OLD -lauth_unix_token -ldcerpc-samba4 -lflag_mapping -lcli-ldap-common -lheimbase-samba4 -lhx509-samba4 -lsmb_transport -lauth_sam_reply -lutil_ntdb -lwinbind-client -lntdb -lccan -lresolv -lm -lsocket -lnsl -ldl -lpython2.6 -lgcrypt -lgnutls -lz -lpam -lrt -lmd5
Waf: Leaving directory `/home/ralph/samba/bin'
'build' finished successfully (11.230s)
$ 

Library search paths order looks good, no -L/usr/lib/amd64.

Note that I'm using a slightly modified patch where I'm using the more correct form -Wl,-R instead of passing -R to gcc. The latter seems to work, but it's not documented anywhere.

> In reading the output, we have a major issue.  /usr/lib can't come before our
> actual compile directories, and our prefix.  That will cause problems.

Agree.

> I haven't looked to see how deep this issue runs yet.  But -1 on the patch, it
> may be part of an overall answer, but alone it will not cut it.

Afaict that's an unrelated issue. If the buildsystem somewhere sticks some library search path that ought not be there, then that's a different bug, isn't it?

> Can I get uname -a off your machine Ralph?

Please see above.

> I actually use illumos for my main platform, ...

illumos is a kernel. Which distro? OI? Illumian? I have an OI VM so I can run tests there too.

> ... so if you are using "Oracle Solaris 11" I'd like to have you test
> whatever I come up with.

Just let me know if I can help with anything.
Comment 7 Ralph Böhme 2013-09-14 15:55:53 UTC
Created attachment 9215 [details]
User -Wl,-R instead of -R
Comment 8 Ira Cooper 2013-09-15 14:54:00 UTC
(In reply to comment #6)

> Library search paths order looks good, no -L/usr/lib/amd64.

I still think something smells...  See below.

> Note that I'm using a slightly modified patch where I'm using the more correct
> form -Wl,-R instead of passing -R to gcc. The latter seems to work, but it's
> not documented anywhere.

IMHO: That's probably pedantic, but fine :).

> > In reading the output, we have a major issue.  /usr/lib can't come before our
> > actual compile directories, and our prefix.  That will cause problems.
> 
> Agree.

In a sense, I can't detangle this from your issue if a -R/usr/lib/amd64 shows up in the build line before the -R for the samba prefix... it'll look like the exact same bug.  I'm not saying your fix is right, or wrong.  I'm saying "I can't tell."

> > I haven't looked to see how deep this issue runs yet.  But -1 on the patch, it
> > may be part of an overall answer, but alone it will not cut it.
> 
> Afaict that's an unrelated issue. If the buildsystem somewhere sticks some
> library search path that ought not be there, then that's a different bug, isn't
> it?

Technically: Yes. Practically for me: No.  I can't close your bug w/o it closed.  And it has very similar symptoms.

> > Can I get uname -a off your machine Ralph?
> 
> Please see above.
> 
> > I actually use illumos for my main platform, ...
> 
> illumos is a kernel. Which distro? OI? Illumian? I have an OI VM so I can run
> tests there too.

SmartOS, a custom build.

SunOS batfs9998-icdev6 5.11 joyent_20130828T175452Z i86pc i386 i86pc Solaris

But don't put any weight in the time/date stamp.  I rebuild as needed.  (And the joyent_ is wrong, don't blame Joyent for my work.)

> > ... so if you are using "Oracle Solaris 11" I'd like to have you test
> > whatever I come up with.
> 
> Just let me know if I can help with anything.

Will do.
Comment 9 Ira Cooper 2013-09-15 15:01:30 UTC
A general comment to all readers:

I see 4 combinations of compiler/linker to use on Solaris.  I see only 1 as valid.  Oracle agrees with me for Samba.

(SunPro/GCC) - GNU ld: Do not use gnu ld.  It will end in tears.  Everyone agrees the Solaris/Illumos ld is what must be used right now.

Sun Studio/native ld: Personally, I've seen too many bugs with Studio to recommend the compiler.  I've seen enough straight up wrong code, and broken things to be comfortable.  Yes, all compilers can make mistakes.  Just some do it more often than others.

GCC/native ld: This is the combination Oracle themselves use for building Samba, despite a strong bias for their own compiler.  Our code likely trips bugs and side issues with their compiler.  Also we know our code is better tested under GCC in general.  (Though you could argue that using Studio for testing is fine.  I likely wouldn't do it in production.)

So I am not testing Studio here.  I am in GCC/native ld land.
Comment 10 Björn Jacke 2013-09-16 07:56:40 UTC
even if I repeat myself: we need a waf configure test that checks which linker is being used by $CC -Wl. The updated patch doesn't do that either. This check is also needed for other OS/compiler/linker combinations.

The absence of this check is an autoconf/waf build regression.
Comment 11 Björn Jacke 2013-09-16 07:57:07 UTC
Comment on attachment 9215 [details]
User -Wl,-R instead of -R

patch not okay
Comment 12 Björn Jacke 2014-02-20 10:45:14 UTC
*** Bug 10456 has been marked as a duplicate of this bug. ***
Comment 13 Ralph Böhme 2014-02-20 11:56:50 UTC
> "waf needs better compiler AND linker recognition"

agree, but until someone is able to implement that, why not use the proposed patch?

Wrt:
> User -Wl,-R instead of -R

Guess you're referring to this:

--- a/buildtools/wafadmin/Tools/suncc.py
+++ b/buildtools/wafadmin/Tools/suncc.py
@@ -48,6 +48,7 @@ def scc_common_flags(conf):
 	v['STATICLIB_ST']        = '-l%s'
 	v['STATICLIBPATH_ST']    = '-L%s'
 	v['CCDEFINES_ST']        = '-D%s'
+	v['RPATH_ST']            = '-R%s'
 
 	v['SONAME_ST']           = '-Wl,-h -Wl,%s'
 	v['SHLIB_MARKER']        = '-Bdynamic'

Passing -R directly is actually the correct thing to do with Studio:

$ /opt/solarisstudio12.3/bin/cc -flags | grep -- '-R'
-R<dir[:dir]>                 Build runtime search path list into executable
$
Comment 14 Björn Jacke 2014-02-20 12:47:16 UTC
a temporary hack will for sure result in the waf problem *never* being fixed. And with a half hack we might break the compile for other people with other compiler/linker combinations for whom it works currently. And not only Solaris with the Studio/GNU compiler is affected by this. We *really* need the waf compiler/linker recognition to get fixed :-). (Let's get back autobuild! ... just kidding ;)
Comment 15 Ira Cooper 2014-02-20 13:51:58 UTC
(In reply to comment #14)
> a temporary hack will for sure result in the waf problem *never* being fixed.
> And with a half hack we might break the compile for other people with other
> compiler/linker combinations for whom it works currently. And not only Solaris
> with the Studio/GNU compiler is affected by this. We *really* need the waf
> compiler/linker recognition to get fixed :-). (Let's get back autobuild! ... just
> kidding ;)

So there's 3 compiler setups one might see on Solaris/illumos.  2 are broken.

Broken:

1. SunPro.  (No, I don't care whomever's feelings I just hurt.  It is broken, broken, broken, broken!!  Not even Oracle uses it to build Samba!)
2. GCC/gnuld - Solaris/illumos should use the native linker.

Working:

3. GCC/system ld.  This is fine, and what we should support.

I'd actually be OK with calling out everything else and declaring it broken in waf at this point, to save people the pain of trying to make it work.

Consider it a mercy killing.
Comment 16 Björn Jacke 2014-02-20 14:21:59 UTC
On 2014-02-20 at 13:51 +0000 samba-bugs@samba.org sent off:
> Broken:
> 
> 1. SunPro.  (No, I don't care whomever's feelings I just hurt.  It is broken,
> broken, broken, broken!!  Not even Oracle uses it to build Samba!)

I don't agree on this being called broken by design. I've helped a lot of
people make samba run on solaris. on some systems the studio compiler was the only
thing that worked without problem.

the same native vs. gnu compiler/linker issues exist not only on solaris, also
on HP-UX, IRIX, Tru64, AIX ... if someone puts any work in fixing this, then
please have a look at the compiler/linker tests we had in autobuild and port
the correct tests to waf. Anything else is fixing one setup and breaking
another. We had the same game in the old autobuild for a while, I don't want to
play this again :-)
Comment 17 Ira Cooper 2014-02-20 14:28:07 UTC
(In reply to comment #16)
> On 2014-02-20 at 13:51 +0000 samba-bugs@samba.org sent off:
> > Broken:
> > 
> > 1. SunPro.  (No, I don't care whomever's feelings I just hurt.  It is broken,
> > broken, broken, broken!!  Not even Oracle uses it to build Samba!)
> 
> I don't agree on this being called broken by design. I've helped a lot of
> people make samba run on solaris. on some systems the studio compiler was the
> only thing that worked without problem.

It is one of those things... it'll bite you at some point... just wait for it.

I don't trust SunPro, and that's after many years of using it.  GCC+NativeLD :).

> the same native vs. gnu compiler/linker issues exist not only on solaris, also
> on HP-UX, IRIX, Tru64, AIX ... if someone puts any work in fixing this, then
> please have a look at the compiler/linker tests we had in autobuild and port
> the correct tests to waf. Anything else is fixing one setup and breaking
> another. We had the same game in the old autobuild for a while, I don't want to
> play this again :-)

That seems smart.  Note, I'm not working on illumos as much anymore.  (If at all, I think I have 0 illumos VMs at the moment.)

But I stand by the assessment of SunPro, it is just begging for trouble.

-Ira
Comment 18 Jura Sasek 2014-02-20 16:35:16 UTC
(In reply to comment #17)
> (In reply to comment #16)
<snip>
> I don't trust SunPro, and that's after many years of using it.  GCC+NativeLD
> :).
</snip>

My observations are a bit "long bearded" (tests performed on (*)), but I saw ~30% less Samba bandwidth (transferring ~1GB files) with gcc compared to Spro. On the latest T4s and T5s the results could be even worse because of the crypto instruction set. SPARC is a RISC architecture, and performance depends strongly on optimization.

Another point is that using gcc in Solaris needs approval ... i.e. Samba on i86pc/amd64 is compiled with gcc 4.2 because the ADS join was not working when compiled with Spro on Intel. I intend to re-evaluate this, but I have not gotten around to it for the last several years  :-)

<snip> 
> That seems smart.  Note, I'm not working on illumos as much anymore.  (If at
> all, I think I have 0 illumos VMs at the moment.)
</snip>

SPARC is not a priority for Sun's OpenSolaris "orphans". They do not build their own sparc based HW.

(*) - Ultra5, s10 (development release b~65), Spro8, gcc 3.2
Comment 19 Tom Schulz 2014-02-22 17:40:47 UTC
From the Samba 3.6.22 configure, the code used to test for which linker is being used is:

#! /bin/sh
if $CC -Wl,-v /dev/null 2>&1 </dev/null | egrep '(GNU|with BFD)' 1>&5; then
  ac_cv_prog_gnu_ld=yes
else
  ac_cv_prog_gnu_ld=no
fi
Comment 20 Tom Schulz 2014-02-24 15:28:49 UTC
In my last comment I left out a comment in the test procedure that might be somewhat important, so here it is again with that comment included.

From the Samba 3.6.22 configure, the code used to test for which linker is
being used is:

#! /bin/sh
  # I'd rather use --version here, but apparently some GNU ld's only accept -v.
if $CC -Wl,-v /dev/null 2>&1 </dev/null | egrep '(GNU|with BFD)' 1>&5; then
  ac_cv_prog_gnu_ld=yes
else
  ac_cv_prog_gnu_ld=no
fi
Comment 21 Tom Schulz 2014-02-28 19:28:36 UTC
After a little more research, I found this in the gnu linker man page:

 For compatibility with other ELF linkers, if the -R option is
 followed by a directory name, rather than a file name, it is
 treated as the -rpath option.

This is from the man page from a linux system.
I tested this on Solaris by temporarily removing the Solaris linker and putting
the GNU linker in its place. The -R option did set a run path as verified by
the 'ldd -s' command.

So, you do not need to test to find out which linker is in use. Just always
use -R to set the run path on Solaris.
Comment 22 Ralph Böhme 2014-12-18 10:36:37 UTC
(In reply to Björn Jacke from comment #14)
Fwiw: checking which linker is used doesn't buy us anything, because it would require putting additional explicit knowledge about which linker requires which link flags for the functionality in question.
We'd need a test that checks which linker flags actually work for passing and setting an rpath. So until we get that, we have a completely broken build on Solaris thanks to this bug.
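
A feature test of that kind could be structured as below. This is a sketch only, written as standalone Python rather than as a waf check; first_working_flag, try_link_with_cc and CANDIDATES are hypothetical names, and the candidate list is an assumption based on the two flag spellings discussed in this bug.

```python
import os
import subprocess
import tempfile

# Candidate templates, tried in order (assumed from the discussion in this bug).
CANDIDATES = ["-Wl,-rpath,%s", "-Wl,-R,%s"]


def first_working_flag(candidates, try_link):
    """Return the first template for which try_link(flag) is true, else None."""
    for template in candidates:
        if try_link(template % "."):
            return template
    return None


def try_link_with_cc(flag, cc="cc"):
    """Link a trivial program with `flag`; True if the compiler driver exits 0."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "t.c")
        with open(src, "w") as f:
            f.write("int main(void) { return 0; }\n")
        try:
            rc = subprocess.call(
                [cc, flag, src, "-o", os.path.join(tmp, "t")],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            return False  # compiler not found
        return rc == 0
```

Calling first_working_flag(CANDIDATES, try_link_with_cc) would then yield the template to use. Note that a zero exit status only shows that the flag was accepted; confirming that a run path was really recorded would additionally require inspecting the resulting binary.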
Comment 23 Ralph Böhme 2014-12-18 10:40:14 UTC
(In reply to Ira Cooper from comment #4)
> I suspect this issue is more than listed at the moment.
> 
> I applied the patch and attempted a build on master, and I can't complete build, 
> due to this issue.
> ...
> In reading the output, we have a major issue.  /usr/lib can't come before our
> actual compile directories, and our prefix.  That will cause problems.

that's another bug:
<https://bugzilla.samba.org/show_bug.cgi?id=10877>
Comment 24 Björn Jacke 2014-12-18 11:28:27 UTC
(In reply to Ralph Böhme from comment #22)
> Fwiw: checking which linker is used doesn't buy us anything, because it would require putting additional explicit knowledge about which linker requires which link flags for the functionality in question.

of course the check for the linker is (also) based on functionality checks - as in the autobuild checks, which still need to be merged to waf :-)
Comment 25 Ralph Böhme 2014-12-18 11:39:44 UTC
(In reply to Björn Jacke from comment #24)
Which autobuild checks? There's no such check in 3.6 autoconf.
Comment 26 Björn Jacke 2014-12-18 12:09:31 UTC
> > of course the check for the linker is (also) based on functionality checks - as in the autobuild checks, which still need to be merged to waf :-)
> Which autobuild checks?

mainly the checks for which linker is being used and which features it supports were here:

source3/configure.in  (search for "-Wl" for example)
lib/replace/libreplace_ld.m4

not talking about any specific check here. The whole magic to get the major things right for non-mainstream, non-gcc systems is still not there in waf.
Comment 27 Stefan Metzmacher 2014-12-18 12:40:12 UTC
(In reply to Björn Jacke from comment #26)

I think we can use one of the following two approaches:

1.)

 buildtools/wafsamba/wscript | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/buildtools/wafsamba/wscript b/buildtools/wafsamba/wscript
index 1a2cfe6..4acc79b 100755
--- a/buildtools/wafsamba/wscript
+++ b/buildtools/wafsamba/wscript
@@ -211,6 +211,8 @@ def configure(conf):
 
     conf.check_tool('compiler_cc')
 
+    conf.env['RPATH_ST'] = '-Wl,-R,%s'
+
     # we need git for 'waf dist'
     conf.find_program('git', var='GIT')
 
2.)

 buildtools/wafsamba/wscript | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/buildtools/wafsamba/wscript b/buildtools/wafsamba/wscript
index 1a2cfe6..681d71f 100755
--- a/buildtools/wafsamba/wscript
+++ b/buildtools/wafsamba/wscript
@@ -211,6 +211,9 @@ def configure(conf):
 
     conf.check_tool('compiler_cc')
 
+    if not conf.CHECK_LDFLAGS(['-Wl,-rpath,.']) and conf.CHECK_LDFLAGS(['-Wl,-R,.']):
+        conf.env['RPATH_ST'] = '-Wl,-R,%s'
+
     # we need git for 'waf dist'
     conf.find_program('git', var='GIT')
 

The gnu linker also supports -R as alias for -rpath...
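
To make the effect of either approach concrete: RPATH_ST is a printf-style template, and the sketch below shows how such a template would expand into link flags, assuming it is applied once per run-path directory. This is plain Python for illustration, not waf's own code; expand_rpath is a hypothetical helper and the directories are examples based on the /opt/samba prefix from the description.

```python
def expand_rpath(template, dirs):
    """Format each non-empty run-path directory with a template such as RPATH_ST."""
    return [template % d for d in dirs if d]


dirs = ["/opt/samba/lib", "/opt/samba/lib/private"]
print(expand_rpath("-Wl,-rpath,%s", dirs))
print(expand_rpath("-Wl,-R,%s", dirs))
```

Both approaches end up with the second form on Solaris; they differ only in whether the template is set unconditionally or after probing the linker.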
Comment 28 Jura Sasek 2014-12-18 12:43:30 UTC
Hi Ira,
You are (almost) completely right. There are at least 3 (known to me) issues:

 - this (Bug 10112)

 - problem with standalone "-" passed into the Solaris linker (Bug 10630)

 - "-R /usr/lib" (standard R-path) passed into the linker, which is caused by WAF's broken design. The standard path *must not* be part of the R-path on Solaris, because the linker already provides it so that it is *ensured* to be the "last instance" searched for the .so's.
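
The third point could be handled by filtering the run-path list before it is turned into flags. A minimal sketch, assuming a fixed list of default directories; DEFAULT_DIRS and strip_default_dirs are hypothetical names, and the real default set depends on the platform and ABI:

```python
import os.path

# Assumed default search directories; adjust for the actual platform and ABI.
DEFAULT_DIRS = ("/lib", "/usr/lib", "/usr/lib/amd64")


def strip_default_dirs(rpath_dirs, default_dirs=DEFAULT_DIRS):
    """Drop default search directories and duplicates, preserving order."""
    defaults = {os.path.normpath(d) for d in default_dirs}
    seen = set()
    kept = []
    for d in rpath_dirs:
        norm = os.path.normpath(d)
        if norm in defaults or norm in seen:
            continue
        seen.add(norm)
        kept.append(d)
    return kept
```

With this, a list such as /usr/lib/amd64, /opt/samba/lib would be reduced to /opt/samba/lib, leaving the default directories to the runtime linker.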
Comment 29 Ralph Böhme 2014-12-18 17:26:17 UTC
Created attachment 10550 [details]
Patch that checks for correct linker flags for rpath

This one seems to do the trick on Solaris 10.

Funnily enough this is a non-issue on at least Solaris 11.1, where the Solaris linker ld suddenly supports -rpath:

$ man ld | grep -B 1 -A 8 rpath
Reformatting page.  Please Wait... done
     -R path
     -rpath path

         A colon-separated list of directories  used  to  specify
         library  search  directories  to  the runtime linker. If
         present and not NULL, the path is recorded in the output
         object  file  and passed to the runtime linker. Multiple
         instances of this option are concatenated together  with
         each path separated by a colon. See Directories Searched
         by the Runtime Linker in Linker and Libraries Guide.
Comment 30 Stefan Metzmacher 2014-12-23 22:28:04 UTC
Created attachment 10563 [details]
Patch for v4-2-test (part1)
Comment 31 Stefan Metzmacher 2014-12-24 09:41:51 UTC
Created attachment 10564 [details]
Patches for v4-2-test (part1)
Comment 32 Stefan Metzmacher 2015-01-18 14:50:34 UTC
Comment on attachment 10564 [details]
Patches for v4-2-test (part1)

These patches are incomplete and produce the problems reported here:
https://bugzilla.samba.org/show_bug.cgi?id=9299#c31

I'll post additional patches for master to the mailing list.
Comment 33 Tom Schulz 2015-03-11 15:54:10 UTC
This problem is fixed in the 4.2.0 release.
Should this bug be closed?
Comment 34 Stefan Metzmacher 2015-03-13 08:30:52 UTC
Fixed in bug 11073