The Samba-Bugzilla – Bug 7654
Crash in krb5 code
Last modified: 2010-12-09 04:40:47 UTC
I have the latest samba4 git checkout at revision 44b2a79 and krb5 1.7.1 installed, but no Kerberos server configured. When I try to log in to an Exchange server through OpenChange, it crashes, as shown in the stack trace below.
The cause is krb5_appdefault_string in krb5 1.7.1: when it doesn't find the value, it blindly strdup()s the default value passed in, which in this case is NULL. Thus I believe you should never pass NULL as the default value to krb5_appdefault_string.
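The failure mode described here can be sketched in a few lines of C. The helper below is hypothetical (it is neither the MIT nor the Heimdal implementation, and its name and signature are made up for illustration); it only shows one way to guard the default value before duplicating it.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() under strict -std modes */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of any krb5 library: duplicate the
 * configured value if one was found, otherwise fall back to def_val.
 * A NULL def_val yields a NULL result instead of calling strdup(NULL),
 * which is undefined behaviour and is what ends in strlen() in frame #0. */
static char *dup_value_or_default(const char *found, const char *def_val)
{
    const char *src = (found != NULL) ? found : def_val;

    if (src == NULL) {
        return NULL; /* nothing configured and no default: report "unset" */
    }
    return strdup(src); /* caller owns the copy and must free() it */
}
```

With a guard like this, a caller that passes NULL as the default gets NULL back and has to handle the "unset" case itself, rather than crashing inside the library.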
#0 __strlen_ia32 () at ../sysdeps/i386/i586/strlen.S:99
#1 0x07882515 in __strdup (s=<value optimized out>) at strdup.c:42
#2 0x0072b7b3 in krb5_appdefault_string () from /lib/libkrb5.so.3
#3 0x02c0677a in krb5_appdefault_time (context=0xb509f020, appname=0x0, realm=0x0, option=0x2ce124c "ticket_lifetime", def_val=0, ret_val=0xb45fd3d8)
#4 0x02c22518 in krb5_get_init_creds_opt_set_default_flags (context=0xb509f020, appname=0x0, realm=0x0, opt=0xb509b480)
#5 0x02b7e846 in kerberos_kinit_password_cc (ctx=0xb509f020, cc=0xb509b470, principal=0xb50a2a98, password=0xb509d020 "xxx",
impersonate_principal=0x0, target_service=0x0, expire_time=0x0, kdc_time=0xb45fd4c8) at ../auth/kerberos/kerberos.c:111
#6 0x02b877ec in kinit_to_ccache (parent_ctx=0xb509b268, credentials=0xb509b268, smb_krb5_context=0xb509e668, ccache=0xb509b470, obtained=0xb45fd538,
error_string=0xb45fd60c) at ../auth/kerberos/kerberos_util.c:228
#7 0x02b8626b in cli_credentials_get_named_ccache (cred=0xb509b268, event_ctx=0xb509b368, lp_ctx=0xb5009918, ccache_name=0x0, ccc=0xb45fd5b4,
error_string=0xb45fd60c) at ../auth/credentials/credentials_krb5.c:301
#8 0x02b86320 in cli_credentials_get_ccache (cred=0xb509b268, event_ctx=0xb509b368, lp_ctx=0xb5009918, ccc=0xb45fd5b4, error_string=0xb45fd60c)
#9 0x02b86563 in cli_credentials_get_client_gss_creds (cred=0xb509b268, event_ctx=0xb509b368, lp_ctx=0xb5009918, _gcc=0xb45fd610, error_string=0xb45fd60c)
#10 0x02b89b6d in gensec_gssapi_client_start (gensec_security=0xb50042c0) at ../auth/gensec/gensec_gssapi.c:379
#11 0x02bbf5a6 in gensec_start_mech (gensec_security=0xb50042c0) at ../auth/gensec/gensec.c:645
#12 0x02bbf8d6 in gensec_start_mech_by_ops (gensec_security=0xb50042c0, ops=0x2ced560) at ../auth/gensec/gensec.c:732
#13 0x02aeec0b in gensec_spnego_create_negTokenInit (gensec_security=0xb50b8dd0, spnego_state=0xb50b8d48, out_mem_ctx=0xb5005618, in=..., out=0xb500561c)
#14 0x02aef5b3 in gensec_spnego_update (gensec_security=0xb50b8dd0, out_mem_ctx=0xb5005618, in=..., out=0xb500561c) at ../auth/gensec/spnego.c:804
#15 0x02bc01e1 in gensec_update (gensec_security=0xb50b8dd0, out_mem_ctx=0xb5005618, in=..., out=0xb500561c) at ../auth/gensec/gensec.c:988
#16 0x0175c300 in dcerpc_bind_auth_send (mem_ctx=0xb500b850, p=0xb509d780, table=0x5e63880, credentials=0xb509b268, gensec_settings=0xb50a2d00,
auth_type=9 '\t', auth_level=2 '\002', service=0x5e40081 "host") at ../librpc/rpc/dcerpc_auth.c:325
#17 0x0175e5ed in dcerpc_pipe_auth_send (p=0xb509d780, binding=0xb509ba68, table=0x5e63880, credentials=0xb509b268, lp_ctx=0xb5009918)
#18 0x01763a3e in continue_pipe_connect (c=0xb5099ce0, s=0xb509dac0) at ../librpc/rpc/dcerpc_connect.c:684
#19 0x0176386a in continue_pipe_connect_ncacn_ip_tcp (ctx=0xb50a2a10) at ../librpc/rpc/dcerpc_connect.c:632
#20 0x02a7f40c in composite_done (ctx=0xb50a2a10) at ../libcli/composite/composite.c:144
#21 0x01762dbc in continue_pipe_open_ncacn_ip_tcp (ctx=0xb509e4b0) at ../librpc/rpc/dcerpc_connect.c:297
#22 0x02a7f40c in composite_done (ctx=0xb509e4b0) at ../libcli/composite/composite.c:144
#23 0x01761e29 in continue_ipv4_open_socket (ctx=0xb50042c0) at ../librpc/rpc/dcerpc_sock.c:452
#24 0x02a7f40c in composite_done (ctx=0xb50042c0) at ../libcli/composite/composite.c:144
#25 0x0176192f in continue_socket_connect (ctx=0xb500b850) at ../librpc/rpc/dcerpc_sock.c:302
#26 0x02a7f40c in composite_done (ctx=0xb500b850) at ../libcli/composite/composite.c:144
#27 0x02b2d357 in socket_connect_handler (ev=0xb509b368, fde=0xb500b598, flags=2, private_data=0xb500b850) at ../lib/socket/connect.c:131
#28 0x0506c3b6 in epoll_event_loop (std_ev=0xb509da38, tvalp=0xb45fdd94) at ../tevent_standard.c:309
#29 0x0506ca4d in std_event_loop_once (ev=0xb509b368, location=0x2c832a4 "../libcli/composite/composite.c:59") at ../tevent_standard.c:544
#30 0x05068ed1 in _tevent_loop_once (ev=0xb509b368, location=0x2c832a4 "../libcli/composite/composite.c:59") at ../tevent.c:494
#31 0x02a7f14b in composite_wait (c=0xb509db90) at ../libcli/composite/composite.c:59
#32 0x017641dd in dcerpc_pipe_connect_recv (c=0xb509db90, mem_ctx=0xb50075f8, pp=0xb45fdf3c) at ../librpc/rpc/dcerpc_connect.c:918
#33 0x017642bf in dcerpc_pipe_connect (parent_ctx=0xb50075f8, pp=0xb45fdf3c, binding=0xb509df48 "ncacn_ip_tcp:xxx.com",
table=0x5e63880, credentials=0xb509b268, ev=0xb509b368, lp_ctx=0xb5009918) at ../librpc/rpc/dcerpc_connect.c:943
#34 0x05d5f5c3 in provider_rpc_connection (parent_ctx=<value optimized out>, p=<value optimized out>, binding=<value optimized out>,
credentials=<value optimized out>, table=<value optimized out>, lp_ctx=<value optimized out>) at libmapi/IMSProvider.c:58
#35 0x05d5f8e8 in RfrGetNewDSA (session=<value optimized out>, server=<value optimized out>, userDN=<value optimized out>) at libmapi/IMSProvider.c:150
#36 0x05d5fa71 in Logon (session=<value optimized out>, provider=<value optimized out>, provider_id=<value optimized out>) at libmapi/IMSProvider.c:275
#37 0x05d634b0 in MapiLogonProvider (session=<value optimized out>, profname=<value optimized out>, password=<value optimized out>,
provider=<value optimized out>) at libmapi/cdo_mapi.c:176
#38 0x05d63762 in MapiLogonEx (session=<value optimized out>, profname=<value optimized out>, password=<value optimized out>) at libmapi/cdo_mapi.c:67
Is the problem related to s4's usage of Heimdal, or is it an internal problem in Heimdal itself? Did you find this out?
Mathias, the backtrace clearly shows that Heimdal code ends up calling into MIT code ... (krb5_appdefault_string () from /lib/libkrb5.so.3)
Simo: Well, but is this then necessarily our issue?
(In reply to comment #3)
> Simo: Well, but is this then necessarily our issue?
I think it is, because it doesn't allow us to use MIT in Evolution.
Perhaps samba could have a configure option for selecting MIT or heimdal for internal usage.
(In reply to comment #4)
> Perhaps samba could have a configure option for selecting MIT or heimdal for
> internal usage.
I believe that might not be enough. If you want to use your own internal krb5 library (and there is no reason why not), then it should either be linked statically (to avoid dynamic symbol lookup errors like this one) or its functions should be prefixed somehow, so that they do not clash with functions from other dynamically linked libraries.
The thing is that samba4 is a library, and as such it can be linked with other libraries; if everything is linked dynamically, there is always a chance that symbols will overlap between libraries. This can differ between applications: some part (like krb5) may work without issue in application A but not in application B, because A doesn't link to MIT krb5 but B does.
Unfortunately the right fix might be in gcc or the dynamic symbol lookup algorithm, but I understand this is a sort-of known problem that nobody is trying to fix completely. So one is left with properly prefixing public function names to avoid ("random") conflicts between libraries.
Just my opinion.
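The prefixing idea can be sketched with a preprocessor mapping. This is only an illustration: the `demo_internal_` prefix and the one-argument signature are assumptions made for the example, not Samba's or Heimdal's actual naming or API.

```c
#include <stddef.h>

/* Hypothetical mapping: every source file compiled with this definition in
 * effect defines and calls demo_internal_krb5_appdefault_string, so the
 * exported symbol can no longer collide with krb5_appdefault_string from a
 * system libkrb5. */
#define krb5_appdefault_string demo_internal_krb5_appdefault_string

/* Simplified stand-in; the real function has a different, longer signature.
 * Because of the macro above, this actually defines the prefixed symbol. */
const char *krb5_appdefault_string(const char *def_val)
{
    return (def_val != NULL) ? def_val : "";
}
```

The cost is that such a mapping is needed for every public function of the bundled library, and every consumer has to be compiled with the same header.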
krb5_appdefault_time() calls krb5_appdefault_string() which is defined right above it, in the same source file, and yet the linker prefers another library?
That's lame. But I still think the samba ppl should be concerned.
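For context, on ELF platforms a call between two exported functions of the same shared library is normally still resolved by the dynamic linker, so an earlier-loaded library with the same symbol name can win. One common way to keep such a call inside the library is hidden visibility, sketched below. This is a GCC/Clang-specific technique with made-up `demo_` names; it is not what Samba or Heimdal did here.

```c
#include <stddef.h>

/* Hidden visibility: the symbol is not exported from the shared object,
 * so calls to it are bound inside the library at link time and cannot be
 * redirected to a same-named function in another loaded library. */
__attribute__((visibility("hidden")))
const char *demo_appdefault_string_impl(const char *def_val)
{
    return (def_val != NULL) ? def_val : "0";
}

/* Exported entry point with default visibility. Other libraries may
 * interpose this name, but that no longer affects internal callers. */
const char *demo_appdefault_string(const char *def_val)
{
    return demo_appdefault_string_impl(def_val);
}

/* Internal caller: goes straight to the hidden implementation, so it
 * always reaches this library's own code. */
const char *demo_appdefault_time(const char *def_val)
{
    return demo_appdefault_string_impl(def_val);
}
```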
And somehow this didn't happen in earlier releases, such as alpha10 or alpha11
which notably I built with autotools.
So how can we work around the problem now? Can we link Heimdal statically into samba4 as it is, so that it doesn't call MIT code? This waf build system is so alien to me, I wouldn't know where to begin.
Abartlet, could you comment on this?
This looks like a serious issue. We increasingly use shared libraries internally for the bulk of our code, but how can we ensure we bind more tightly to our own functions than the functions in the system libkrb5, without macros redefining every function?
Anyway, I'll ask tridge (our build system maintainer) if he has any clues.
In the meantime, if you could try again with the latest GIT tree, it will help us know if any of the reworking of the build system done recently has helped.
Could you please retry in order to know if the situation has improved?
I tried again with git from 2010-12-03 and although I see changes like some
shared libs created in $prefix/lib/samba, $prefix/modules, the problem
Btw I also tried to build with --builtin-libraries=krb5 and it didn't make any
difference. I still get a shared libkrb5 and the same segfault. Is that
The issue is that Evolution needs to load the MAPI plugin with dlopen()'s RTLD_DEEPBIND flag.
This will then mean that OpenChange, and therefore Samba, will use its own libraries first, and never the system library.
We are also looking at what we can do to use symbol versions to help with this.
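A minimal sketch of the RTLD_DEEPBIND suggestion follows, assuming a glibc system. The function name `load_plugin_deepbind` is hypothetical, and the path is whatever the host application would normally pass to dlopen().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* RTLD_DEEPBIND is a glibc extension */
#endif
#include <dlfcn.h>
#include <stdio.h>

#ifndef RTLD_DEEPBIND
#define RTLD_DEEPBIND 0 /* not available here: falls back to a plain dlopen() */
#endif

/* Load a plugin so that symbols in the plugin and its own dependencies are
 * preferred over same-named symbols already present in the global scope. */
void *load_plugin_deepbind(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL | RTLD_DEEPBIND);

    if (handle == NULL) {
        fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
    }
    return handle;
}
```

On older glibc versions this needs `-ldl` at link time. The flag only takes effect when the library is first loaded, so it does not help for libraries the process has already opened or linked against directly.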
Created attachment 6126
Well, the problem might be that the plugin is also a library - it's actually a set of libraries. Also, I do not know how to get exact file names for libraries, but that's not the issue as the approach you suggested doesn't work, unfortunately.
I see I'm preloading libmapi.so.0, libdcerpc.so.0, libtevent.so.0, libndr.so.0, libsamba-hostconfig.so.0, libtalloc.so.0 with no error returned, but it still crashes as above. I suppose it's because these libraries are already opened, thus my own dlopen call doesn't work, though it's placed just before the first call from libmapi.
This can be reproduced with /etc/krb5.conf being the example one from krb5-libs rpm (on Fedora).
A proper fix (using symbol versions on supporting platforms) has been committed to the tree, and will be in alpha14.
I wasn't expecting that the MAPI connector would actually be built into Evolution, and so had assumed it was loaded with some plugin system at some point (and so it would have been an easy fix to change the dlopen()).
Sorry for the time this has taken, and that there isn't a simple/quick fix.
No problem. What is the estimated release date of alpha14, please? It would be good to also have bug #7519 fixed, because it breaks translations, among other things.