Hi team,

I have a number of Samba servers that run "net ads testjoin -P" on a schedule. This net process sometimes gets stuck in an endless loop, using 100% of the CPU. This only started to happen after an upgrade to Samba 3.6.13 or later, but I cannot tell exactly when. It is happening now with 3.6.22.

Here is a stack trace of a running process from gdb:

(gdb) bt
#0  0x00007fff8e1ff771 in gettimeofday ()
#1  0x00007f4589cf532b in ?? () from /usr/lib/x86_64-linux-gnu/libkrb5.so.3
#2  0x00007f4589cfa7f2 in ?? () from /usr/lib/x86_64-linux-gnu/libkrb5.so.3
#3  0x00007f4589cfb094 in ?? () from /usr/lib/x86_64-linux-gnu/libkrb5.so.3
#4  0x00007f4589cfb50c in krb5_sendto_kdc () from /usr/lib/x86_64-linux-gnu/libkrb5.so.3
#5  0x00007f4589cd1565 in krb5_tkt_creds_get () from /usr/lib/x86_64-linux-gnu/libkrb5.so.3
#6  0x00007f4589cd16cd in krb5_get_credentials () from /usr/lib/x86_64-linux-gnu/libkrb5.so.3
#7  0x00007f458b20e595 in smb_krb5_get_credentials (context=0x7f458ca623d0, ccache=<optimized out>, me=<optimized out>, server=<optimized out>, impersonate_princ=<optimized out>, out_creds=0x7fff8e06b048) at libsmb/clikrb5.c:2020
#8  0x00007f458b20eb1d in ads_krb5_mk_req (impersonate_princ_s=0x0, expire_time=0x7f458ca36ac8, outbuf=0x7fff8e06b000, ccache=0x7f458ca6e330, principal=0x7f458ca62360 "ldap/dc-3.ad.example.com@AD.EXAMPLE.COM", ap_req_options=1, auth_context=0x7fff8e06b030, context=0x7f458ca623d0) at libsmb/clikrb5.c:704
#9  cli_krb5_get_ticket (mem_ctx=0x7f458ca31110, principal=0x7f458ca62360 "ldap/dc-3.ad.example@AD.EXAMPLE.COM", time_offset=<optimized out>, ticket=0x7fff8e06b0e0, session_key_krb5=0x7fff8e06b170, extra_ap_opts=0, ccname=0x7f458ca63060 "MEMORY:net_ads", tgs_expire=0x7f458ca36ac8, impersonate_princ_s=0x0) at libsmb/clikrb5.c:904
#10 0x00007f458b210084 in spnego_gen_krb5_negTokenInit (ctx=0x7f458ca31110, principal=<optimized out>, time_offset=<optimized out>, targ=0x7fff8e06b150, session_key_krb5=<optimized out>, extra_ap_opts=<optimized out>, expire_time=0x7f458ca36ac8) at libsmb/clispnego.c:311
#11 0x00007f458b52c2c1 in ads_sasl_spnego_rawkrb5_bind (ads=0x7f458ca36a70, principal=<optimized out>) at libads/sasl.c:783
#12 0x00007f458b52ce3f in ads_sasl_spnego_krb5_bind (p=0x7fff8e06b2a0, ads=0x7f458ca36a70) at libads/sasl.c:823
#13 ads_sasl_spnego_krb5_bind (p=0x7fff8e06b2a0, ads=0x7f458ca36a70) at libads/sasl.c:1233
#14 ads_sasl_spnego_bind (ads=0x7f458ca36a70) at libads/sasl.c:904
#15 0x00007f458b52d50a in ads_sasl_bind (ads=0x7f458ca36a70) at libads/sasl.c:1213
#16 0x00007f458b5282c0 in ads_connect (ads=0x7f458ca36a70) at libads/ldap.c:730
#17 0x00007f458b0f7ff3 in ads_startup_int (c=0x7f458ca31170, only_own_domain=true, auth_flags=0, ads_ret=0x7fff8e06b7d8) at utils/net_ads.c:292
#18 0x00007f458b0f8ca2 in ads_startup (c=<optimized out>, only_own_domain=<optimized out>, ads=<optimized out>) at utils/net_ads.c:339
#19 0x00007f458b0faa5f in net_ads_join_ok (c=0x7f458ca31170) at utils/net_ads.c:1056
#20 0x00007f458b0fab0a in net_ads_testjoin (c=0x7f458ca31170, argc=<optimized out>, argv=<optimized out>) at utils/net_ads.c:1083
#21 0x00007f458b0fd114 in net_ads (c=<optimized out>, argc=<optimized out>, argv=<optimized out>) at utils/net_ads.c:2812
#22 0x00007f458b0f636b in main (argc=<optimized out>, argv=<optimized out>) at utils/net.c:939
(gdb) detach

A core dump of the process is attached as well.
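Until the root cause is found, one way to keep a scheduled check from pinning a CPU indefinitely is to run it under a hard timeout. The following is only a minimal sketch of that idea in Python, not something taken from this report: the helper names and the 60-second limit are assumptions chosen for illustration, and only the "net ads testjoin -P" command itself comes from the description above.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it exceeded timeout_s.

    subprocess.run() kills the child and waits for it before raising
    TimeoutExpired, so a spinning process does not survive this call.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


def check_join(timeout_s=60):
    """Hypothetical wrapper for the scheduled job; 60 s is an assumed limit."""
    rc = run_with_timeout(["net", "ads", "testjoin", "-P"], timeout_s)
    if rc is None:
        return "hung"      # the process was killed after the timeout
    return "ok" if rc == 0 else "failed"
```

A scheduler could call check_join() and log or alert on "hung", which also gives a record of how often the loop occurs.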
Created attachment 9649 net-samba3.core
Hi Alex, thanks for the report. Which version of Kerberos are you using? We previously hit a bug in MIT krb5-1.6.3, where the krb5int_sendto() code path dropped into an endless unhandled event loop during intermittent network outages. If this is the same problem, it should be observable via an strace of the stuck net process.
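To make such a trace easier to read, the captured output can be reduced to a per-syscall count; a tight loop usually shows up as one or two calls dominating the total. This is a rough sketch, assuming the trace was saved to a text file first (for example with strace -p PID -o FILE). The helper name and the regular expression are illustrative and only handle the common "name(args) = result" line shape.

```python
import re
from collections import Counter

# Matches an optional "[pid N]" or bare pid prefix, then the syscall name.
_SYSCALL = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(?:\d+\s+)?([a-z_][a-z0-9_]*)\(")


def count_syscalls(lines):
    """Count syscall names in strace output lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = _SYSCALL.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def top_syscalls(path, n=5):
    """Return the n most frequent syscalls found in a saved strace file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_syscalls(handle).most_common(n)
```

Lines that do not look like a syscall, such as signal notices, are skipped rather than counted.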
Created attachment 10596 Patch queued for SLES11 krb5-1.6.3 fixing the unhandled event loop
Oh, I forgot about this bug. I currently use the version that comes with Ubuntu 12.04; I believe it is 1.10:

ii libkrb5-3 1.10+dfsg~beta1-2ubuntu0.5

However, since the upgrade to Samba 4.1 this bug no longer occurs, so it can be resolved now.
(In reply to Alex K from comment #4) Thanks, closing...