Bug 4554 - SIGALRM lost when using Heimdal GSSAPI
Status: NEW
Product: Samba 3.0
Classification: Unclassified
Component: File Services
Version: 3.0.24
Hardware: Sparc Solaris
Importance: P3 normal
Target Milestone: none
Assigned To: Samba Bugzilla Account
QA Contact: Samba QA Contact
Reported: 2007-04-25 23:26 UTC by David Leonard (550 5.7.1 Unable to deliver)
Modified: 2007-04-25 23:26 UTC (History)
Description David Leonard (550 5.7.1 Unable to deliver) 2007-04-25 23:26:07 UTC
Under sustained heavy load, smbd processes sometimes hang, consuming 100% CPU indefinitely.

This was Samba 3.0.24 linked against Heimdal 0.72 on Solaris 8 SPARC, all compiled with -m64.

When I attached a debugger to a hung process, I got:

# adb /opt/quest/sbin/smbd
0t9270:A
process 9270 stopped at:
DES_rand_data+0x24c:            add     %g1, 0x1, %g1
$c
DES_rand_data(ffffffff7fffb980,8,ffffffff7fffc0b0,ffffffff7fffc0b0,ffffffff7fffc0a8,ffffffff7fffc098) + 24c
DES_generate_random_block(ffffffff7fffb980,0,ffffffff7e0c3640,ffffffff7df49c98,0,0) + 10
DES_mem_rand8(0,0,ffffffff7e0b6f60,0,ffffffff7e0c3668,ffffffff7e0c3768) + 24
DES_new_random_key(ffffffff7fffbb40,10081c118,ffffffff7f320000,ffffffff7df49c98,0,0) + 30
krb5_generate_random_block(100824070,8,ffffffffffffffff,ffffffff7df49c98,0,100824071) + 54
krb5_enctypes_compatible_keys(100833830,100822890,c,1008373f0,1e,ffffffff7fffbf38) + 968
krb5_encrypt_ivec(100833830,100822890,c,1008373f0,1e,ffffffff7fffbf38) + c0
krb5_encrypt(100833830,100822890,c,1008373f0,1e,ffffffff7fffbf38) + 44
krb5_mk_rep(100833830,1008281f0,ffffffff7fffc0b0,ffffffff7fffc0b0,ffffffff7fffc0a8,ffffffff7fffc098) + 3ec
ads_verify_ticket(100874940,100844500,0,ffffffff7fffda60,ffffffff7fffda58,ffffffff7fffd820) + c6c
reply_spnego_kerberos(0,1008ad490,1008cd8f0,1038,20000,ffffffff7fffdc00) + 320
reply_spnego_negotiate(0,1008ad490,1008cd8f0,64,1038,20000) + 454
reply_sesssetup_and_X_spnego(0,1008ad490,1008cd8f0,1038,20000,ffffffff7fffed69) + b3c
reply_sesssetup_and_X(0,1008ad490,1008cd8f0,1038,20000,10081dd88) + 454
switch_message(73,1008ad490,1008cd8f0,1038,20000,ffffffff7fffd070) + c60
construct_reply(1008ad490,1008cd8f0,1038,20000,ffffffff7fffd050,ffffffff7fffd070) + e0
process_smb(1008ad490,1008cd8f0,1008cd8f0,20441,0,100808470) + 508
smbd_process(bba,1000926d8,0,1,3,1) + 2c4
main(2,ffffffff7ffff518,ffffffff7ffff530,10081e5c0,100000000,0) + 1564

The process turns out to be stuck at Heimdal's rnd_keys.c:503, in a fast-increment loop inside a pseudo-random number generator that runs until a SIGALRM stops it (presumably it gathers entropy from scheduler timing).
The workaround was to install /dev/random on the host (specifically, Solaris OS patch 112438), because Heimdal looks for a kernel random source and uses it if there is one.
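
To make the mechanism concrete, here is a minimal sketch of the pattern described above. It is not Heimdal's actual code: the function names (fill_from_device, fill_from_timer, fill_random), the 10 ms timer interval, and the way counter bits are turned into bytes are all assumptions chosen for illustration. It shows why a missing SIGALRM turns the fallback path into an endless busy loop, and why the presence of /dev/random avoids that path entirely.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t tick;   /* set by the SIGALRM handler */

static void on_alarm(int sig) { (void)sig; tick = 1; }

/* Preferred path: read len bytes from a kernel device such as /dev/random. */
int fill_from_device(const char *path, unsigned char *buf, size_t len)
{
    size_t got = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;                   /* no kernel random source available */
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n < 0 && errno == EINTR)
            continue;                /* interrupted by a signal: retry */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        got += (size_t)n;
    }
    close(fd);
    return 0;
}

/* Fallback path (hypothetical): spin a counter until the interval timer
 * fires, and keep the low 8 bits of the counter as one output byte. */
int fill_from_timer(unsigned char *buf, size_t len)
{
    struct sigaction sa, old_sa;
    struct itimerval tv, old_tv;
    size_t i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) != 0)
        return -1;

    memset(&tv, 0, sizeof tv);
    tv.it_interval.tv_usec = 10000;  /* assumed 10 ms period */
    tv.it_value.tv_usec = 10000;
    if (setitimer(ITIMER_REAL, &tv, &old_tv) != 0) {
        sigaction(SIGALRM, &old_sa, NULL);
        return -1;
    }

    for (i = 0; i < len; i++) {
        unsigned long counter = 0;
        tick = 0;
        while (!tick)                /* never exits if SIGALRM is not delivered */
            counter++;
        buf[i] = (unsigned char)counter;
    }

    setitimer(ITIMER_REAL, &old_tv, NULL);   /* restore previous timer */
    sigaction(SIGALRM, &old_sa, NULL);       /* restore previous handler */
    return 0;
}

/* Use the kernel source when it exists, otherwise the timer loop. */
int fill_random(unsigned char *buf, size_t len)
{
    if (fill_from_device("/dev/random", buf, len) == 0)
        return 0;
    return fill_from_timer(buf, len);
}
```

A caller would simply do `unsigned char key[8]; fill_random(key, sizeof key);`. On a host with /dev/random the timer loop is never entered, which matches the effect of the workaround described above.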

Because the Heimdal code uses setitimer() and libads uses alarm(), my guess is that the SIGALRM is somehow getting lost, but I'm not sure. It is not easy to reproduce because it appears to happen at random.
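
One way such a signal could go missing, offered only as an illustration of the guess above and not as a confirmed diagnosis: setitimer(ITIMER_REAL) and alarm() both deliver SIGALRM, and a process has a single handler slot for that signal. If other code replaces the handler after the loop's handler was installed, the timer keeps firing but the flag the loop waits on is never set. The sketch below simulates that with two hypothetical handlers (loop_handler, other_handler) and bounds the spin at 100 ms so the demonstration terminates. Nothing here is taken from Samba or Heimdal.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

static volatile sig_atomic_t loop_flag;   /* what the spin loop waits on */
static volatile sig_atomic_t other_hits;  /* SIGALRMs taken by the other handler */

static void loop_handler(int sig)  { (void)sig; loop_flag = 1; }
static void other_handler(int sig) { (void)sig; other_hits = other_hits + 1; }

static int install(void (*fn)(int), struct sigaction *old)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fn;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, old);
}

/* Spin until loop_flag is set; give up after max_ms milliseconds.
 * Returns 1 if the flag was seen, 0 on timeout. */
int spin_until_tick(long max_ms)
{
    struct timespec start, now;

    clock_gettime(CLOCK_MONOTONIC, &start);
    while (!loop_flag) {
        long ms;
        clock_gettime(CLOCK_MONOTONIC, &now);
        ms = (now.tv_sec - start.tv_sec) * 1000L
           + (now.tv_nsec - start.tv_nsec) / 1000000L;
        if (ms >= max_ms)
            return 0;
    }
    return 1;
}

/* Returns 1 if the loop saw its tick, 0 if the signal was "lost" to the
 * other handler; *hits receives how many SIGALRMs that handler took. */
int demo_lost_alarm(int *hits)
{
    struct sigaction saved;
    struct itimerval tv, off;
    int saw;

    memset(&tv, 0, sizeof tv);
    tv.it_interval.tv_usec = 1000;        /* 1 ms period */
    tv.it_value.tv_usec = 1000;
    memset(&off, 0, sizeof off);          /* all zero: disarm the timer */

    loop_flag = 0;
    other_hits = 0;
    install(loop_handler, &saved);        /* the loop's own handler */
    install(other_handler, NULL);         /* assumed: other code replaces it */

    setitimer(ITIMER_REAL, &tv, NULL);
    saw = spin_until_tick(100);
    setitimer(ITIMER_REAL, &off, NULL);
    sigaction(SIGALRM, &saved, NULL);     /* restore the original disposition */

    *hits = (int)other_hits;
    return saw;
}
```

With the handler replaced, demo_lost_alarm() reports that the loop never saw a tick even though the timer fired repeatedly. An unbounded loop in the same situation would spin at 100% CPU, which is consistent with the symptom reported here but does not prove this is the cause.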

Here is the (redacted) smb.conf:

[global]
   workgroup = AAAAAAAAAA
   server string = BBBBBBBBBBBBBBBBBBBBBB
   log file = /var/opt/quest/log/samba/%m.log
   log level = 1
   max log size = 1000
   security = ads
   realm    = CCCCCCCCCCC.COM
   use spnego = yes
   use kerberos keytab = yes
   machine password timeout = 0
;   max smbd processes = 1000 
   encrypt passwords = yes
   domain logons = no
   domain master = no
   preferred master = no
   local master = no
   password server = DDDDDDDDDDDDDDDDDDDDDDD.COM
   username map    = /etc/opt/quest/samba/user.map
   socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192

[hhhhhhhh]
        comment         = EEEEEEEEEEEEE
        browseable      = yes
        valid users     = +ffffffff +ggggggg
        writable        = yes
        public          = no
        ; Rational recommends the following parameters
        case sensitive          = no
        preserve case           = yes
        oplocks                 = no
        level2 oplocks          = no
 
        force create mode       = 0664
        force directory mode    = 0775
        path            = /hhhhhhhh