When placed under heavy load for a long time, smbd procs will sometimes hang, consuming 100% CPU time indefinitely. This was Samba 3.0.24 linked against Heimdal 0.72 on Solaris 8 SPARC, all compiled with -m64.

When attached to using a debugger, I got

    # adb /opt/quest/sbin/smbd
    0t9270:A
    process 9270 stopped at:
    DES_rand_data+0x24c:    add %g1, 0x1, %g1
    $c
    DES_rand_data(ffffffff7fffb980,8,ffffffff7fffc0b0,ffffffff7fffc0b0,ffffffff7fffc0a8,ffffffff7fffc098) + 24c
    DES_generate_random_block(ffffffff7fffb980,0,ffffffff7e0c3640,ffffffff7df49c98,0,0) + 10
    DES_mem_rand8(0,0,ffffffff7e0b6f60,0,ffffffff7e0c3668,ffffffff7e0c3768) + 24
    DES_new_random_key(ffffffff7fffbb40,10081c118,ffffffff7f320000,ffffffff7df49c98,0,0) + 30
    krb5_generate_random_block(100824070,8,ffffffffffffffff,ffffffff7df49c98,0,100824071) + 54
    krb5_enctypes_compatible_keys(100833830,100822890,c,1008373f0,1e,ffffffff7fffbf38) + 968
    krb5_encrypt_ivec(100833830,100822890,c,1008373f0,1e,ffffffff7fffbf38) + c0
    krb5_encrypt(100833830,100822890,c,1008373f0,1e,ffffffff7fffbf38) + 44
    krb5_mk_rep(100833830,1008281f0,ffffffff7fffc0b0,ffffffff7fffc0b0,ffffffff7fffc0a8,ffffffff7fffc098) + 3ec
    ads_verify_ticket(100874940,100844500,0,ffffffff7fffda60,ffffffff7fffda58,ffffffff7fffd820) + c6c
    reply_spnego_kerberos(0,1008ad490,1008cd8f0,1038,20000,ffffffff7fffdc00) + 320
    reply_spnego_negotiate(0,1008ad490,1008cd8f0,64,1038,20000) + 454
    reply_sesssetup_and_X_spnego(0,1008ad490,1008cd8f0,1038,20000,ffffffff7fffed69) + b3c
    reply_sesssetup_and_X(0,1008ad490,1008cd8f0,1038,20000,10081dd88) + 454
    switch_message(73,1008ad490,1008cd8f0,1038,20000,ffffffff7fffd070) + c60
    construct_reply(1008ad490,1008cd8f0,1038,20000,ffffffff7fffd050,ffffffff7fffd070) + e0
    process_smb(1008ad490,1008cd8f0,1008cd8f0,20441,0,100808470) + 508
    smbd_process(bba,1000926d8,0,1,3,1) + 2c4
    main(2,ffffffff7ffff518,ffffffff7ffff530,10081e5c0,100000000,0) + 1564

This turns out to be stuck in Heimdal's rnd_keys.c:503, which is waiting for a SIGALRM to stop a fast-increment loop in a pseudo-random number generator. (Presumably getting entropy from the scheduler.)

The workaround was to install /dev/random on the host, as it looks for and uses a kernel random source if there is one. (Specifically, installed Solaris OS patch 112438.)

Because the Heimdal code uses setitimer() and libads uses alarm(), my guess is that somehow the SIGALRM is just getting lost. I'm not sure. Not easy to reproduce because of seemingly random nature.

Here is the (redacted) smb.conf

    [global]
        workgroup = AAAAAAAAAA
        server string = BBBBBBBBBBBBBBBBBBBBBB
        log file = /var/opt/quest/log/samba/%m.log
        log level = 1
        max log size = 1000
        security = ads
        realm = CCCCCCCCCCC.COM
        use spnego = yes
        use kerberos keytab = yes
        machine password timeout = 0
        ; max smbd processes = 1000
        encrypt passwords = yes
        domain logons = no
        domain master = no
        preferred master = no
        local master = no
        password server = DDDDDDDDDDDDDDDDDDDDDDD.COM
        username map = /etc/opt/quest/samba/user.map
        socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192

    [hhhhhhhh]
        comment = EEEEEEEEEEEEE
        browseable = yes
        valid users = +ffffffff +ggggggg
        writable = yes
        public = no
        ; Rational recommends the following parameters
        case sensitive = no
        preserve case = yes
        oplocks = no
        level2 oplocks = no
        force create mode = 0664
        force directory mode = 0775
        path = /hhhhhhhh
Please check out a recent and supported Samba version with the included Heimdal version. If you see issues with that, please file a new bug report.