ctdbd monitors the status of the samba daemon by checking for sockets bound to samba ports (445 & 139 by default) in the LISTEN state. It does this by parsing the output of netstat -a -t -n.

config/functions:

    ctdb_check_tcp_ports() {
        for p ; do
            if ! netstat -a -t -n | grep -q "0\.0\.0\.0:$p .*LISTEN" ; then
                if ! netstat -a -t -n | grep -q ":::$p .*LISTEN" ; then
                    echo "ERROR: $service_name tcp port $p is not responding"
                    return 1
                fi
            fi
        done
    }

Under an intensive connect-disconnect workload, the number of sockets in the TIME_WAIT state can easily reach several thousand. As a result, the netstat command takes a long time to complete and is CPU intensive. With up to two netstat invocations for each of the two ports, a single check can cost ~5s x 4 runs = ~20s.

There are a few options to make this check more efficient. Removing unnecessary re-runs and using --listening rather than -a to request only sockets in the LISTEN state would be a step in the right direction.
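To see how much of the netstat output consists of non-listening sockets on an affected node, something like the following could be used. This is only a rough sketch: count_tcp_states is a hypothetical helper name, not part of ctdb, and it assumes the usual Linux net-tools column layout where the state is the sixth field.

```shell
# count_tcp_states: hypothetical helper, not part of ctdb.
# Reads "netstat -a -t -n" style output on stdin and prints one
# "<state> <count>" line per TCP state seen, sorted by state name.
# Assumes the net-tools layout: Proto Recv-Q Send-Q Local Foreign State.
count_tcp_states() {
    # Header lines do not start with "tcp", so they are skipped.
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print s, n[s] }' | sort
}

# Typical use on a live node:
#   netstat -a -t -n | count_tcp_states
```

A TIME_WAIT count in the thousands next to a handful of LISTEN entries would be consistent with the slowdown described above.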
Created attachment 6498: 1.0.112 based patch
This issue has been addressed in the ctdb 1.2 branch. If no further merges are required then this bug can be closed.

commit f9f28ff32c3d110b2609a277aa6f71211e3eb7b6
Author: Martin Schwenke <martin@meltin.net>
Date:   Tue Jul 5 11:32:06 2011 +1000

    Eventscript functions: optimise ctdb_check_tcp_ports() and add debug.

    ctdb_check_tcp_ports() runs "netstat -a -t -n" in a loop for each
    port.  There are 2 problems with this:

    * Netstat is run on each loop iteration when it need only be run
      once.

    * The -a option is used to list all connections but the function
      only cares about the listening ports.  There may be many thousands
      of non-listening ports to grep through.

    This changes ctdb_check_tcp_ports() to run netstat with the -l
    option instead of the -a option.  It also only runs netstat once
    before the main loop.

    When a port is found to not be listening the output of the netstat
    command is now dumped to help with debugging.
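For readers without the tree at hand, the approach the commit message describes might look roughly like the sketch below. It is not the code from the commit: the function name check_tcp_ports_sketch and the _ns and _p variables are made up for illustration, and service_name is assumed to be set by the caller as in the original function.

```shell
# check_tcp_ports_sketch: illustrative only, NOT the committed ctdb code.
# Assumes $service_name is set by the caller, as in the original function.
check_tcp_ports_sketch() {
    # Run netstat once, asking only for listening TCP sockets (-l, not -a).
    _ns=$(netstat -l -t -n 2>/dev/null)

    for _p ; do
        # Accept a wildcard listener on either IPv4 (0.0.0.0) or IPv6 (::).
        if ! printf '%s\n' "$_ns" | \
            grep -q -e "0\.0\.0\.0:$_p .*LISTEN" -e ":::$_p .*LISTEN" ; then
            echo "ERROR: $service_name tcp port $_p is not responding"
            # Dump the captured netstat output to help with debugging.
            printf '%s\n' "$_ns"
            return 1
        fi
    done
    return 0
}
```

Capturing the output once means the cost no longer grows with the number of ports checked, and -l keeps TIME_WAIT sockets out of the text that grep has to scan.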
Closing as per comment#2.