F5 BigIP ssh monitor

I created a pool of load-balanced ssh servers. To monitor them for availability, I needed to create a custom monitor. This is a very old F5 load balancer:

# uname -r
BIG-IP 4.5.14

The obvious way to monitor ssh servers would be with ssh itself. After tinkering with it, I didn’t like the idea. I didn’t like the interactive nature of ssh, and I didn’t want to create a custom user just for health checks. I also didn’t want to put shared keys on the load balancers themselves. These are heavily customized ssh servers: a login triggers filesystem mounting and all sorts of other auth methods. I don’t actually want to ssh to them; I just want to see whether the ssh port is open. expect, nc, and nmap were not available. I hit tab a few times and looked through the 500 or so commands that were available. I saw curl and gave that a try.

For our purposes, all we care about is that the port is open and we get a response to a request on that port:

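#!/bin/sh
# EAV monitor arguments: $1 is the node address, $2 is the port.
# Strip the IPv4-mapped IPv6 prefix from the node address.
node_ip=`echo "$1" | sed 's/::ffff://'`

# Kill any previous run of this check that is still going,
# then record our own pid.
pidfile="/var/run/`basename $0`.$node_ip..$2.pid"
if [ -f "$pidfile" ]
then
   kill -9 `cat "$pidfile"` > /dev/null 2>&1
fi
echo "$$" > "$pidfile"

# Poke the ssh port with curl and discard whatever comes back.
curl "http://${node_ip}:22" --connect-timeout 5 > /dev/null 2>&1

# Print "UP" only when curl exited cleanly.
status=$?
if [ $status -eq 0 ]
then
    echo "UP"
fi

rm -f "$pidfile"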

If the server is down completely, curl exits with 7 (failed to connect). If sshd has crashed, the port is closed and curl again exits with 7. If the port is open and sshd answers, curl exits with 0 and the script prints UP.
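To make the exit codes above a little easier to debug, here is a rough sketch of a helper that turns curl's exit status into monitor output. The name report_probe is made up for illustration, and the 28 case (curl's timeout exit code) is something I am adding; the post only talks about 0 and 7. Only "UP" goes to stdout, as in the script above, and the failure reason goes to stderr.

```shell
#!/bin/sh
# Hypothetical helper: turn a curl exit status into monitor output.
# Only "UP" is written to stdout; failure reasons go to stderr.
report_probe() {
    case "$1" in
        0)  echo "UP" ;;
        7)  echo "curl 7: could not connect (host down or port closed)" >&2 ;;
        28) echo "curl 28: timed out" >&2 ;;
        *)  echo "curl $1: unexpected failure" >&2 ;;
    esac
}
```

It would be called right after the curl line, for example report_probe $?. Where stderr ends up on the load balancer is not something the post covers, so treat the messages as a debugging aid when running the script by hand.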

One thought on “F5 BigIP ssh monitor”

  1. After running this monitor for a while, some issues have cropped up. If you have the default EAV monitor interval and timeout values set to 5 and 16 seconds respectively, there is a chance that the monitor will step on itself. The kill -9 can kill the parent script process but leave its child curl process orphaned. Orphaned curl processes build up, chew up memory, and eventually cause the entire load balancer to stop functioning.

    I would remove the kill -9 and replace it with an exit. Also, lower the curl timeout value to 3 seconds, so that the script never takes longer than 3 seconds (and change) to complete. The alternative would be to raise the interval and timeout values to 7 and 22.

    In the end I think I will just go with a custom monitor defined with the default TCP base.
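
    The changes suggested in this comment (exit instead of kill -9, and a 3-second curl timeout) might look roughly like the sketch below. The function names previous_run_active and check_ssh_port are made up for illustration. The kill -0 liveness test is my own addition, so that a stale pidfile left by a crashed run does not block every later check; the comment itself only says to exit.

```shell
#!/bin/sh
# Sketch of the revised monitor: back off when an earlier check is
# still running, rather than killing it with kill -9.

# Hypothetical helper: succeeds only if the pidfile exists and names
# a process that is still alive (kill -0 sends no signal).
previous_run_active() {
    [ -f "$1" ] || return 1
    old_pid=`cat "$1"`
    [ -n "$old_pid" ] && kill -0 "$old_pid" 2>/dev/null
}

# Hypothetical wrapper: $1 is the node address, $2 is the port, as in
# the original script.
check_ssh_port() {
    node_ip=`echo "$1" | sed 's/::ffff://'`
    script_name=`basename "$0"`
    pidfile="/var/run/$script_name.$node_ip..$2.pid"

    # Leave the earlier run alone and skip this round.
    if previous_run_active "$pidfile"
    then
        return 0
    fi
    echo "$$" > "$pidfile"

    # Cap the connect phase at 3 seconds instead of 5.
    if curl "http://${node_ip}:22" --connect-timeout 3 > /dev/null 2>&1
    then
        echo "UP"
    fi

    rm -f "$pidfile"
}
```

    As a monitor script, the file would end with check_ssh_port "$1" "$2". A skipped round prints nothing, which should be fine as long as the monitor timeout spans more than one interval, as it does with the 5 and 16 defaults.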
