monit is a very cool system for keeping your linux servers working – highly recommended. With a few lines of configuration, you can have it check any aspect of your system and services and when problems occur, have it alert or take remedial actions (like restarting services, cleaning up log or temporary files etc).
For example, we had the problem that we are running clustered web apps (using Terracotta) in VMware VMs. The cluster nodes were being suspended regularly for backups and this caused them to be evicted from the cluster. A simple solution was to use monit to monitor the apps (via the same health check port we were using for the availability check for the HAProxy load balancer) and to restart the services if the health checkĀ fails (as happens after the VM is unsuspended after the backup).
Here’s the line from the monitrc file we use to monitor the health check port (9000 in this case):