
Using Monit to keep your services online

Tuesday, December 1, 2009

Thesh17 has been running on a Linode VPS for a month or so now. It works great until it gets railed and Apache dies. Then, a few hours later, I notice the site’s down and log in to restart it. Originally I wrote a shell script that would just check whether httpd was running. If it wasn’t, it’d start it. It’d also check the output of ‘free -m’, and if memory usage (not counting buffers) was greater than 90%, restart Apache.
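For illustration, a watchdog along those lines might look roughly like the sketch below. This is not the original script, just a reconstruction from the description above; the function names, the 90% threshold variable, and the --run switch are all assumptions. It handles both the older ‘free -m’ layout (with a "-/+ buffers/cache:" line) and the newer one, where the "used" column already excludes buffers and cache.

```shell
#!/bin/sh
# Hypothetical reconstruction of a simple httpd watchdog, not the original script.

THRESHOLD=90   # assumed threshold: percent of memory in use, excluding buffers/cache

# Read `free -m` output on stdin and print the integer percentage of memory
# in use, not counting buffers/cache.
mem_used_percent() {
    awk '
        /^Mem:/           { total = $2; used = $3 }          # newer free: "used" excludes cache
        /buffers\/cache:/ { used = $3; total = $3 + $4 }     # older free: prefer the -/+ line
        END               { if (total > 0) printf "%d\n", used * 100 / total }
    '
}

main() {
    # Start httpd if no process with that exact name exists.
    if ! pgrep -x httpd >/dev/null; then
        /sbin/service httpd start
        return
    fi
    # Otherwise restart it when memory usage is above the threshold.
    pct=$(free -m | mem_used_percent)
    if [ "${pct:-0}" -gt "$THRESHOLD" ]; then
        /sbin/service httpd restart
    fi
}

# Only act when asked to, e.g. from cron: watchdog.sh --run
if [ "${1:-}" = "--run" ]; then
    main
fi
```

Note that this only looks at the process list and memory, which is exactly the weakness described next.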

You’d think that’d be good, right? Wrong. I don’t know why, but occasionally things would get confused. The script would see httpd processes running, but nothing was serving pages on port 80. ‘service httpd status’ would show it was stopped, but my script didn’t check HTTP connectivity or the actual pid file, so it never noticed. I honestly didn’t troubleshoot this much; I just manually killed the processes and restarted httpd.
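The two checks my script was missing could be sketched like this. The helper names pid_alive and http_ok are made up for the example, and it assumes curl is installed; treat it as a rough idea rather than a tested fix.

```shell
#!/bin/sh
# Hypothetical helpers: check the pid file and real HTTP connectivity.

# Succeeds only if the pid file exists, holds a number, and that process is alive.
# Note: kill -0 also fails for a live process owned by another user unless run as root.
pid_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or not purely numeric
    esac
    kill -0 "$pid" 2>/dev/null
}

# Succeeds only if the URL answers with a non-error HTTP status within 10 seconds.
http_ok() {
    curl --silent --fail --output /dev/null --max-time 10 "$1"
}
```

A watchdog could then restart httpd when something like ‘pid_alive /var/run/httpd.pid && http_ok http://localhost/’ fails, instead of trusting the process list alone.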

I decided to search for some scripts similar to mine (and better), but ended up landing on a utility called Monit instead. It’s basically a service monitor you can run (locally or remotely), and depending on the conditions you define, it will restart a service and make sure it’s running. The condition can be as simple as the pid in the pid file not existing, or more complex, like the process using 90% of the CPU for the last 30 minutes.

As an example, here’s what my apache monit config looks like right now:

check process httpd with pidfile /var/run/httpd.pid
  group apache
  start program = "/sbin/service httpd start"
  # Monit runs programs directly, not through a shell, so chaining with ";" needs an explicit sh -c
  stop program = "/bin/sh -c '/sbin/service httpd stop; pkill -9 httpd'"
  # Restart if Apache stops answering HTTP on port 80
  if failed host port 80 protocol http then restart
  if cpu is greater than 60% for 2 cycles then alert
  if cpu > 80% for 5 cycles then restart
  if children > 150 then restart
  if loadavg(5min) greater than 10 for 8 cycles then alert
  if memory > 90% for 2 cycles then restart
  # Give up (stop monitoring) if it keeps failing right after restarts
  if 5 restarts within 5 cycles then timeout
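The "cycles" in that config are Monit poll cycles, and the length of a cycle is whatever the ‘set daemon’ line in the main Monit config says, in seconds. As a quick bit of arithmetic, here is a small sketch for turning "for the last N minutes" into a cycle count. The function name is made up, and the 120 second interval is only an example value, not necessarily what I run.

```shell
#!/bin/sh
# Hypothetical helper: how many Monit cycles cover a given number of minutes?
# Usage: cycles_for_minutes MINUTES POLL_INTERVAL_SECONDS
cycles_for_minutes() {
    minutes=$1
    interval=$2
    # Integer division, rounded up so the window is never shorter than requested.
    echo $(( (minutes * 60 + interval - 1) / interval ))
}

cycles_for_minutes 30 120   # 30 minutes at a 120 second poll interval
```

With a 120 second interval that prints 15, so "for the last 30 minutes" would be written as "for 15 cycles".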

Monit is pretty simple to install. It’s available in most distros’ repositories. On CentOS:

root@server:# yum install monit

On Gentoo:

root@server:# echo "=app-admin/monit-5.0.3" >> /etc/portage/package.keywords;
root@server:# emerge -av monit

I’m going to give Monit a try for a few weeks and see how it works out. I’ve been using Nagios lately as well, but it seems kind of cumbersome to me and is mainly used for monitoring and reporting. Monit will actually restart a service if it detects a problem, which *usually* is a good thing.

posted by johntash at 4:25 pm  

Copyright © 2007-2010 http://www.thesh17.com