EC2 Instances – Production Monitoring Best Practices

amazon-ec2 monitoring

I'm setting up my first production instance on EC2 and want to make sure I have all necessary monitoring in place. There are three different types of things I want to monitor:

  1. Is the instance running? EC2 instances can be terminated without warning if the underlying hardware fails, and as far as I know they aren't restarted automatically. If the instance is gone, I want it started back up.

  2. Is UNIX running properly? This is the usual stuff about CPU load, disk space, etc.

  3. Is the website responding? If not, restart it.
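
For item 2, the kind of check meant here can be sketched in a few lines of Python. This is only an illustration, assuming a Unix host: the thresholds and the names `evaluate_host` and `check_host` are made up for the example and are not taken from Nagios or any other tool.

```python
import os
import shutil


def evaluate_host(load1, cpus, free_bytes, total_bytes,
                  max_load_per_cpu=1.5, min_free_fraction=0.10):
    """Return a list of problem descriptions; an empty list means healthy.

    The default thresholds are arbitrary example values (assumptions).
    """
    problems = []
    if load1 / cpus > max_load_per_cpu:
        problems.append(f"1-minute load {load1:.2f} is high for {cpus} CPU(s)")
    if free_bytes / total_bytes < min_free_fraction:
        problems.append(f"only {free_bytes / total_bytes:.0%} of disk space is free")
    return problems


def check_host(path="/"):
    """Read the live load average and disk usage, then evaluate them."""
    load1, _, _ = os.getloadavg()        # available on Unix only
    usage = shutil.disk_usage(path)      # total, used, free in bytes
    return evaluate_host(load1, os.cpu_count() or 1, usage.free, usage.total)
```

Keeping the threshold logic in `evaluate_host` separate from the code that reads the live values makes the rules easy to test without touching a real machine.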

I initially set up Nagios on a physical server outside the cloud, but it is really only helpful for item 2. It can tell me if the instance is gone or if the website is not responding, but as far as I can tell it can't execute any commands to fix the situation.

My Googling on this subject has yielded a plethora of options: Cacti, Monit, God, Ganglia, and probably more I'm forgetting now. I don't have time to research them all. I am aware of Amazon's CloudWatch, but it doesn't seem to do anything that my Nagios installation doesn't already do.

If you already have something like this in place, can you please share what has worked well for you?

Best Answer

Monit should do most of what you need. If you want something a bit more advanced but more specifically tailored to EC2, have a look at the services offered by RightScale or Scalr (an open source competitor to RightScale).
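
To make the "check, then act" idea concrete, here is a minimal Python sketch of a watchdog that probes a website and runs a restart action after several consecutive failures. It is not how Monit is implemented, only an illustration of the pattern. The names `Watchdog`, `site_is_up`, and `restart_web_server`, the URL in the usage note, and the `service apache2 restart` command are all assumptions for the example; substitute whatever your site actually runs.

```python
import http.client
import subprocess
import urllib.request


def site_is_up(url, timeout=5.0):
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # Covers connection errors, timeouts, and HTTP error responses.
        return False


def restart_web_server():
    """Hypothetical restart action; the command is an assumption."""
    subprocess.run(["sudo", "service", "apache2", "restart"], check=False)


class Watchdog:
    """Run `restart` once `probe` has failed `max_failures` times in a row."""

    def __init__(self, probe, restart, max_failures=3):
        self.probe = probe
        self.restart = restart
        self.max_failures = max_failures
        self.failures = 0

    def tick(self):
        """Perform one check; return True if a restart was triggered."""
        if self.probe():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.restart()
            self.failures = 0
            return True
        return False
```

A caller would build it with something like `Watchdog(lambda: site_is_up("http://localhost/"), restart_web_server)` and call `tick()` once a minute from a loop or a cron job. Requiring several consecutive failures avoids restarting the site over a single slow response.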
