This guide doesn't cover setting up HAProxy or why you'd want to. If you already have it and want to monitor it properly with Icinga, here's one way you could go about it.
So here is a potential scenario:
- 2 data centers A and B
- 1 HAProxy node per data center
- Each HAProxy node points to the 2 web servers in its own data center: A1, A2 and B1, B2
- The web servers in this scenario really expose a Web Service endpoint, so a simple HTTP GET to a URL doesn't tell you much about the actual health of the system
Monitoring-wise you could settle for an external check (Pingdom or similar) of your currently active nodes. That has some drawbacks though:
- You would not be testing the passive nodes, so before a failover you have no real assurance that they work
- A failure of one node won't give you a clear indication of what is wrong
So here is a paranoid person's approach:
- I want to monitor each node all the way through, from the external IP(s), through HAProxy and into the system, to catch any glitch along the way
- I want to make an actual Web Service call to the back-end service to verify that it's working (obviously not applicable if you're testing a normal web site)
Let's get to it then…
Best Answer
First of all you'll need to enable cookie insertion in HAProxy and assign each back-end node its own unique cookie value. This is usually used for session stickiness, i.e. you want a visitor to your site to keep getting the same back-end node as long as it's still available. But it can also be used to monitor individual nodes by sending the appropriate cookie. So if they're not already present, add cookies to your HAProxy server definitions:
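A minimal sketch of what that could look like (the backend name and server addresses are placeholders; match them to your own config):

```
backend web_a
    balance roundrobin
    # Insert a SERVERID cookie naming the chosen node; 'indirect nocache'
    # keeps the cookie away from the back ends and out of shared caches.
    cookie SERVERID insert indirect nocache
    server A1 10.0.1.11:80 check cookie A1
    server A2 10.0.1.12:80 check cookie A2
```

A request carrying `Cookie: SERVERID=A2` will then be routed to node A2 (if it's up) instead of being load-balanced.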
Secondly, you'll need to figure out what makes the most sense to check; that takes some thinking and fiddling on your own, along with working out how to express it using Nagios's excellent check_http. For completeness, the example below tests a POST toward a back-end Web Service.
This can be handled with arguments to check_http (/usr/lib64/nagios/plugins/check_http on CentOS 6).
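For illustration only, a hypothetical invocation; the host, URI, payload and expected response string below are placeholders, not real values:

```
/usr/lib64/nagios/plugins/check_http \
    -H www.example.com \
    -u /service/endpoint \
    -k 'Content-Type: text/xml' \
    -P '<request><ping/></request>' \
    -r '<status>OK</status>' \
    -t 20
```

`-P` makes check_http send the given body as a POST, `-k` adds an extra request header, `-r` requires the response body to match the regex, and `-t` sets the timeout; add `-S` if the service sits behind SSL.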
Now, all of this put together should give you a nice OK output; get that working first.
Then it's time for the custom parts: enabling node selection through the cookie, and optionally passing in an IP you can use to override DNS, for example when you want to check a path through a passive data center. To do this we'll write a small shell script wrapper around check_http. It takes the host name of the back-end node as its first parameter (for convenience, use what Icinga considers the host name to be) and an optional second parameter overriding the IP of the server to check (bypassing the DNS lookup). The result is a shell script looking something like this (I suggest putting it in /usr/lib64/nagios/plugins/ and chown/chmod it like the other plugins in there):
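A hypothetical sketch of that wrapper, written as a function for clarity; PLUGIN_DIR, SITE_HOST and URI are assumptions you'd adjust to your own environment:

```shell
#!/bin/sh
# Sketch of a check_node wrapper -- SITE_HOST and URI are placeholders.
PLUGIN_DIR=/usr/lib64/nagios/plugins
SITE_HOST=www.example.com      # public host name that resolves to HAProxy
URI=/service/endpoint          # path to check on the back end

check_node() {
    node="$1"   # back-end node name as HAProxy/Icinga knows it, e.g. A1
    ip="$2"     # optional: connect to this IP instead of resolving SITE_HOST
    if [ -z "$node" ]; then
        echo "UNKNOWN: usage: check_node <backend-node> [frontend-ip]"
        return 3
    fi
    # The SERVERID cookie pins HAProxy to the requested node. -I overrides
    # the address actually connected to while -H still sends the real Host:
    # header, so DNS is bypassed but the request is routed normally.
    if [ -n "$ip" ]; then
        "$PLUGIN_DIR/check_http" -H "$SITE_HOST" -u "$URI" \
            -k "Cookie: SERVERID=$node" -I "$ip"
    else
        "$PLUGIN_DIR/check_http" -H "$SITE_HOST" -u "$URI" \
            -k "Cookie: SERVERID=$node"
    fi
}

# When saved as a standalone plugin, end the file with: check_node "$@"
```

Combine the check_http arguments here with whatever POST/response checks you worked out above; the cookie and the `-I` override are the only parts this wrapper adds.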
Note that SERVERID is the name of the cookie set in haproxy.
Once this is in place you can define your Nagios check commands along these lines:
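As a hypothetical illustration (the command and service names, and the generic-service template, are assumptions):

```
define command {
    command_name    check_node
    command_line    /usr/lib64/nagios/plugins/check_node $HOSTNAME$ $ARG1$
}

# Check node A1 by going in through data center A's external IP:
define service {
    use                     generic-service
    host_name               A1
    service_description     Web Service health via DC A
    check_command           check_node!<A external IP>
}
```

Leaving off the `!<A external IP>` argument makes the wrapper fall back to a normal DNS lookup of the site's host name.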
Where check_node is the name of the wrapper script and 'A external IP' is the IP used to reach the system in data center A.
This would have saved me a lot of time over the last few days, so I hope it sends you in the right direction too.