I have placed HAProxy in front of a three-node Elasticsearch (ES) cluster. So far, the way I check each node in HAProxy is with an HTTP check. Below is a snippet of my config:
backend elastic_nodes
balance roundrobin
option forwardfor
option httpchk
http-check expect status 200
server elastic1 10.88.0.101:9200 check port 9200 fall 3 rise 3
server elastic2 10.88.0.102:9200 check port 9200 fall 3 rise 3
server elastic3 10.88.0.103:9200 check port 9200 fall 3 rise 3
So far this check works fine, but if the cluster turns red, the response code is still "200" (which is correct, since HTTP-wise the node is accessible), so HAProxy will still consider the backend server healthy.
On the other hand, if I check the status of the cluster and mark a node as down upon receiving health status "red", this marks all backend servers as down, disabling the ES service entirely. My problem with this approach is that in the past my cluster was indeed red but still usable, since only a single shard (a day's logs) was missing. In other words, there are cases where an ES red status is not a big issue and you still want to serve ES requests, instead of having HAProxy mark all backend nodes down and thus block the ES service.
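For reference, a minimal sketch of that cluster-status approach, assuming the standard /_cluster/health endpoint (which reports the overall cluster status from any node):

```
# Sketch: cluster-wide health check. Every node reports the same
# overall cluster status, so a red cluster takes ALL backends down -
# exactly the all-or-nothing behaviour described above.
option httpchk GET /_cluster/health
http-check expect ! string red
```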
Is there any other approach to this?
Best Answer
We use HAProxy to balance between two redundant clusters. During normal operation each receives ~50% of traffic; each is provisioned to take 100% when necessary.
We recently experienced a fault from a failure case we had not planned for: all client and master nodes stayed up, so our cluster was responsive over REST, but all data nodes were temporarily offline. All indices appeared red and empty, and queries against them returned 0 results, yet with a 200 status code, following REST convention.
Our simple HAProxy health check failed us in this case; it merely checked for 200s.
I am now investigating use of
http-check expect ! string red
with a URI that targets the index of interest directly. I haven't used the more advanced http-check features before. It is a more expensive check, but it should correctly take the client nodes of a lobotomized cluster out of the pool.
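Put together, such a backend might look like the sketch below. The index name logstash-current is a placeholder for whatever index you care about; /_cat/indices/&lt;index&gt; returns a plain-text line containing that index's health (green, yellow, or red):

```
backend elastic_nodes
    balance roundrobin
    # Check the health of the specific index rather than just the node:
    # fail the check only if its reported health is "red".
    option httpchk GET /_cat/indices/logstash-current
    http-check expect ! string red
    server elastic1 10.88.0.101:9200 check port 9200 fall 3 rise 3
    server elastic2 10.88.0.102:9200 check port 9200 fall 3 rise 3
    server elastic3 10.88.0.103:9200 check port 9200 fall 3 rise 3
```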
UPDATE (2): I have switched us over to using
and it indeed seems like a better test.
(Second revision: using an explicit check for green or yellow instead of just not-red; belatedly thought about the index being entirely missing from the _cat filter...)
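A sketch of that second revision, with the same placeholder index name; rstring matches the response body against a regular expression, so an index missing entirely from the _cat output (an empty body) now correctly fails the check as well:

```
# Pass only if the index health is explicitly green or yellow;
# "red" and an empty (missing-index) response both fail the check.
option httpchk GET /_cat/indices/logstash-current
http-check expect rstring (green|yellow)
```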