NGINX load balancing using ip_hash directive


I have a simple two-node server cluster, running on localhost:8001 and localhost:8002, load-balanced using NGINX. Below is the http context of my nginx.conf.

http {
    include       mime.types;
    default_type  application/octet-stream;

    upstream backend {
        ip_hash;
        server localhost:8001;
        server localhost:8002;
    }

    log_format upstreamlog 'upstream: $upstream_addr: $request upstream-response-status: $upstream_status';

    server {
        listen              80;
        listen              [::]:80;
        server_name         localhost;
        access_log  logs/access.log  upstreamlog;

        location / {
            proxy_pass http://backend/;
        }
    }
}

Initially, all requests to http://localhost/ were proxied to the upstream server running on port 8001.

Logs:

 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET /favicon.ico HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET /favicon.ico HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET /favicon.ico HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET /favicon.ico HTTP/1.1 upstream-response-status: 200
 ----
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET /favicon.ico HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET /favicon.ico HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET /favicon.ico HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: 200
 upstream: [::1]:8001: GET /favicon.ico HTTP/1.1 upstream-response-status: 200

To test the failover of this setup, I stopped the server running on port 8001. But the failover did not work, and all subsequent requests were still forwarded to the server on port 8001.

Logs:

 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 ----
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 ----
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001: GET / HTTP/1.1 upstream-response-status: -
 upstream: [::1]:8001, 127.0.0.1:8001, [::1]:8002: GET / HTTP/1.1 upstream-response-status: 504, 504, 200

NGINX took a long time, approximately 3 minutes, to switch over to the other node on port 8002. What am I missing in the configuration? I know that the default max_fails is 1 and the default fail_timeout is 10 seconds. How can I make NGINX switch over to the other server node with zero downtime?

(NOTE: ip_hash had to be used for session affinity and other purposes)

Best Answer

I think you need to add the proxy_next_upstream directive to the location block. This directive specifies in which cases a request should be passed to the next server. Include http_503 so that a 503 (Service Unavailable) response from an upstream also triggers a retry. Note, though, that a stopped instance usually shows up as a connection error or a timeout rather than a 503, and the 504 entries in your log point to timeouts. If the delay is caused by timeouts, lower proxy_connect_timeout and proxy_read_timeout, which both default to 60 seconds. Example configuration:

    location / {
        proxy_pass http://backend/;
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
        # give up on an upstream after 10 seconds
        proxy_connect_timeout 10;
        proxy_read_timeout 10;
    }

The documentation for all proxy directives is at http://nginx.org/en/docs/http/ngx_http_proxy_module.html
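
The last log line in the question shows the stopped backend being tried twice, once as [::1]:8001 and once as 127.0.0.1:8001, because localhost resolves to both addresses. The sketch below lists each backend once by IP address and spells out max_fails and fail_timeout. It is only an illustration: it assumes both backends listen on the IPv4 loopback address, and the 2-second connect timeout is an arbitrary value to tune, not a recommendation. It goes inside the http context.

    upstream backend {
        ip_hash;
        # Assumption: backends listen on 127.0.0.1. Using the IP instead of
        # "localhost" keeps each backend from appearing under two addresses.
        server 127.0.0.1:8001 max_fails=1 fail_timeout=10s;
        server 127.0.0.1:8002 max_fails=1 fail_timeout=10s;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend/;
            # Pass the request to the next server on connection errors and timeouts
            proxy_next_upstream error timeout;
            # Illustrative value: stop waiting for a dead backend after 2 seconds
            proxy_connect_timeout 2s;
        }
    }

With these values, a request that hits the stopped backend should be passed to the other server after roughly the connect timeout, and the failed server is then skipped for the fail_timeout period. This shortens the switch-over but does not by itself guarantee zero downtime.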
