Nginx – Temporary Nginx 504 upstream timed out error for specific connections only

I am sporadically finding this error in the nginx error log (log level: error):

2018/05/01 22:19:24 [error] 27520#27520: *753839613 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 77.85.205.153, server: *.mydomain.com, request: "GET / HTTP/1.1", upstream: "http://192.168.101.52:80/", host: "www2.mydomain.com"
2018/05/01 22:20:24 [error] 27520#27520: *753839613 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 77.85.205.153, server: *.mydomain.com, request: "GET / HTTP/1.1", upstream: "http://192.168.101.53:80/", host: "www2.mydomain.com"
2018/05/01 22:21:24 [error] 27520#27520: *753839613 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 77.85.205.153, server: *.mydomain.com, request: "GET / HTTP/1.1", upstream: "http://192.168.101.51:80/", host: "www2.mydomain.com"

Nginx tries every backend defined in the upstream block, each with a timeout of 60 seconds. All three attempts fail, and after 3 minutes the client receives a 504 Gateway Timeout. What is strange:

There is no entry for the failed request in the access logs of any of the upstream IIS servers (so it seems the request never reached the backend).
The error above is for the / of the app, which is just a fast initial page (nothing slow or heavy).
The error affects only specific connections: at the same time, hundreds of requests to the same upstreams are processed successfully. When it reproduces in my browser, opening another browser works fine, but no connection from the initial browser is possible.
Probing with wget http://192.168.101.51:80 works fine.
The error also appears during off-peak hours, when the number of requests is very low.
Adding keepalive to the upstream block partly helped: after adding it, the number of such errors is very low, but they still appear. Trying different keepalive values (16 or 128) did not help.
I just found a way to reproduce it: if I send several slow POST requests (which time out due to slow server-side processing), the issue can be reproduced afterwards. The problem disappears after about 5 minutes. Other browsers keep working fine. It is not a browser socket issue, as the failed POST requests have already been closed with a 504 response.
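
One way to check whether a failed request ever reached a backend is to log nginx's upstream timing variables for each request. The fragment below is only a sketch: the format name upstream_timing and the log file path are made up for illustration, while the variables themselves are standard nginx variables ($upstream_connect_time and $upstream_header_time require nginx 1.9.1 or later, so 1.10.3 should have them).

```nginx
# Goes inside the http { } block.
# "upstream_timing" and the log path are hypothetical names chosen for this sketch.
log_format upstream_timing '$remote_addr [$time_local] "$request" $status '
                           'upstream=$upstream_addr '
                           'upstream_status=$upstream_status '
                           'connect=$upstream_connect_time '
                           'header=$upstream_header_time '
                           'response=$upstream_response_time';

access_log /var/log/nginx/mydomain.com_upstream.log upstream_timing;
```

When a request is retried on several servers, these variables hold one comma-separated value per attempt. A recorded connect time combined with a missing header time would suggest that the TCP connection was established but the backend never answered.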

nginx/1.10.3 (Ubuntu) on a VirtualBox machine; the backend is IIS. The application uses SignalR (no WebSockets).

Configuration

user www-data;
worker_processes auto; 
worker_rlimit_nofile 65535;
pid /var/run/nginx.pid;

events {
    worker_connections 65535; 
    use epoll; 
    multi_accept on;
}

http {

    sendfile off; 

    open_file_cache max=200000 inactive=20s;
    open_file_cache_valid 30s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;

    proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_http_version 1.1; 
    proxy_set_header Connection '';

    proxy_next_upstream error timeout http_503; 

    fastcgi_buffer_size 128k;
    fastcgi_buffers 4 256k;
    fastcgi_busy_buffers_size 256k;


    proxy_buffer_size   256k; 
    proxy_buffers   16 256k;
    proxy_busy_buffers_size 256k;

    proxy_max_temp_file_size 20m;

    client_max_body_size 20m;
    client_body_buffer_size 20m;

    client_header_buffer_size 128k;
    large_client_header_buffers 4 128k;

    proxy_ignore_headers X-Accel-Expires;
    proxy_ignore_headers Expires;
    proxy_ignore_headers Cache-Control;

    # caching options
    proxy_cache_path /var/cache/nginx/cache levels=1:2 keys_zone=my-cache:8m max_size=1000m inactive=60m;
    proxy_temp_path /var/cache/nginx/tmp;
    proxy_cache_lock on;

    upstream backend_web {
        hash $lb_key;
        server 192.168.101.51:80 max_fails=3 fail_timeout=10s weight=9;
        server 192.168.101.52:80 max_fails=3 fail_timeout=10s weight=9;
        server 192.168.101.53:80 max_fails=3 fail_timeout=10s weight=9;

        keepalive 128;
    }

    error_log /var/log/nginx/mydomain.com_error.log error;

    server {

        proxy_cache_key $scheme|$proxy_host|$uri|$is_args|$args;                
        location / {    
            proxy_cache_bypass 1;
            proxy_no_cache 1;    
            proxy_pass http://backend_web;
        }

    }

}
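
The three-minute wait described above is the product of two settings: proxy_read_timeout defaults to 60 seconds, and the timeout condition in proxy_next_upstream lets nginx pass the request to the next server after each timeout, so three backends add up to roughly 180 seconds. If failing faster is preferable, the retries can be bounded. The values below are illustrative assumptions, not recommendations, and they only shorten the wait; they do nothing about a backend that is not responding.

```nginx
# Can be placed in the http { }, server { } or location { } context.
proxy_connect_timeout 5s;          # default is 60s
proxy_read_timeout 60s;            # default is 60s; measured between two successive reads
proxy_next_upstream error timeout http_503;
proxy_next_upstream_tries 2;       # caps the number of attempts (default 0 = no limit)
proxy_next_upstream_timeout 90s;   # caps the total time spent passing the request on (default 0 = no limit)
```

Both proxy_next_upstream_tries and proxy_next_upstream_timeout are available from nginx 1.7.5, so they should work on 1.10.3.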

Best Answer

As Tim pointed out, the problem was not in Nginx. Using the network tool Wireshark, I was able to see that the request was sent to IIS, but IIS did not respond. So the behavior of Nginx is correct. It seems that my ASP.NET application does not open a second connection to the MySQL server while another connection is still pending (for the same session), so the second request hangs for a long time until the first one finishes. That is not related to the question, though, and I will research it further.
