I have a Node.js app running under forever behind nginx. When I deploy new code I just run forever restart,
but I can't afford to serve 502s even during that short window.
How can I configure nginx to keep retrying this one upstream server when it returns 502? I tried setting proxy_connect_timeout, proxy_read_timeout and proxy_send_timeout to e.g. 30s, but I immediately get a 502 no matter what 🙁
My site conf is:
upstream my_server {
    server 127.0.0.1:3000 fail_timeout=0;
    keepalive 1024;
}

server {
    listen 3333;
    server_name myservername.com;

    access_log /var/log/nginx/my_server.log;

    location / {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header X-NginX-Proxy true;

        # retry upstream on 502 error
        proxy_connect_timeout 30s;
        proxy_send_timeout 30s;
        proxy_read_timeout 30s;

        proxy_pass http://my_server;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_redirect off;
    }
}
Is it possible to buffer requests for this short time when upstream is not available?
Best Answer
This ultimately sounds like a problem with your backend: nginx makes a request to your backend, the connection is refused right away, so nginx has no option other than to pass an error downstream to the user, since no other upstream is specified. The timeout values you set have no effect here, because nginx doesn't have to wait for anything at all.

I don't know what forever is or how it works, but a couple of possible solutions come to mind.

There are two possibilities for what is happening on the upstream side:
1. Forever might be accepting the connection and returning an error immediately. In that case, the real question is how to make it not mishandle the connection, but instead wait until your app has finished deploying and then process the request. (The opengrok app on the tomcat server has this issue.)

2. No one is listening on the port where your app is supposed to run, so the kernel immediately drops the packet and returns a TCP RST right away.
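The second case is easy to observe from the command line. Here is a minimal Python sketch (the port is picked on the fly, not your actual app's port) showing that a connection attempt to a closed port fails in milliseconds with "connection refused", rather than waiting out any client-side timeout, which is exactly why your 30s nginx timeouts never get a chance to run:

```python
import socket
import time

# Find a port that nothing is listening on: bind to an ephemeral port,
# note its number, and close it again.
probe = socket.socket()
probe.bind(("127.0.0.1", 0))
closed_port = probe.getsockname()[1]
probe.close()

start = time.monotonic()
try:
    # A generous 30 s timeout, mirroring the nginx proxy_* settings.
    socket.create_connection(("127.0.0.1", closed_port), timeout=30)
except ConnectionRefusedError:
    # The kernel answered with TCP RST immediately; the timeout was irrelevant.
    elapsed = time.monotonic() - start
    print(f"refused after {elapsed:.3f}s")
```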
If a TCP RST is the cause, you can solve it either by having forever keep the listening socket open across restarts, or by configuring the kernel to queue incoming packets for a certain time, in anticipation of someone picking them up later, so that when forever does start back up it has a whole queue ready for servicing. Alternatively, configure the kernel not to issue a TCP RST when no one is listening; then your timeouts in nginx will take effect. Subsequently, configure nginx to make a second request to another upstream.

If you address either one of the above cases, you're done.
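For the "second request" part, one sketch (assuming nginx accepts the duplicate server entry, which current versions do) is to list the same backend twice and let proxy_next_upstream retry the other peer:

```nginx
upstream my_server {
    # The same backend twice gives nginx a "second" peer to retry,
    # instead of giving up and returning 502 after the first failure.
    server 127.0.0.1:3000 fail_timeout=0;
    server 127.0.0.1:3000 fail_timeout=0;
    keepalive 1024;
}

server {
    location / {
        proxy_pass http://my_server;
        # Retry the next peer on connection errors, timeouts, and 502s.
        proxy_next_upstream error timeout http_502;
        proxy_connect_timeout 30s;
    }
}
```

Note this only helps once the kernel-level RST problem is fixed; with an instant "connection refused", both attempts fail in milliseconds.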
Else, you have to try to configure nginx to fix the issue:
You could try proxy_cache with proxy_cache_use_stale.
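A minimal sketch of that idea (paths and zone name are placeholders; it only helps for responses nginx is allowed to cache, so typically GETs):

```nginx
# Define a cache zone; /var/cache/nginx must exist and be writable by nginx.
proxy_cache_path /var/cache/nginx keys_zone=app_cache:10m;

server {
    location / {
        proxy_cache app_cache;
        proxy_cache_valid 200 1m;
        # While the backend is erroring out or timing out, serve whatever
        # (stale) copy is already in the cache instead of a 502.
        proxy_cache_use_stale error timeout http_502;
        proxy_pass http://my_server;
    }
}
```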
You could also employ an error handler: see proxy_intercept_errors (which probably applies only if the 502 you're getting is passed from your backend) and error_page. You would want the error handler to waste time until your app comes back up, then make a request to your app: sleep() for however long the redeploy takes, then either issue an HTTP redirect or quit without a response. Heck, you could even implement the stall by proxying to a TCP port which you block drop in the firewall, which would activate your timeouts in nginx. Subsequently, configure nginx to make a second request.

If you implement one of the approaches that relies on timeout activation, an extra request to the backend will then have to be made. You could use the upstream directive for that, either specifying the same server multiple times or, if that's not accepted, mirroring the port through your firewall; or, better yet, you could run multiple independent app servers in the first place.

Which brings us back to your app server: if it cannot handle clean redeployment, maybe you should run two such app servers and use nginx to load-balance between them. Or deploy a fresh copy, and then switch nginx over to it once it's actually ready. Otherwise, how could you be sure your clients would be willing to wait even 30 s for your API to respond?
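The two-app-server idea could look something like this (the second port, 3001, is an assumption; any free port your app can bind works):

```nginx
upstream my_server {
    # Two independent copies of the app; deploy them one at a time so
    # the other keeps serving traffic during the restart.
    server 127.0.0.1:3000 fail_timeout=0;
    server 127.0.0.1:3001 fail_timeout=0;
}
```

With this, a normal rolling deploy (restart 3000, wait until it's up, restart 3001) never leaves nginx without a live backend.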