we've been running our web servers at Amazon with load balancer and auto-scaling for over a year with no problem. All of a sudden today the request began to get aborted with the error:
503 … Backend server is at capacity
The web servers are at 1% CPU and no other alarms trigger.
We use Amazons load balancer and nginx.
Lots of requests like this are showing up in the access_log.
10.246.114.93 - - [05/Jun/2014:20:16:09 +0000] "-" 400 0 "-" "-"
10.246.114.93 - - [05/Jun/2014:20:16:09 +0000] "-" 400 0 "-" "-"
10.246.114.93 - - [05/Jun/2014:20:16:09 +0000] "-" 400 0 "-" "-"
10.246.114.93 - - [05/Jun/2014:20:16:09 +0000] "-" 400 0 "-" "-"
10.246.114.93 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"
10.246.114.93 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"
10.246.114.93 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"
10.229.15.214 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"
10.229.15.214 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"
Any thoughts?
Best Answer
You will get a "Back-end server is at capacity" when the load balancer performs its health checks and receives some simple error due to a mis-configuration.
Try to grep the log files for "ELB-HealthChecker", which is the user agent used by the checks.
This will typically give you a 400 or 500 level error which is easily fixed.