Application Pool failing with client_reset errors in httperr followed by 503 2 Disabled errors

application-pools coldfusion iis-7.5

On our Windows Server 2008 R2 server running IIS 7.5, one of our sites stops responding to requests and eventually returns 503 errors. The site also runs ColdFusion 10. These failures occur every few days.

When the problem occurs, I start seeing client_reset errors for requests to the site in the httperr log file. These continue, and requests made to the site get no response: no error is displayed, the request just hangs. After about ten minutes, I see 503 2 Disabled errors in the httperr log file, and attempts to reach the site also return 503 errors to the end user.
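To pin down the timeline described above (client_reset entries first, then Disabled about ten minutes later), the httperr log can be tallied by reason and minute. The following is a minimal sketch, assuming the log file carries a `#Fields:` header that names the `date`, `time`, and `s-reason` columns. The function name `count_reasons`, the sample lines, the IP addresses, and the pool name `ExamplePool` are all made up for illustration.

```python
from collections import Counter


def count_reasons(lines):
    """Count (date, HH:MM, s-reason) occurrences in HTTPERR-style log lines.

    Column positions are taken from the '#Fields:' header instead of
    being hard-coded, so rows seen before a header are ignored.
    """
    fields = []
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        parts = line.split()
        if len(parts) != len(fields):
            continue  # skip malformed or truncated rows
        row = dict(zip(fields, parts))
        minute = row["time"][:5]  # "10:00:01" -> "10:00"
        counts[(row["date"], minute, row["s-reason"])] += 1
    return counts


# Hypothetical sample data, not taken from a real server.
sample_lines = [
    "#Software: Microsoft HTTP API 2.0",
    "#Fields: date time c-ip c-port s-ip s-port cs-version cs-method "
    "cs-uri sc-status s-siteid s-reason s-queuename",
    "2013-05-01 10:00:01 10.0.0.5 51234 10.0.0.1 80 HTTP/1.1 GET /index.cfm - 1 Client_Reset ExamplePool",
    "2013-05-01 10:00:30 10.0.0.5 51235 10.0.0.1 80 HTTP/1.1 GET /index.cfm - 1 Client_Reset ExamplePool",
    "2013-05-01 10:10:02 10.0.0.6 51300 10.0.0.1 80 HTTP/1.1 GET /index.cfm 503 1 Disabled ExamplePool",
]

reason_counts = count_reasons(sample_lines)
for key in sorted(reason_counts):
    print(key, reason_counts[key])
```

To run it against a real file, pass an open file object in place of `sample_lines`; the first minute in which each reason appears can then be compared with the traffic in the site's IIS log.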

During our attempts to resolve this issue, we have also noticed queue-full errors in the httperr log and, over time, have increased the application pool's Queue Length from 1000 to 3000. We have also increased appConcurrentRequestLimit from 5000 to 10000.

Our workers.properties file for the IIS Connector has the following settings:

worker.cfusion.type=ajp13
worker.cfusion.host=localhost
worker.cfusion.port=8012
worker.cfusion.max_reuse_connections=900
worker.cfusion.connection_pool_size=900
worker.cfusion.connection_pool_timeout=60 

When we look at the successful requests in the general IIS log leading up to an application pool failure, we typically see a high number of requests in a short amount of time from a single IP address or from a group of similar IP addresses. These appear to be bots of some type, most likely comment spam bots.
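The burst pattern described above can be checked by counting requests per client IP per minute in the IIS log. This is a rough sketch, assuming a W3C-format log whose `#Fields:` header includes `date`, `time`, and `c-ip`. The function name `busiest_clients`, the threshold, and the sample lines (including the addresses and the `ExampleBot/1.0` user agent) are hypothetical.

```python
from collections import Counter


def busiest_clients(lines, threshold):
    """Return {(c-ip, date, HH:MM): count} for clients that made at
    least `threshold` requests within a single minute."""
    fields = []
    per_minute = Counter()
    for raw in lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        parts = line.split()
        if len(parts) != len(fields):
            continue  # skip rows that do not match the header
        row = dict(zip(fields, parts))
        per_minute[(row["c-ip"], row["date"], row["time"][:5])] += 1
    return {key: n for key, n in per_minute.items() if n >= threshold}


# Hypothetical sample data, not taken from a real server.
sample_log = [
    "#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port "
    "cs-username c-ip cs(User-Agent) sc-status sc-substatus "
    "sc-win32-status time-taken",
    "2013-05-01 09:58:01 10.0.0.1 POST /blog/comment.cfm - 80 - 203.0.113.7 ExampleBot/1.0 200 0 0 120",
    "2013-05-01 09:58:02 10.0.0.1 POST /blog/comment.cfm - 80 - 203.0.113.7 ExampleBot/1.0 200 0 0 135",
    "2013-05-01 09:58:03 10.0.0.1 POST /blog/comment.cfm - 80 - 203.0.113.7 ExampleBot/1.0 200 0 0 150",
    "2013-05-01 09:58:10 10.0.0.1 GET /index.cfm - 80 - 198.51.100.4 Mozilla/5.0 200 0 0 90",
]

busy = busiest_clients(sample_log, threshold=3)
print(busy)
```

A threshold of 3 only suits the tiny sample; on a production log a much higher per-minute value would be needed to separate bots from ordinary visitors.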

At this point, I'm unsure whether this is a problem with the connector between ColdFusion and IIS, an IIS tuning issue, or a code issue that is causing the client_reset errors to start and the application pool to eventually fail.

What would be causing the client_reset errors and the eventual failure of the application pool? Stopping and starting the application pool in question resolves the issue.

Best Answer

We just had similar issues, where the app pool would randomly fail and need to be restarted. The details are in this bug: https://bugbase.adobe.com/index.cfm?event=bug&id=3490112

The latest CF10 update, Update 18, includes a fix for this, if your problem is the same one. So far this update (and redoing all the connectors after updating) has solved the problem for us; only time will tell whether it comes back.