Why is the request queueing time so high?


I am running a Rails application server. My setup is:

  • Apache 2, using mod_ssl for both https and ssl client certificates
  • Phusion Passenger 5
  • Rails 4
  • Ruby 2.1

I use New Relic to monitor the running application. I recently enabled monitoring for request queueing delay, mainly out of curiosity. I was surprised to find that the delay in the request queue is often as long as or longer than the actual Ruby code and database execution time. ~200 ms seems high, right?

[Image: Significant request queue delay]

Most online information indicates that this happens when the request queue is waiting for a worker to become available, but that isn't the case. As seen below, we're barely using our provisioned instances. During peaks, we rarely go above 30% utilization.

[Image: Utilization of worker instances]

A few other notes:

  • Apache and Passenger reside on the same server, so the delay is not a false reading caused by system clocks being out of sync.
  • Regarding SSL processing, Apache grabs the client SSL certificate and attaches it as a header in the request. The Rails application then handles the rest of the processing.

What could be the issue here?

Best Answer

200 ms doesn't seem that terrible. The 'request queuing' metric measures the time between the moment your web server logs the request and the moment the New Relic agent loads (after before_filters). Because of how it is measured, it can look like there is a problem when none exists. Your latency is nice and even, with no spikes that would indicate you are running out of workers or starving for resources/CPU. You can use watch passenger-status to check this. You can also double-check the server's resource usage locally using Linux utilities:

top, iotop, vmstat, sar (sysstat)
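
To make the definition of the metric above more concrete, here is a minimal sketch of how a "queue time" figure can be derived from a front-end timestamp. It assumes the web server stamps each request with a header such as X-Request-Start in the t=<microseconds since epoch> form that Apache's mod_headers produces for %t. The helper name queue_time_ms is hypothetical, and this is not New Relic's actual implementation.

```ruby
# Hypothetical helper, for illustration only (not New Relic's code).
# Parses a front-end timestamp of the form "t=<microseconds since epoch>"
# and returns the elapsed time in milliseconds, or nil if the header is
# missing or malformed.
def queue_time_ms(header_value, now = Time.now)
  match = /\At=(\d+)\z/.match(header_value.to_s.strip)
  return nil unless match

  started_at_us = match[1].to_i
  now_us = (now.to_r * 1_000_000).to_i

  # Anything that runs between the stamp and "now" counts as queue time,
  # whether or not the request actually waited for a worker.
  elapsed_us = now_us - started_at_us

  # Clamp negative values (clock adjustments) to zero.
  [elapsed_us, 0].max / 1000.0
end

# Inside a Rack app the header would be read from the environment:
#   queue_time_ms(env["HTTP_X_REQUEST_START"])
```

Note that everything executed between the stamp and the measurement point is counted, which is why the number can be non-trivial even when workers are idle.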

Still want to hunt for optimizations? Check out anything that executes before the New Relic agent. Possible pain points:

It will take some digging. Good luck!
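
As one possible starting point for that digging, the sketch below times how long a request spends in the Rack middleware stack before it reaches the application. Both class names and the env keys are hypothetical, and it only covers middleware (not Apache modules or before_filters), so treat it as a rough probe rather than a full breakdown.

```ruby
# Hypothetical Rack-style middleware pair; names and env keys are
# illustrative only. Place the first at the top of the stack and the
# second at the bottom to measure the time spent in between.
class StackEntryStamp
  def initialize(app)
    @app = app
  end

  def call(env)
    # A monotonic clock avoids distortion from wall-clock adjustments.
    env["timing.stack_entered_at"] = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @app.call(env)
  end
end

class StackExitTimer
  def initialize(app, logger = $stderr)
    @app = app
    @logger = logger
  end

  def call(env)
    entered_at = env["timing.stack_entered_at"]
    if entered_at
      now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      elapsed_ms = (now - entered_at) * 1000.0
      env["timing.middleware_ms"] = elapsed_ms
      @logger.puts(format("middleware before app: %.1f ms", elapsed_ms))
    end
    @app.call(env)
  end
end
```

In a Rails 4 application these could be wired up with something like config.middleware.insert_before 0, StackEntryStamp and config.middleware.use StackExitTimer. If the logged figure is small, the remaining delay is more likely in front of Rack (for example, in Apache's SSL handling).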
