Nginx request_time slower with HTTP/2

Tags: http, http2, nginx

We are running nginx/1.9.10 as a frontend server with multiple application servers as upstreams. We serve some plain HTTP, but mostly HTTPS, and we switched to HTTP/2 in the last week.

We are logging like this:

 log_format custom '$host $server_port $request_time '
                   '$upstream_response_time $remote_addr '
                   '"$http2" $upstream_addr $time_iso8601 '
                   '"$request" $status $body_bytes_sent '
                   '"$http_referer" "$http_user_agent"';

We suddenly see a larger difference between $request_time and $upstream_response_time. Some difference here is quite natural, as $request_time depends on the user's network while $upstream_response_time does not.

So usually you should not care too much about $request_time as long as $upstream_response_time is stable.

But I still wanted to check what was happening, because it started getting worse with http/2. So I compared average response times for HTTP/1.1 and HTTP/2.0 (both over HTTPS).

First I checked all HTTP/1.1 requests and calculated the average response time and the average upstream time:

grep ' 443 ' access.log|grep 'HTTP/1.1'| \
cut -d ' ' -f 3,4 | awk '{r+=$1; u+=$2} END {print r/NR; print u/NR}'

0.0139158  # average response time for https/1.1
0.00691421 # average upstream time for https/1.1

Now I did the same with https/2.0:

grep ' 443 ' access.log|grep 'HTTP/2.0'| \
cut -d ' ' -f 3,4 | awk '{r+=$1; u+=$2} END {print r/NR; print u/NR}'

0.0828755  # average response time for https/2.0
0.00606643 # average upstream time for https/2.0
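
Both passes can also be folded into a single awk that aggregates per protocol; here is a sketch, assuming the protocol token lands in field 11 as in the hypothetical log line above:

grep ' 443 ' access.log | \
awk '{ p=$11; gsub(/"/,"",p);        # protocol token, e.g. HTTP/2.0
       r[p]+=$3; u[p]+=$4; n[p]++ }  # sum request time and upstream time
     END { for (p in n)
             printf "%-8s avg_request=%.6f avg_upstream=%.6f n=%d\n",
                    p, r[p]/n[p], u[p]/n[p], n[p] }'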

As you can see, the upstream time is nearly identical, but the request time is slower by a factor of 7 for http/2! Wow! Isn't http/2 expected to be faster?

Now I checked all requests that have a huge difference between these two values, and nearly all of the top 500 had a status code of 302:

grep ' 443 ' access.log | grep 'HTTP/1.1' | \
awk '{ if ( $3 != $4 && $4 != "-" ) { \
  gsub(/\./,"",$3);gsub(/\./,"",$4); \
  print $3-$4,$4,$6,$9,$11,$12 }}' | \
sort -n | tail -n 10000 | grep 'POST HTTP/1.1" 302' | wc -l

9008
# 9008 of 10000 (90%) requests, ordered by the difference between
# request time and upstream time, have status code 302
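
As an aside, the gsub trick only works because nginx prints both timings with exactly three decimal places; awk can also subtract the floating-point values directly, which is a bit more robust. A sketch of the same query:

grep ' 443 ' access.log | grep 'HTTP/1.1' | \
awk '$4 != "-" { printf "%.3f %s %s %s %s %s\n", $3-$4, $4, $6, $9, $11, $12 }' | \
sort -n | tail -n 10000 | grep -c 'POST HTTP/1.1" 302'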

So 90% of all requests with the highest difference between request time and upstream time have status code 302. This is strange.

On http/2 it is even worse:

grep ' 443 ' access.log | grep 'HTTP/2.0' | \
awk '{ if ( $3 != $4 && $4 != "-" ) { \
  gsub(/\./,"",$3);gsub(/\./,"",$4); \
  print $3-$4,$4,$6,$9,$11,$12 }}' | \
sort -n | tail -n 10000 | grep 'POST HTTP/2.0" 302' | wc -l

9790
# 9790 of 10000 (98%) requests, ordered by the difference between
# request time and upstream time, have status code 302

So here, 98% of these requests have status code 302.

Why does http/2 appear to be slower than http/1.1?
Why are so many 302 status codes involved in requests where request time and upstream time differ extremely (in both HTTP/1.1 and HTTP/2.0)?

Best Answer

I liked the question very much, and I analyzed it.

I mostly used these resources: https://www.nginx.com/blog/nginx-1-9-5/, https://www.nginx.com/blog/http2-module-nginx/, https://blog.cloudflare.com/introducing-http2/, https://http2.github.io/faq/ and this PDF.

You probably downloaded nginx/1.9.10 from nginx.org (http://nginx.org/en/download.html). Please confirm. Did you build it from source? Have you tried nginx/1.9.12?

As you can see in the CloudFlare article, HTTP/2 is clearly the winner when measuring average page load time (at least for their web site):

Access via HTTP Protocol Version    Average Page Load time
HTTP 1.x                            9.07 sec.
SPDY/3.1                            7.06 sec.
HTTP/2                              4.27 sec. 
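
You can do a rough single-URL version of this comparison yourself: curl can force either protocol version and report timings (assuming a curl build with HTTP/2 support; the URL is a placeholder):

curl -so /dev/null -w 'total=%{time_total}s ttfb=%{time_starttransfer}s\n' --http1.1 https://example.com/
curl -so /dev/null -w 'total=%{time_total}s ttfb=%{time_starttransfer}s\n' --http2   https://example.com/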

However, when tracking average response time, check out the second article from nginx.org that I referenced, where it reads:

The next key point of HTTP/2 is multiplexing. Instead of sending and receiving responses and requests as separate streams of data over multiple connections, HTTP/2 multiplexes them over one stream of bytes or one stream of data. It slices data for different requests and for different responses, and each slice has its own identification and its size field, which is there so the endpoint can determine which data belongs to which request.

The disadvantage here is that since each chunk of data has its own identification, its own fields, there's some metadata that transfers in addition to the actual data. So, it has some overhead.
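
To put that overhead in perspective: RFC 7540 fixes the frame header at 9 octets, and the default maximum frame size is 16 KiB, so on a full DATA frame the framing overhead is tiny:

echo 'scale=4; 9 / 16384 * 100' | bc   # ≈ 0.05 percent per full-sized frame

For small responses such as 302 redirects, however, frames are far from full, so the relative overhead is higher.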

[Image: http/2 socket]

Other problems may be related to your nginx setup, for instance inside the server block.
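
For reference, here is a minimal sketch of such a server block (all names and paths are placeholders; the http2 flag on the listen directive is what enables HTTP/2 in nginx 1.9.5 and later):

server {
    listen 443 ssl http2;
    server_name example.com;                         # placeholder

    ssl_certificate     /etc/nginx/ssl/example.crt;  # placeholder paths
    ssl_certificate_key /etc/nginx/ssl/example.key;

    # Tunables worth reviewing when diagnosing HTTP/2 latency:
    http2_max_concurrent_streams 128;                # nginx default
    keepalive_timeout 75;                            # nginx default
}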
