Incorrect gzipping of HTTP responses, can't find who's doing it

We're seeing some very strange mangling of HTTP responses, and we can't figure out what is doing it. We have an app server handling JSON requests. Occasionally, the response is returned gzipped, but with incorrect headers that prevent the browser from interpreting it correctly.

The problem is intermittent, and changes behavior over time. Yesterday morning it seemed to fail 50% of the time, and in fact, seemed tied to one of our two load-balanced servers. Later in the afternoon, it was failing only 20 times out of 1000, and didn't correlate with an app server.

The two app servers are running Apache 2.2 with mod_wsgi and a Django app stack. They have identical Apache configs and source trees, and even identical packages installed on Red Hat. There's a hardware load balancer in front, but I don't know the make or model.

Akamai is also part of the food chain, though we removed Akamai and still had the problem.

Here's a good request and response:

* Connected to example.com (97.7.79.129) port 80 (#0)
> POST /claim/ HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15
> Host: example.com
> Accept: */*
> Referer: http://example.com/apps/
> Accept-Encoding: gzip,deflate
> Content-Length: 29
> Content-Type: application/x-www-form-urlencoded
> 
} [data not shown]
< HTTP/1.1 200 OK
< Server: Apache/2
< Content-Language: en-us
< Content-Encoding: identity
< Content-Length: 47
< Content-Type: application/x-javascript
< Connection: keep-alive
< Vary: Accept-Encoding
< 
{ [data not shown]
* Connection #0 to host example.com left intact
* Closing connection #0
{"msg": "", "status": "OK", "printer_name": ""}

And here's a bad one:

* Connected to example.com (97.7.79.129) port 80 (#0)
> POST /claim/ HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15
> Host: example.com
> Accept: */*
> Referer: http://example.com/apps/
> Accept-Encoding: gzip,deflate
> Content-Length: 29
> Content-Type: application/x-www-form-urlencoded
> 
} [data not shown]
< HTTP/1.1 200 OK
< Server: Apache/2
< Content-Language: en-us
< Content-Encoding: identity
< Content-Type: application/x-javascript
< Content-Encoding: gzip
< Content-Length: 59
< Connection: keep-alive
< Vary: Accept-Encoding
< X-N: S
< 
{ [data not shown]
* Connection #0 to host example.com left intact
* Closing connection #0
�V�-NW�RPR�QP*.I,)-���A���̼�Ԣ����T��Z�
��/

There are two things to notice about the bad response:

It has two Content-Encoding headers, and browsers seem to use the first. They see an identity encoding header together with gzipped content, so they can't interpret the response.
It has an extra "X-N: S" header that the good response lacks.

Perhaps if I could find out what intermediary adds "X-N: S" headers to responses, I could track down the culprit…
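
The two symptoms above can be checked mechanically when sampling many responses. The following is a minimal sketch, not a definitive tool: the function name `diagnose` and the result keys are made up for illustration, and it assumes you already have the response headers as a list of (name, value) pairs in wire order plus the raw, undecoded body bytes.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def diagnose(headers, body):
    """Summarize the encoding-related symptoms of one HTTP response.

    headers: list of (name, value) tuples in wire order, duplicates kept.
    body:    raw response bytes, before any content decoding.
    """
    encodings = [value.strip().lower() for name, value in headers
                 if name.lower() == "content-encoding"]
    looks_gzipped = body.startswith(GZIP_MAGIC)
    # A client that honors only the first header sees this encoding.
    first = encodings[0] if encodings else "identity"
    return {
        "content_encodings": encodings,
        "duplicate_header": len(encodings) > 1,
        "body_looks_gzipped": looks_gzipped,
        "mismatch": looks_gzipped != (first in ("gzip", "x-gzip")),
        "has_x_n": any(name.lower() == "x-n" for name, _ in headers),
    }


if __name__ == "__main__":
    # Headers copied from the bad response; the body is re-created locally.
    bad_headers = [
        ("Content-Encoding", "identity"),
        ("Content-Type", "application/x-javascript"),
        ("Content-Encoding", "gzip"),
        ("X-N", "S"),
    ]
    bad_body = gzip.compress(b'{"msg": "", "status": "OK", "printer_name": ""}')
    print(diagnose(bad_headers, bad_body))
```

Running this against responses captured at different hops (directly from each app server, behind the load balancer, through Akamai) may help show at which hop the second Content-Encoding header and the "X-N" header first appear.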

Best Answer

Some additional clues

According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.5:

identity: The default (identity) encoding; the use of no transformation whatsoever. This content-coding is used only in the Accept-Encoding header, and SHOULD NOT be used in the Content-Encoding header.

Akamai seems to be ignoring the fact that a server could include this header in its response, and is not removing it when it changes the encoding to "gzip".
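
To make the expected behavior concrete, here is a rough sketch of what a compressing intermediary would need to do to avoid the duplicate header. This is not Akamai's implementation, which is not known here. The function name `gzip_response` is hypothetical, and headers are assumed to be a list of (name, value) pairs.

```python
import gzip


def gzip_response(headers, body):
    """Compress body and return (headers, body) carrying exactly one
    Content-Encoding header and a Content-Length that matches it."""
    existing = [value.strip().lower() for name, value in headers
                if name.lower() == "content-encoding"]
    if any(enc != "identity" for enc in existing):
        # Already encoded upstream; leave the response untouched.
        return headers, body

    # Drop the stale "identity" header and the old length, then add new ones.
    stale = {"content-encoding", "content-length"}
    new_headers = [(name, value) for name, value in headers
                   if name.lower() not in stale]
    compressed = gzip.compress(body)
    new_headers.append(("Content-Encoding", "gzip"))
    new_headers.append(("Content-Length", str(len(compressed))))
    return new_headers, compressed
```

The point of the sketch is the replace-instead-of-append step: the bad response looks like what you would get if the "gzip" header were appended without first removing the existing "identity" one.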

Since the upstream server SHOULD NOT be adding the "Content-Encoding: identity" header in the first place, removing it at the origin is another way to fix this problem.
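
One way to try the origin-side fix, assuming the "identity" header is emitted by the Django/WSGI application rather than by Apache itself, is a small WSGI middleware that filters it out. This is only a sketch: the class name `StripIdentityEncoding` is made up, and if the header is added by an Apache module after the WSGI layer, this wrapper will not see it.

```python
class StripIdentityEncoding:
    """WSGI middleware that drops 'Content-Encoding: identity' from responses."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        def filtered_start_response(status, headers, exc_info=None):
            # Keep every header except a Content-Encoding of "identity".
            headers = [(name, value) for name, value in headers
                       if not (name.lower() == "content-encoding"
                               and value.strip().lower() == "identity")]
            return start_response(status, headers, exc_info)

        return self.app(environ, filtered_start_response)
```

In a mod_wsgi script file this would be applied by wrapping the existing callable, for example `application = StripIdentityEncoding(application)`. Other Content-Encoding values, such as "gzip", pass through unchanged.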
