Why use deflate instead of gzip for text files served by Apache?
The simple answer is don't.
RFC 2616 defines deflate as:
deflate The "zlib" format defined in RFC 1950 in combination with the "deflate" compression mechanism described in RFC 1951
The zlib format is defined in RFC 1950 as :
0 1
+---+---+
|CMF|FLG| (more-->)
+---+---+
0 1 2 3
+---+---+---+---+
| DICTID | (more-->)
+---+---+---+---+
+=====================+---+---+---+---+
|...compressed data...| ADLER32 |
+=====================+---+---+---+---+
So, a few headers and an ADLER32 checksum
RFC 2616 defines gzip as:
gzip An encoding format produced by the file compression program
"gzip" (GNU zip) as described in RFC 1952 [25]. This format is a
Lempel-Ziv coding (LZ77) with a 32 bit CRC.
RFC 1952 defines the compressed data as:
The format presently uses the DEFLATE method of compression but can be easily extended to use other compression methods.
CRC-32 is slower than ADLER32
Compared to a cyclic redundancy check of the same length, it trades reliability for speed (preferring the latter).
So ... we have 2 compression mechanisms that use the same algorithm for compression, but a different algorithm for headers and checksum.
Now, the underlying TCP packets are already pretty reliable, so the issue here is not Adler 32 vs CRC-32 that GZIP uses.
Turns out many browsers over the years implemented an incorrect deflate algorithm. Instead of expecting the zlib header in RFC 1950 they simply expected the compressed payload. Similarly various web servers made the same mistake.
So, over the years browsers started implementing a fuzzy logic deflate implementation, they try for zlib header and adler checksum, if that fails they try for payload.
The result of having complex logic like that is that it is often broken. Verve Studio have a user contributed test section that show how bad the situation is.
For example: deflate works in Safari 4.0 but is broken in Safari 5.1, it also always has issues on IE.
So, best thing to do is avoid deflate altogether, the minor speed boost (due to adler 32) is not worth the risk of broken payloads.
As I see the cause was already found. To further on help getting out of possible confusions:
deflate - despite its name the zlib compression (RFC 1950) should be used (in combination with the deflate compression (RFC 1951)) as described in the RFC 2616. The implementation in the real world however seems to vary between the zlib compression and the (raw) deflate compression[3][4]. Due to this confusion, gzip has positioned itself as the more reliable default method (March 2011).
gzip and zlib are file/stream formats that by default wrap around deflate and amongst other things add a checksum which make them more secure and a little slower. The increase in size on the other hand should not be of any concern.
Also see HTTP_compression - Wikipedia
Best Answer
There is some confusion about the naming between the specifications and the HTTP:
But the HTTP uses a different naming:
So to sum up:
gzip
is the GZIP file format.deflate
is actually the ZLIB data format. (But some clients do also accept the actual DEFLATE data format fordeflate
.)See also this answer on the question What's the difference between the "gzip" and "deflate" HTTP 1.1 encodings?: