Does gunzip work in memory or does it write to disk?

compression gzip

We have our log files gzipped to save space. Normally we keep them compressed and just do

gunzip -c file.gz | grep 'test'

to find important information but we're wondering if it's quicker to keep the files uncompressed and then do the grep.

cat file | grep 'test'

There has been some discussion about how gzip works. If it reads the file into memory and decompresses it there, the first approach would presumably be faster; if it doesn't, the second would be faster. Does anyone know how gzip decompresses data?

Best Answer

It's always going to be quicker to cat the uncompressed file as there's no overhead associated with that. Even if you're not writing a temporary file, you're going through the decompression motions, which munch CPU. If you're accessing these files often enough, it's probably better to keep them uncompressed if you have the space.

That said, dumping data to standard output (gunzip -c, zcat, etc.) won't trigger writing to a temporary file. The data is piped directly to the grep command, which reads the uncompressed stream as its own standard input.
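To make the streaming behaviour concrete, here is a minimal Python sketch of the same idea as gunzip -c file.gz | grep 'test'. It uses Python's standard gzip module rather than the gunzip binary, so it illustrates the technique and is not a description of gunzip's internals. The helper name grep_gzip, the sample contents, and the file name are made up for the example. The directory listing check only shows that nothing new appears next to the compressed file; it is not proof about every location on the system.

```python
import gzip
import os
import tempfile


def grep_gzip(path, needle):
    """Return the lines containing `needle`, reading the gzip file as a stream.

    Hypothetical helper for illustration: gzip.open decompresses
    incrementally as lines are requested, so no uncompressed copy of
    the file is written to disk.
    """
    matches = []
    with gzip.open(path, "rt", encoding="utf-8") as stream:
        for line in stream:
            if needle in line:
                matches.append(line.rstrip("\n"))
    return matches


with tempfile.TemporaryDirectory() as workdir:
    # Build a small compressed sample file (assumed contents).
    gz_path = os.path.join(workdir, "file.gz")
    with gzip.open(gz_path, "wt", encoding="utf-8") as out:
        out.write("alpha\nthis is a test\nbeta\n")

    # Compare the directory contents before and after the search.
    before = sorted(os.listdir(workdir))
    found = grep_gzip(gz_path, "test")
    after = sorted(os.listdir(workdir))

print(found)
print(before == after)
```

The search still pays the CPU cost of decompressing every byte it scans, which is the overhead described above; the only thing it avoids is an intermediate file.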

The Wikipedia article on LZ* encoding is here: http://en.wikipedia.org/wiki/LZ77_and_LZ78.
