Any value in checking MD5 of files after they have been unzipped?

compression, data-integrity, md5

A project I am working on currently requires the user to run an MD5 hash checking tool on the entire project after it has been unzipped. It does not currently require that the ZIP itself be checked.

If they were to switch to checking the MD5 of the zip, would there be any value in verifying the integrity of the unzipped files with MD5 – or is this covered by CRC checks when unzipping?

Best Answer

There is the possibility of your decompression software doing something strange, or the data being corrupted in the storage step. For highly critical data you should always verify after storing it to the disk.

In practice, zip and unzip are old, mature programs, and the risk of a bug in the version shipped with your Linux distribution is rather low. Corruption is primarily a concern on unstable platforms or when there is a problem with the storage. I have seen routers corrupt images when decompressing them, and failed writes over NFS can cause interesting file corruption.
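To make the "verify after storing" idea concrete, here is a minimal Python sketch that re-reads the extracted files from disk and compares them against the CRC-32 values recorded in the archive. The function names `crc32_of_file` and `recheck_extracted` are made up for illustration. It assumes the files were extracted with their archive names unchanged under `dest_dir`, which may not hold for members with unusual paths.

```python
import zipfile
import zlib
from pathlib import Path


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC-32 of a file on disk, reading it in chunks."""
    crc = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def recheck_extracted(archive_path, dest_dir):
    """List archive members whose extracted copy is missing or has a
    different CRC-32 than the one recorded in the archive.

    Assumption: members were extracted under dest_dir with their
    archive names unchanged (hypothetical layout for this sketch).
    """
    dest = Path(dest_dir)
    bad = []
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            target = dest / info.filename
            if not target.is_file() or crc32_of_file(target) != info.CRC:
                bad.append(info.filename)
    return bad
```

An empty list means every file read back with the CRC the archive expects. A re-read shortly after extraction may be served from the operating system's cache rather than the physical disk, so this catches decompression and write-path problems more reliably than it catches a failing drive. It also only detects accidental corruption, not deliberate tampering.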

If you think somebody might craft an "evil" zip archive to bypass your checks, the situation is a bit different. Note that the CRC in the zip provides no protection against an attacker, and that MD5 is a rather old and weak algorithm. Most systems are shifting towards the SHA algorithms to verify file integrity instead (SHA-256 being the most popular, I think). Hashing both the archive and the expanded files makes an attack on MD5 a lot harder.
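As a rough illustration of checking the expanded files with SHA-256 rather than MD5, the sketch below builds a manifest of hashes for a directory tree and compares a tree against an expected manifest. The names `sha256_of_file`, `build_manifest` and `verify_tree` are hypothetical, and the sketch assumes the expected manifest comes from a source you trust, separately from the archive itself. The same `sha256_of_file` helper can be pointed at the zip file to check the archive before unpacking.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root, '/'-separated) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_tree(root, expected):
    """Compare a directory tree against an expected manifest.

    Returns a sorted list of (relative_path, problem) tuples, where
    problem is "missing", "unexpected" or "mismatch".
    """
    actual = build_manifest(root)
    problems = []
    for name in sorted(set(expected) | set(actual)):
        if name not in actual:
            problems.append((name, "missing"))
        elif name not in expected:
            problems.append((name, "unexpected"))
        elif actual[name] != expected[name]:
            problems.append((name, "mismatch"))
    return problems
```

Reporting unexpected files as well as mismatches matters in the attacker case, since an extra file dropped into the tree would otherwise pass unnoticed.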