Is using 2 different hash functions a good way to check for file integrity?

cryptography hash verification

I have a website where users can upload their files; these are stored on the server and their metadata is recorded in a database. I'm implementing some simple integrity checks, i.e. "is the content of this file now byte-for-byte identical to what it was when it was uploaded?"

An example: for content of userfile.jpg, MD5 hash is 39f9031a154dc7ba105eb4f76f1a0fd4 and SHA-1 hash is 878d8d667721e356bf6646bd2ec21fff50cdd4a9. If this file's content changes, but has the same MD5 hash before and after, is it probable that the SHA-1 hash will also stay the same? (With hashing, sometimes you can get a hash collision – could this happen with two different hashing algorithms at once?)

Or is computing two different hashes for a file pointless (and I should try some other mechanism for verifying integrity)?


Edit: I'm not really worried about accidental corruption, but I'm supposed to prevent users changing the file unnoticed (birthday attack and friends).

I'll probably go with one hash, SHA-512. The checks don't happen often enough to be a performance bottleneck, and anyway, as @MichaelGG put it in the comments: "As Bruce Schneier says, there's enough fast, insecure systems out there already."

Best Answer

MD5 is probably safe for what you're doing, but there's no reason to continue to use a hash with known flaws. In fact, there's no reason you shouldn't be using SHA256 or SHA512, unless you have some known major performance bottleneck.

Edit: To clarify, there's no reason to use two algorithms; just use one that fits what you need. If you're worried about people doing MD5 collisions on you (as in, is this a security threat?), then use an algorithm that isn't as weak, such as SHA256.
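As a minimal sketch of the single-hash approach, assuming Python and its standard `hashlib` module: the function names `file_sha256` and `is_unchanged` and the parameter `stored_hex` (the digest you recorded in the database at upload time) are made up for illustration, not taken from the question.

```python
import hashlib
import hmac


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so large uploads are not loaded into memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, stored_hex):
    """Compare the file's current digest with the one recorded at upload time."""
    return hmac.compare_digest(file_sha256(path), stored_hex)
```

Swapping in `hashlib.sha512()` should be a one-line change if you prefer the longer digest. Note that this only helps if the stored digest itself is kept somewhere the user cannot modify.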

Edit 2: To address an apparently still common misunderstanding: finding a random collision on an n-bit hash does not take on the order of 2^n attempts. Because any pair of inputs may collide, it takes closer to 2^(n/2). So a 128-bit hash can probably be collided with about 2^64 attempts. See birthday attack for details.
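To make the birthday estimate concrete, here is a rough Python sketch. It uses the standard approximation p ≈ 1 - exp(-k(k-1)/2^(n+1)) for k random values of an n-bit hash, and then demonstrates the effect on a deliberately weakened hash: SHA-256 truncated to a small number of bits. The function names are hypothetical, and the truncation is only there so the search finishes quickly.

```python
import hashlib
import math


def collision_probability(attempts, bits):
    """Birthday approximation: chance of at least one collision among
    `attempts` uniformly random values of a `bits`-bit hash."""
    return -math.expm1(-attempts * (attempts - 1) / 2 ** (bits + 1))


def find_truncated_collision(bits):
    """Find two distinct inputs whose SHA-256 digests share the first `bits` bits.

    Returns (first_message, second_message, number_of_inputs_tried).
    """
    seen = {}  # truncated digest -> message that produced it
    counter = 0
    while True:
        msg = str(counter).encode()
        full = int.from_bytes(hashlib.sha256(msg).digest(), "big")
        prefix = full >> (256 - bits)  # keep only the top `bits` bits
        if prefix in seen:
            return seen[prefix], msg, counter + 1
        seen[prefix] = msg
        counter += 1


if __name__ == "__main__":
    # For a 128-bit hash, 2^64 attempts already give a substantial chance.
    print(collision_probability(2 ** 64, 128))
    # For a 20-bit truncation, expect a collision after roughly 2^10 inputs,
    # far fewer than the 2^20 possible values.
    print(find_truncated_collision(20))
```

This is a generic bound that applies to any hash of that output size; attacks that exploit MD5's specific weaknesses need far less work than 2^64.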
