Security Hashing – What Makes a Hashing Algorithm Secure?

hashingSecurity

After reading this interesting question, I felt like I had a good idea of which insecure hashing algorithm I'd use if I needed one, but no idea why I might use a secure algorithm instead.

So what is the distinction? Isn't the output just a random number representing the hashed thing? What makes some hashing algorithms secure?

Best Answer

There are three properties one wants from every cryptographic hash function H:

preimage resistance: Given h, it should be hard to find any value x with h = H(x).
second preimage resistance: Given x1, it should be hard to find x2 != x1 with H(x1) = H(x2).
collision resistance: It should be hard to find two values x1 != x2 with H(x1) = H(x2).

With hash functions as used in common programming languages for hash tables (of strings), usually none of these is given, they only provide for:

weak collision resistance: For randomly (or "typically") selected values of the domain, the chance of collision is small. This says nothing about an attacker intentionally trying to create collisions, or trying to find preimages.

The three properties above are (among) the design goals for every cryptographic hash function. For some functions (like MD4, SHA-0, MD5) it is known that this failed (at least partially). The current generation (SHA-2) is assumed to be secure, and the next one ("Secure Hash Algorithm 3") is currently in the process of being standardized, after a competition.

For some uses (like password hashing and key derivation from passwords), the domain of actually used values x is so small that brute-forcing this space becomes feasible with normal (fast) secure hash functions, and this is when we also want:

slow execution: Given x, it takes some minimum (preferably configurable) amount of resources to calculate the value H(x).

But for most other uses, this is not wanted, one wants instead:

fast execution: Given x, calculating the value of H(x) is as fast as possible (while still secure).

There are some constructions (like PBKDF2 and scrypt) to create a slow hash function from a fast one by iterating it often.

For some more details, have a look at the hash tag on our sister site Cryptography Stack Exchange.

Related Solutions

Security – Predicting the Output of PHP’s rand() Function

The ability to guess the next value from rand is tied to being able to determine what srand was called with. In particular, seeding srand with a predetermined number results in predictable output! From the PHP interactive prompt:

[charles@charles-workstation ~]$ php -a
Interactive shell

php > srand(1024);
php > echo rand(1, 100);
97
php > echo rand(1, 100);
97
php > echo rand(1, 100);
39
php > echo rand(1, 100);
77
php > echo rand(1, 100);
93
php > srand(1024);
php > echo rand(1, 100);
97
php > echo rand(1, 100);
97
php > echo rand(1, 100);
39
php > echo rand(1, 100);
77
php > echo rand(1, 100);
93
php >

This isn't just some fluke. Most PHP versions^* on most platforms^** will generate the sequence 97, 97, 39, 77, 93 when srand'd with 1024.

To be clear, this isn't a problem with PHP, this is a problem with the implementation of rand itself. The same problem appears in other languages that use the same (or a similar) implementation, including Perl.

The trick is that any sane version of PHP will have pre-seeded srand with an "unknown" value. Oh, but it isn't really unknown. From ext/standard/php_rand.h:

#define GENERATE_SEED() (((long) (time(0) * getpid())) ^ ((long) (1000000.0 * php_combined_lcg(TSRMLS_C))))

So, it's some math with time(), the PID, and the result of php_combined_lcg, which is defined in ext/standard/lcg.c. I'm not going to c&p here, as, well, my eyes glazed over and I decided to stop hunting.

A bit of Googling shows that other areas of PHP don't have the best randomness generation properties, and calls to php_combined_lcg stand out here, especially this bit of analysis:

Not only does this function (gettimeofday) hand us back a precise server timestamp on a silver platter, it also adds in LCG output if we request "more entropy" (from PHP's uniqid).

Yeah that uniqid. It seems that the value of php_combined_lcg is what we see when we look at the resulting hex digits after calling uniqid with the second argument set to a true value.

Now, where were we?

Oh yes. srand.

So, if the code you're trying to predict random values from doesn't call srand, you're going to need to determine the value provided by php_combined_lcg, which you can get (indirectly?) through a call to uniqid. With that value in hand, it's feasible to brute-force the rest of the value -- time(), the PID and some math. The linked security issue is about breaking sessions, but the same technique would work here. Again, from the article:

Here's a summary of the attack steps outlined above:

wait for the server to reboot

fetch a uniqid value

brute force the RNG seed from this

poll the online status to wait for target to appear

interleave status polls with uniqid polls to keep track of current server time and RNG value

brute force session ID against server using the time and RNG value interval established in polling

Just replace that last step as required.

(This security issue was reported in an earlier PHP version (5.3.2) than we have currently (5.3.6), so it's possible that the behavior of uniqid and/or php_combined_lcg has changed, so this specific technique might not be workable any longer. YMMV.)

On the other hand, if the code you're trying to product calls srand manually, then unless they're using something many times better than the result of php_combined_lcg, you're probably going to have a much easier time guessing the value and seeding your local generator with the right number. Most people that would manually call srand also wouldn't realize how horrible of an idea this is, and thus aren't likely to use better values.

It's worth noting that mt_rand is also afflicted by the same problem. Seeding mt_srand with a known value will also produce predictable results. Basing your entropy off of openssl_random_pseudo_bytes is probably a safer bet.

tl;dr: For best results, don't seed the PHP random number generator, and for goodness' sake, don't expose uniqid to users. Doing either or both of these may cause your random numbers to be more guessable.

Update for PHP 7:

PHP 7.0 introduces random_bytes and random_int as core functions. They use the underlying system's CSPRNG implementation, making them free from the problems that a seeded random number generator has. They're effectively similar to openssl_random_pseudo_bytes, only without needing an extension to be installed. A polyfill is available for PHP5.

*: The Suhosin security patch changes the behavior of rand and mt_rand such that they always re-seed with every call. Suhosin is provided by a third party. Some Linux distributions include it in their official PHP packages by default, while others make it an option, and others ignore it entirely.

**: Depending on the platform and the underlying library calls being used, different sequences will be generated than documented here, but the results should still be repeatable unless the Suhosin patch is used.

Password Security – Is It More Secure to Hash a Password Multiple Times?

This is more suited on security.stackexchange but...

The problem with

hash1(hash2(hash3(...hashn(pass+salt)+salt)+salt)...)+salt)

is that this is only as strong as the weakest hash function in the chain. For example if hashn (the innermost hash) gives a collision, the entire hash chain will give a collision (irrespective of what other hashes are in the chain).

A stronger chain would be

hash1(hash2(hash3(...hashn(pass + salt) + pass + salt) + pass + salt)...) + pass + salt)

Here we avoid the early collision problem and we essentially generate a salt that depends on the password for the final hash.

And if one step in the chain collides it doesn't matter because in the next step the password is used again and should give a different result for different passwords.

Best Answer

Related Solutions

Security – Predicting the Output of PHP’s rand() Function

Update for PHP 7:

Password Security – Is It More Secure to Hash a Password Multiple Times?

Related Topic