I've read in numerous sources that the output of PHP's rand() is predictable because it's a PRNG, and I mostly accept that as fact simply because I've seen it in so many places.
I'm interested in a proof-of-concept: how would I go about predicting the output of rand()? From reading this article I understand that the random number is a number returned from a list starting at a pointer (the seed) — but I can't imagine how this is predictable.
Could someone reasonably figure out what random number was generated via rand() at a given moment in time within a few thousand guesses? Or even 10,000 guesses? How?
This is coming up because I saw an auth library which uses rand() to produce a token for users who have lost passwords, and I assumed this was a potential security hole. I've since replaced the method with hashing a mixture of openssl_random_pseudo_bytes(), the original hashed password, and microtime. After doing this I realized that if I were on the outside looking in, I'd have no idea how to guess the token even knowing it was an MD5 of rand().
Best Answer
The ability to guess the next value from `rand` is tied to being able to determine what `srand` was called with. In particular, seeding `srand` with a predetermined number results in predictable output! From the PHP interactive prompt:

This isn't just some fluke. Most PHP versions* on most platforms** will generate the sequence 97, 97, 39, 77, 93 when `srand`'d with 1024.

To be clear, this isn't a problem with PHP; it's a problem with the implementation of `rand` itself. The same problem appears in other languages that use the same (or a similar) implementation, including Perl.

The trick is that any sane version of PHP will have pre-seeded `srand` with an "unknown" value. Oh, but it isn't really unknown. From `ext/standard/php_rand.h`:

So, it's some math with `time()`, the PID, and the result of `php_combined_lcg`, which is defined in `ext/standard/lcg.c`. I'm not going to copy and paste it here, as, well, my eyes glazed over and I decided to stop hunting.

A bit of Googling shows that other areas of PHP don't have the best randomness generation properties, and calls to `php_combined_lcg` stand out here, especially this bit of analysis:

Yeah, that `uniqid`. It seems that the value of `php_combined_lcg` is what we see when we look at the resulting hex digits after calling `uniqid` with the second argument set to a true value.

Now, where were we?
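Before picking the thread back up, here is a minimal sketch of the predictability claim from the start of this answer. It assumes PHP 7 or later, and the `rand(1, 100)` range and the `seededSequence` helper are my own choices for illustration, not taken from the original prompt session. The actual numbers differ between PHP versions and platforms, so the sketch only checks that the same seed gives the same sequence twice.

```php
<?php
// Hypothetical helper: seed rand() with a known value, then draw $count numbers.
function seededSequence(int $seed, int $count): array
{
    srand($seed);                  // a fixed, known seed
    $values = [];
    for ($i = 0; $i < $count; $i++) {
        $values[] = rand(1, 100);  // range chosen only for this illustration
    }
    return $values;
}

$first  = seededSequence(1024, 5);
$second = seededSequence(1024, 5);

echo implode(', ', $first), PHP_EOL;
echo implode(', ', $second), PHP_EOL;

// On an unpatched PHP (no Suhosin re-seeding) both runs should be identical.
var_dump($first === $second);
```

Anyone who knows (or can guess) the seed can run the same few lines locally and read off the values the target will produce.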
Oh yes. `srand`.

So, if the code you're trying to predict random values from doesn't call `srand`, you're going to need to determine the value provided by `php_combined_lcg`, which you can get (indirectly?) through a call to `uniqid`. With that value in hand, it's feasible to brute-force the rest of the value: `time()`, the PID and some math. The linked security issue is about breaking sessions, but the same technique would work here. Again, from the article:

Just replace that last step as required.
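To give a feel for the brute-force step, here is a toy sketch. It is not the `uniqid`/session attack from the article. It assumes a much simpler situation: the attacker has seen a few consecutive `rand(1, 100)` outputs and knows the seed lies in a small window, such as a rough timestamp. All function names and the numbers used are hypothetical.

```php
<?php
// Hypothetical helper: the first $count outputs of rand(1, 100) for a given seed.
function firstOutputs(int $seed, int $count): array
{
    srand($seed);
    $out = [];
    for ($i = 0; $i < $count; $i++) {
        $out[] = rand(1, 100);
    }
    return $out;
}

// Try every seed in [$low, $high]; keep those that reproduce the observed outputs.
function recoverSeeds(array $observed, int $low, int $high): array
{
    $matches = [];
    for ($seed = $low; $seed <= $high; $seed++) {
        if (firstOutputs($seed, count($observed)) === $observed) {
            $matches[] = $seed;
        }
    }
    return $matches;
}

// Re-seed locally, skip the values already seen, and return the next one.
function predictNext(int $seed, int $alreadySeen): int
{
    srand($seed);
    for ($i = 0; $i < $alreadySeen; $i++) {
        rand(1, 100);
    }
    return rand(1, 100);
}

// "Victim": a seed the attacker does not know exactly (made-up value).
$secretSeed = 1300000000 + 4321;
$observed   = firstOutputs($secretSeed, 8);

// "Attacker": search a 10,001-seed window around the guessed time.
$candidates = recoverSeeds($observed, 1300000000, 1300000000 + 10000);
echo 'Candidate seeds: ', implode(', ', $candidates), PHP_EOL;

// Compare the prediction with what the victim's generator would give next.
$actualNext = firstOutputs($secretSeed, 9)[8];
var_dump(predictNext($candidates[0], 8) === $actualNext);
```

With eight observed values out of a range of 100, a false match inside a window this small is extremely unlikely, so the search normally leaves just the real seed. A real attack has a much larger search space and has to account for how the application turns `rand()` output into a token, but the loop has the same shape.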
(This security issue was reported in an earlier PHP version (5.3.2) than we have currently (5.3.6), so it's possible that the behavior of `uniqid` and/or `php_combined_lcg` has changed, so this specific technique might not be workable any longer. YMMV.)

On the other hand, if the code you're trying to predict calls `srand` manually, then unless they're using something many times better than the result of `php_combined_lcg`, you're probably going to have a much easier time guessing the value and seeding your local generator with the right number. Most people that would manually call `srand` also wouldn't realize how horrible of an idea this is, and thus aren't likely to use better values.

It's worth noting that `mt_rand` is also afflicted by the same problem. Seeding `mt_srand` with a known value will also produce predictable results. Basing your entropy off of `openssl_random_pseudo_bytes` is probably a safer bet.

tl;dr: For best results, don't seed the PHP random number generator, and for goodness' sake, don't expose `uniqid` to users. Doing either or both of these may cause your random numbers to be more guessable.

Update for PHP 7:
PHP 7.0 introduces `random_bytes` and `random_int` as core functions. They use the underlying system's CSPRNG implementation, making them free from the problems that a seeded random number generator has. They're effectively similar to `openssl_random_pseudo_bytes`, only without needing an extension to be installed. A polyfill is available for PHP 5.

*: The Suhosin security patch changes the behavior of `rand` and `mt_rand` such that they always re-seed with every call. Suhosin is provided by a third party. Some Linux distributions include it in their official PHP packages by default, while others make it an option, and others ignore it entirely.

**: Depending on the platform and the underlying library calls being used, different sequences will be generated than documented here, but the results should still be repeatable unless the Suhosin patch is used.