Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Computer Random vs. True Random : See the difference in how random numbers are generated. (boallen.com)
39 points by e1ven on May 20, 2008 | hide | past | favorite | 31 comments


it's obvious the PHP's rand() function is using the operating system's random number generator, because as the results from Window's are awful (the image doesn't change when re-running the script), the results from the Linux environment are far superior.

Chances are, the Windows version not changing is a feature, not a bug. Random number generators require a seed. The same seed always provides the same stream of pseudo-random numbers. This is useful if you're, say, debugging a simulation that depends on pseudo-random numbers.

I don't do Windows system programming, but I'll be there's a way to change the seed, and then change the stream of numbers. So, I find it more likely that the people who implemented php on Windows were using the generator incorrectly, if the intention was different number streams.


This is exactly right. The windows implementation of ANSI C89 rand() was done with a 16 bit linear congruential generator back in the win16 days, and left unchanged for compatibility reasons. It's really not an insane choice on their part. Lots of applications really do rely on PRNG repeatability, and you wouldn't want a platform upgrade to interfere with that.

The real lesson is that applications with solid randomness requirements need to be relying on their own generators (there are lots to choose from) or an OS-supplied RNG (e.g. /dev/urandom) and not ANSI C, which is underspecified.


No it depends on the language implementation, and in the PHP documentation you can read that at every execution the seed is automatically changed. So while in C you may need srand(time(NULL)) in order to get a different set of pseudo random numbers for every program execution in PHP you need srand(CONSTANT) in order to get always the same set.

Conclusion: PHP is buggy on windows if it does not change the seed at every execution like it is stated in the specification (documentation actually that == specification for PHP).


generally you prime the generator with the current system time


Pay close attention to the bit about being generated on a WAMP system. This is clearly an issue with PHP using a faulty random-number generator on the Windows platform, and not a general fault that necessarily exists in all programming languages' random number generators. (Just for the record: Windows has a number of random number generator functions in its cryptographic libraries, some of which are on-par with those that ship with *nix. It's just that PHP on Windows isn't using them the right way)


Yes, glibc in particular uses a very nice PRNG with 992 bits of internal state and good randomness of high and low bits, though it is not cryptographically strong like /dev/urandom. I don't know what one gets on Windows, but from the graph it looks like it's got some tricky squaring thing in it that turns out not to be very good.


"There are some tricks to improve the randomness of the computer generated numbers... For instance, you could have the program sleep for a randomly generated amount of milliseconds before grabbing the next random number... or add, subtract, divide, and multiply the random numbers with other random numbers."

I'm no PRNG expert but I think I read somewhere that those techniques wouldn't significantly improve the quality of pseudo-randomness, if at all. You can't mix 2 bad wines to get a good one.


If you just want a PRNG for a game, then sending the output of a bad PRNG through a hash function like MD5 will almost always give you good results. That would've been better advice. Also, the mouse is a readily available source of randomness which can also suffice for a game. As far as PRNGs for cryptographic purposes, that's something else entirely.

(By conincidence, I am going to be writing a game which uses cryptographically secure algorithms to generate random numbers for a procedurally generated game world. (A Galaxy of 400 Billion Stars!)


If the PRNG is really bad, then even this won't work. But PRNGs that bad are generally deliberately made that bad.


I had glanced over that paragraph, and indeed, it's wrong. One, the stream only changes if the seed is dependent on the system time. Two, if you modify the result with another pseudo-random number, then you have a function whose randomness if as only as good as its inputs.


Is there really anything that is truly random? I've always wondered about this. If you define random as meaning "when rand() is called many times in a row, you will most probably get a different number each time" then both the Perl function and random.org are random, but to different extents.

But if you define random to mean something like "caused by unexplained or mysterious forces" or "non-deterministic", then how can anything in the universe be truly random? Even atmospheric noise is created by a sequence of deterministic events, right?

A dice roll is too (brain fires synapse, I move my hand forward and release the dice, dice interact with air in certain way, dice hit table and F=ma predicts their movements, etc...). Maybe since we don't have the facilities yet to watch a person throw some dice and predict what the outcome will be, this process is non-deterministic simply because we lack the technology to sketch it out. But with sufficient technology to scan the situation and chart the end result of the dice roll based on the initial conditions, doesn't the process become completely deterministic, and not random?

But maybe I'm just stretching the definition of random to mean something that it was never meant to mean.


Statistical randomness is pretty well defined. Say you have a set of independent and identically distributed (iid) random numbers, then if you plot a long run histogram it should resemble the probability density function of the distribution. Irrespective of whether it was generated by a pseudo-random algorithm or by quantum disturbances, your set of numbers is statistically random in a practical sense.

Your question seems more philosophical, though I think at the quantum level things become more or less random whichever way you think about it. Furthermore, Heisenberg's uncertainty principle suggests that we'll never have sufficient technology to measure both the position and momentum of a particle, which I would say is a pre-condition to observing a deterministic process at work on the quantum scale.


You are in a state of sin :-) Don't talk about random sequences, talk about random generators. There's an unambiguous definition of "true random number generator" in terms of probabilities, joint probabilities etc., but there's no way to determine if a given string of numbers is random, and in fact no definition of a "random string of numbers" other than "the output of a random number generator".


I am reminded of Borges's "Pierre Menard, Author of the Quixote" (http://www.coldbacon.com/writing/borges-quixote.html) (and what do you know, Borges even slips George Boole into it).

"He did not want to compose another Quixote —which is easy— but the Quixote itself. Needless to say, he never contemplated a mechanical transcription of the original; he did not propose to copy it. His admirable intention was to produce a few pages which would coincide—word for word and line for line—with those of Miguel de Cervantes."


A more informed and informative source on the subject:

http://www.schneier.com/blog/archives/2006/06/random_number_...


People have been generating this kind of graph of years, often with more meaningful techniques. For example, you can get another dimension of information by looking at the difference between one number and the last one generated, and plotting it on the z- or color axis.


However, even that isn't enough to measure true randomness. What this article doesn't mention is that you quickly run into trouble trying to visualise non-random behaviour in RNGs. At the higher end, the analysis isn't quite so easy.


Yeah, you need actual math to get anywhere. The pictures are mostly for prettiness and visual proof of non-randomness. But even with math you aren't measuring "true randomness". The best you can do is to confirm statistical properties that random information would also have. It's impossible to tell in general whether a string is "true randomness" or, say, a sequence of bits from pi.


FFT and Chi-square tests, simple bits/bytes distributions are all more interesting tests probably but actually a good generator will pass all this tests while being predictable. The only way to check if a PRNG is of cryptographically use is to analyze the way and sources used to generate the output.


As the author points out, php's rand function is garbage on some systems. mt_rand (based on the Mersenne twister) has been the suggested replacement for many years; it's built in, fast, and consistent across platforms.


Indeed, the first thing I thought was "hey that's PHP! Oh, did he just use rand()?" It ought to be common knowledge in the PHP world that you always use mt_rand().

On OSX, the results aren't very noticeable, probably because the underlying libs are higher quality than Windows, but mt_rand() definitely produces something comparable to his random.org image.

Not sure why rand() isn't simply replaced with mt_rand() under the hood, since improving randomness will hardly break b/c... Maybe PHP9 will have that ;)


Any reputable gaming site uses an Intel hardware random number generator.


It's an interesting article and comparison, but (in general) I think people tend to obsess a bit too much over "true" randomness. I am not worried about my random dice having a seemingly repetitive pattern. If I am flipping a coin or even rolling dice, I might observe certain reoccurring patterns in the results of the coin flip -- perhaps out of sheer randomness.


The right question to ask is whether a source is random enough for your purposes. The weaker source might be ok for a game, but the patterns in the visualization were certainly not caused by chance. That level of predictability would not be ok for crypto.


Games are surprisingly demanding when it comes to random number generation. Especially in the realm of computer graphics, the human brain is excellent at picking out underlying patterns which can result from insufficiently random data. Even with a good generator, mishandling of the results can ruin things easily.


For comparison:

rand() on Linux: http://www.therealnewsfeed.com/randimg.php?fn=rand

mt_rand() on Linux: http://www.therealnewsfeed.com/randimg.php?fn=mt_rand

To me, at least, these are pretty indistinguishable from the "true random" image, so it seems like the bigger issue here is the faulty rand() on Windows.


Excellent visualization!

It would appear that there is some kind of flaw in the pseudo-random generator used, since once can see definite patterns in the output.


Any pseudorandom stream generator is periodic, with the period at best proportional to 2^n, where n is the number of bits in the internal state of the generator. But you're right that the period in the example seems way too small. A correctly chosen 32-bit linear-feedback shift register (LFSR), would have a period of 2^32 (about 4 billion), which would be hard to see.


Yeah, you can see it has a small period from the presence of a repeated pattern. I think if he had used the Mersenne twister algorithm the output would be indistinguishable.


That would be a cool project-- to compare the patterns for various pseudo-random number generator algorithms...


Couldn't you see the grid in the first one? It's in the middle, you can tell by the density. There are 9 sectors in it.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: