Problem description:

I have been reading various articles about random numbers and their generators. There are usually 3 important conclusions that I draw from them:

- Random numbers are not truly random
- Much of the time they have a bias (modulo bias)
- Humans are incapable of being random number generators, when they are trying to "act randomly"
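
To make the modulo bias point concrete, here is a minimal sketch in Python (my choice of language; the helper name `modulo_bias_counts` is made up for illustration). It assumes a source that yields each of the 256 byte values equally often and maps them onto 0-9 with `%`:

```python
from collections import Counter

def modulo_bias_counts(source_range: int, target_range: int) -> Counter:
    """Count how often each result occurs when every value in
    range(source_range) is reduced with % target_range."""
    return Counter(value % target_range for value in range(source_range))

# Hypothetical setup: one perfectly uniform byte (0..255) mapped onto 0..9.
counts = modulo_bias_counts(256, 10)
for result in range(10):
    print(result, counts[result])
# Results 0-5 occur 26 times each, results 6-9 only 25 times each,
# because 256 is not a multiple of 10.
```

The bias disappears only when the source range is an exact multiple of the target range, which is why libraries usually reject and redraw out-of-range values instead of using a bare `%`.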

So, with the last of these observations in mind:

- How can we tell whether a sequence of numbers that we see is truly random, and more importantly
- Is there some way we can prove that said sequence is really random?

I'm tempted to say that so long as you generate a sufficiently large sample set (1,000,000+ values), you should see a more or less uniform distribution of (pseudo)random numbers. However, I'm sure some maths genius has a way of discrediting this, because by the laws of probability a run of one repeated number is surely just as likely as any other specific sequence.
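
A rough sketch of why a uniform spread on its own proves little, assuming Python's `random` module as the generator under test and a hand-rolled chi-square statistic (the function name and the seed are arbitrary choices for illustration). A plain counter that cycles 0, 1, 2, ... has perfectly uniform frequencies and is obviously not random:

```python
import random
from collections import Counter

def chi_square_uniform(samples, num_symbols):
    """Chi-square statistic of the observed symbol counts against a
    uniform expectation; smaller means closer to perfectly uniform."""
    counts = Counter(samples)
    expected = len(samples) / num_symbols
    return sum((counts[s] - expected) ** 2 / expected
               for s in range(num_symbols))

n = 1_000_000
rng = random.Random(12345)  # fixed seed only so the run is repeatable
prng_samples = [rng.randrange(10) for _ in range(n)]
counter_samples = [i % 10 for i in range(n)]  # 0, 1, ..., 9, 0, 1, ...

prng_stat = chi_square_uniform(prng_samples, 10)
counter_stat = chi_square_uniform(counter_samples, 10)
print("PRNG:   ", prng_stat)
print("counter:", counter_stat)  # exactly 0.0: "too uniform" to be random
```

With 10 symbols there are 9 degrees of freedom, so a statistic somewhere around 9 is what a random source typically produces. A value of exactly 0 is itself suspicious, so a frequency check can only reject a generator, never prove it random.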

From what I have read, if you really need random numbers, it's best to reuse what cryptographic libraries use. The field of cryptography is obviously complex and relies on random numbers for key generation. The section of OWASP's guide titled "Reversible Authentication Tokens" says this:

The only way to generate secure authentication tokens is to ensure there is no way to predict their sequence. In other words: true random numbers.

It could be argued that computers can not generate true random numbers, but using new techniques such as reading mouse movements and key strokes to improve entropy has significantly increased the randomness of random number generators. It is critical that you do not try to implement this on your own; use of existing, proven implementations is highly desirable.

Most operating systems include functions to generate random numbers that can be called from almost any programming language.

My take is that unless you're coding cryptographic libraries yourself, put your trust in those who are (e.g. use the Java Cryptography Extension) so you don't have to prove it yourself.
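
As an illustration of leaning on an existing implementation, here is a short sketch using Python's standard-library `secrets` module, which draws from the operating system's random source. This stands in for the Java library mentioned above; the token length and the 1-100 range are arbitrary choices for the example:

```python
import secrets

# 16 random bytes from the OS source, rendered as 32 hexadecimal characters.
token = secrets.token_hex(16)

# Uniform integer in 1..100; randbelow(100) returns a value in 0..99.
roll = secrets.randbelow(100) + 1

print(token)
print(roll)
```

Nothing here is seeded or reproducible by design, so it suits tokens and keys rather than simulations that need repeatable runs.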

Pretty Simple Test:

If you really want to get into testing random numbers, you could write a program that outputs, say, 100 random numbers in the range 1-100. Then look at those numbers and see if there are any patterns. Follow that test by restarting the program several times and repeating the process. Examine all the data to figure out whether the numbers are always random, random only within individual runs, or never random. :P
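
A minimal sketch of that experiment, assuming Python's `random` module and simulating a "restart" by building a fresh generator (the function name `run_program` and the seed 42 are invented for the example). It shows the case where each run looks random on its own but every restart repeats the same numbers:

```python
import random

def run_program(seed=None):
    """Simulate one program run: 100 numbers in the range 1..100.
    seed=None lets Python seed the generator from the OS or the clock."""
    rng = random.Random(seed)
    return [rng.randint(1, 100) for _ in range(100)]

first = run_program(42)
second = run_program(42)   # "restart" with the same fixed seed
third = run_program()      # "restart" with a fresh seed

print(first[:10])
print(first == second)     # True: a fixed seed replays the same sequence
print(first == third)
```

Comparing whole runs against each other is what catches a generator that is only "random during individual tests".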

How you test a random number generator mostly depends on what you want to look for. Even pure non-repeatability is no guarantee of randomness.

There are some companies that will test a random number generator for the purposes of certification (e.g. online casinos). One that I found quickly is called iTech Labs, though their testing methodology page leaves a lot to be desired in terms of technical detail.

Other testers and certification bodies publish the required data for a certification; there's more specific detail here but not as much as you want.

You could potentially do a statistical analysis and compare the results of your random number generator to a "true" random source, but it could be argued that mapping the true random source onto your range of possible values introduces a bias of its own.

Randomness tests verify the mathematical properties of the sequence, for example symbol frequencies (all symbols are expected to occur equally often), local variance, and sequence analysis (the probability of a symbol must not depend on the previous ones). A definite proof does not exist, but there is a quality factor: an estimate of how likely the sequence is to really be random. Another criterion could be based on compressibility: true randomness has maximum entropy and therefore cannot be compressed. This test is not a reliable proof of randomness, of course, but it allows quick and dirty testing with ready-made tools such as zlib.
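
The compressibility check can be sketched in a few lines with Python's `zlib` module. The sample size and the three inputs are my own choices for illustration: bytes from `os.urandom`, a repeating 0-9 pattern, and the output of a seeded (fully predictable) pseudorandom generator:

```python
import os
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (about 1.0 = incompressible)."""
    return len(zlib.compress(data, 9)) / len(data)

n = 100_000
os_bytes = os.urandom(n)                       # OS random source
patterned = bytes(i % 10 for i in range(n))    # 0, 1, ..., 9 repeated
rng = random.Random(1)                         # predictable: fixed seed
seeded = bytes(rng.getrandbits(8) for _ in range(n))

print("os.urandom:", compression_ratio(os_bytes))
print("patterned: ", compression_ratio(patterned))
print("seeded:    ", compression_ratio(seeded))
```

The patterned data shrinks to a tiny fraction of its size, while both the OS bytes and the seeded generator's output stay at roughly their original size. That last result is the unreliability in action: zlib cannot tell a reproducible pseudorandom stream from a genuinely unpredictable one.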