Thursday, October 14, 2004

Article - Really random numbers

Michael Silk ( version: 1.0.2, released on: 16th of August, 2004 )

Random numbers are used in almost all encryption algorithms to generate encryption keys and random pad data. Unfortunately, the random number generators (RNG's) provided by most programming SDK's do not provide adequate security.


Goal
To provide a system to obtain "real" random numbers and use them for our security purposes.


Background
Random number generators provided by your typical programming SDK will use the current time as a "seed" to an algorithm which will produce random-looking results. The combination of a seed plus an algorithm to produce such results is known as a pseudorandom number generator (PRNG).

The downside to these generators is the seed. In this case, time. The time is a very predictable entity, and hence not appropriate for any security application. Recognising the predictability of the time, these PRNG's have been improved and modified to gather seed data from a variety of sources. These mulit-source gathering PRNG's are known as "Secure" RNG's due to the proposed unpredictability of their inputs.

All the input to these generators, however, comes from your computer. Due to this, these algorithms can become predictable, depending on the environment.


The Solution
In response to this concern, websites have been created that provide real random numbers on request. By real I mean numbers generated by some (as yet) unpredictable method. The sites I will discuss in this article are:

All of these sites contain in-depth descriptions on how their particular bytes are generated so I will not repeat that information here. I do suggest, however, that before you use the classes provided in this article you take a look at each of the sites and see for yourself the methods they use.

Note: All code provided in this article is in Java and developed under version "1.4.2_03".

Both "Random.org" and "HotBits" provide bytes over HTTP, however "Quantum Random Numbers" are available only through FTP (and the author of the site has provided an API to access to perform the requests). To encapsulate this I created an interface, DistributedRandom and one main super-class called HTTPDistributedRandom.

I have implemented subclasses of these and packaged them up as a 'Distributed Random' library; you can download it here: Distributed Random Pack.zip. Note that the pack also includes the library provided by "Quantum Random Numbers" called "libqrng-1[1].1beta.zip".

I have left the classes relatively bare; they only contain methods to get the bytes and do not do any buffering. It would be appropriate to implement a buffering strategy within, or around, these classes depending on your usage patterns. Please also note that there are some important notes in the source code (prefixed by "note:") so please read them before using them.

Before we plan on using the numbers gathered from each source for security purposes, we need to identify some risks associated with this approach:

  • The source may suffer some sort of failure and no longer produce random numbers
  • The source may be logging the numbers it generates, and to whom
  • The source may be compromised and sending specially crafted numbers

In response to these risks on an individual source I created a class called ReallyRandom. This class attempts to gather a request for bytes from the three listed sources (in fact, it takes an array of DistributedRandom's so it can be any number of distributed sources) and then combines them. The combination method used is a simple XOR of each byte from each source.

The XOR operation gives us a byte array whose contents cannot be known to any of the sources unless they have the bytes generated from other sources. This means that if one source (or two, or three) are logging the bytes that they generate they still cannot discover our resulting byte array. Of course, if all the sites are in cahoots with each other, then we may be in trouble :)

While this combination protects us from the possibly-evil purposes of the sources, it also protects against one source occasionally distributing "bad" random numbers. If a source, for one reason or another, happens to have some form of failure the other sources will compensate.

That comment leads to the question of: How do you identify "bad" random numbers?. Well, luckily there is a nice tool (with full source) provided by the author of "HotBits". The program is called "ent" and you can get it here: http://www.fourmilab.ch/random.

Depending on your environment it may be appropriate to pass the numbers that you generate through the tool (or your own implementation of the algorithms) to see if the results you are getting are truly random. If a certain result is not random, it could indicate a substantial failure in the systems and you could discard that set of bytes and try again or stop processing completely, or whatever is appropriate for your situation.


Problems
Although it is a nice solution to have your random bytes delivered via the these sites there are problems. Most obvious is that none of the sites (currently) provide access to the bytes via SSL. Hence, if your attack model contains an adversary from within your network who would be able to view the bytes as they come in then this solution is not for you :)

Another risk with a local attacker is that of DNS lookup. If your local DNS server was modified to point the various domain names to "evilhq"'s IP address then you are in trouble again.


Conclusion
We have successfully found a method of gathering up real random numbers in preference to the pseudorandoms provided by most of our programming SDK's.

However we have realised that this may not be an acceptable solution if we are concerned about a local attacker.

In the end it is up to you to decide if a distributed source of randoms is an appropriate solution and if the risks outlined apply to your situation; but if you are prepared to accept the risks then these free services may be the perfect solution.

Please note, however that these sites (at least Random.org and HotBits) do not provide unlimited bytes. HotBits imposes a 24-hour limit of an undisclosed number of bytes (discussion of this is in the source code) while Random.org requests we check a buffer to see if it is greater than 20% full before we perform a request. So please, don't overload these free services with requests for bytes.

Feedback? Contact me here.