For the past few days I have been exploring a little bit more the generation of pseudo-random numbers in Lisp. In the past I have done a lot of programming where this was something not to overlook (e.g., evolutionary algorithms). However, since it was mostly stuff to be used by me, I didn’t spend time in trying to find ways to make the random generator component more portable or easy to use. The standard Common Lisp functions on this aspect were more than enough. But I must say they never let me completely happy. The main reason I guess was that I couldn’t provide a seed, just like the other “popular languages” used at the lab, by friends and colleagues. Since now I am decided to implement a more general library for EC and related stuff, I decided to explore other options to use. And at the same time, to explore more of the Lisp world, mainly in terms of libraries.
But before that, let just review what Common Lisp has to offer us. If we don’t need anything very fancy or that requires special care, the standard functions and variables are more than enough. To generate a pseudo-random number we just need to call random with a limit. This limit can be a positive integer or a floating-point number:
CL-USER> (random 100)
CL-USER> (random 1.0)
The result is always a number of the same type between zero (inclusive) and the limit (exclusive). The resulting numbers follow approximately an uniform distribution. For example, if the limit is an integer, the possible results follow a 1/limit probability. The only other thing to know about random is that it has an optional argument. We could think it might be a way to set the seed for the random number generator but it is not quite that. The optional argument in random is a state, an object of type random-state. Within these objects we can find the information that maintains the state of the pseudo-number random generator, since it encodes its internal state. The default current state is kept in the global variable *random-state*. Each time random is called without an explicit state (like in the previous examples) it produces a side effect that changes *random-state*.
When we wish to control the generation process, we can proceed in different ways. For instance, it’s possible to make a copy of the default state and use it later to generate the same sequence of numbers. Other ways might include the creation of several objects of type random-state and use them to generate the numbers. In order to make copies or create new state objects we use the function make-random-state. This function has one optional argument. If the argument is an object of type random-state, it returns a fresh copy of that state. If it’s nil, it returns a copy of the current random state. Finally, if it’s t the it returns a new object.
To copy a random-state object, e.g., *random-state*:
CL-USER> (let ((state-copy (make-random-state nil))) (equalp state-copy *random-state*)) T
Some basic usage of make-random-state:
CL-USER> (let ((new-state (make-random-state t))) (= (random 100 new-state) (random 100))) NIL CL-USER> (let ((current (make-random-state nil))) (= (random 100 current) (random 100))) T
The important aspect to notice here is that, when we call make-random-state with t, the new state is initialized randomly in some way, such as the time clock, process id, etc. As I mentioned, this is the part that I never quite like it since we cannot provide a seed number. As explained in Common Lisp The Language, 2nd Edition:
The reason for this is that the number of bits of state information in a random-state object may vary widely from one implementation to another, and there is no simple way to guarantee that any user-specified seed value will be “random enough.”
Because of this, the initialization of a new random-state object is left to the implementation. So, as previously said, the only way to execute a program, or function, that uses random many times in a reproducible way is to explicit control these random-state objects. For example, if we want to generate the same sequence of numbers we can do the following:
CL-USER> (let* ((state (make-random-state t)) (copy (make-random-state state))) (list (loop repeat 10 collect (random 100 state)) (loop repeat 10 collect (random 100 copy)))) ((26 97 90 80 29 61 30 85 56 19) (26 97 90 80 29 61 30 85 56 19))
If in the second loop the call to random would use the same state or the default one, the generated sequence would be different. The not so great aspect of this process is when we wish to execute a program several times. In short, this implies that we need to save the random-state in a file using print. Each time the program is run, we read it, create a copy of the random-state object from the printed representation and initialize the generator with it.
Saving the random-state:
CL-USER> (with-open-file (stream "random-state.txt" :direction :output :if-exists :supersede) (let ((state (make-random-state t))) (print state stream) (loop repeat 10 collect (random 100 state)))) (47 83 52 6 54 47 52 4 75 76)
Loading the saved state to replicate results:
CL-USER> (with-open-file (stream "random-state.txt" :direction :input) (let ((state (read stream))) (loop repeat 10 collect (random 100 state)))) (47 83 52 6 54 47 52 4 75 76)
Let’s imagine we want to run several runs, for example 30, of an evolutionary algorithm. To be able to replicate those 30 runs we need to create and save 30 random-state objects, use each one for a single run and every time we need to replicate the results we just load the 30 random-state objects. Simple as that!
The other important factor is the generator itself, mainly, what algorithm is used to generate the pseudo-random numbers. And that also implementation dependent. Most of the standard generators implemented are not great for the purposes of scientific work and researchers usually look for external libraries or implementations of good generators. Especially if we are talking about languages which are not Common Lisp. A little because of that, I must confess I didn’t pay to much attention to what algorithm Lisp implementations use. In EC computing (and probably most other areas) prefer to use the Mersenne Twister (MT) generator because it’s fast and produces good quality pseudo-random numbers. I was aware that CMUCL implemented it but for some strange reason, I assumed that SBCL didn’t (which is a weird assumption I must say). Anyway, I was glad to confirm that SBCL also uses MT! One of the reasons I started to look at GSL/GSLL was precisely because of the generators. GSL/GSLL has lots of generators to choose from and that is by itself a very good thing but it will be interesting to compare their MT and SBCL’s one. But that’s for another post.