Jorge Tavares weblog

Starting with Haskell

with 4 comments

haskell-logo-nobg

Around ten years ago I learned to program in Lisp, Common Lisp to be more precise. And since then I believe I never actually learned a new language that impressed me, and excited me, so much as was the experience of learning Lisp.

Well, yesterday I started to learn more seriously Haskell. For quite some time my interest in the language has been increasing. From several blogs posts, comments and discussions on the web, it is almost impossible not to be curious. Yes, I’ve heard of Haskell for a very long time but it seems the buzz on my ears raised a lot recently. And so, I’ve decided to take a look. In the worst case scenario, it would a be just a few hours of playing around with a new language.

Up until now, I didn’t do much. Installed the Haskell platform, haskell-mode in Emacs (although I ended up using Vim too to code) and searched for some basic tutorial. My starting choice is “Learn You a Haskell for Great Good!”. It looks nice. Seems suitable for a beginner and starts in a simple way. The first real chapter is the second one, which I just did yesterday. I was very impressed with it. And perhaps more important, very happy with the experience. To be honest, I don’t remember a similar feeling with other languages… except Lisp! Although this “Starting Out” chapter introduces just some very basic Haskell stuff, it is already a lot to feel the language and the potential of it. The last example was very cool. Suppose you want to find the right triangle that has integers equal or smaller than 10 for all sides and has a perimeter of 24. With tuples, list comprehensions it’s just a matter of writing:

[(a,b,c) | c <- [1..10], b <- [1..c], a <- [1..b], a^2 + b^2 == c^2, a + b + c == 24]

Simple, concise and very neat! I still didn’t do anything proper with Haskell. I am just in the very very very beginning but it looks promising. I could be wrong about it sure, but the feeling is nice.

Also, recently, I started to take a look at Clojure, another Lisp. I had some hopes that would be a very nice experience. Unfortunately, it was not. For some reason, I am unable to appreciate the “beauty” of a JVM language. But let’s see how Haskell will fare when I start to try it with the kind of stuff I’m interested in.

Written by Jorge Tavares

October 16, 2009 at 14:26

Posted in Programming

Tagged with

More random number generators in Lisp: using GSLL

with 3 comments

In my previous post, I wrote an overview of this simple, but important, component which is the generation of pseudo-random numbers, using Common Lisp. However, sometimes you need a little bit more. Naturally, an implementation only offers one generator and no standard facilities exist to change. If you want a different generator the only options are to do it yourself or to use one from a library. Depending on one’s person needs, implementing our own generator might be or not an option. And using a library too. First, there are not many libraries containing different generators (at least, to the best of my knowledge). We can find code of other generators in Lisp but that’s very different from having a good, portable and complete library with everything (or most) we might need. As such, as far as I know, the best option is to use GSLL which allows us to use in Lisp the many things GSL has. GSL provides a unique set of pseudo-random numbers generators with the big advantage that they can be used later with random number distributions.

How can we use this facility that GSL/GSLL gives us? Well, first we need to install both libraries, as said before. As soon as everything is up and running, working with the library is not difficult. So far, I am not aware of a manual/tutorial for the Lisp side of GSL but as the website advises, we first look at the GSL documentation the name of the function that we want, then we use both gsl-lookup and documentation functions to learn about the Lisp equivalent. Looking at the examples might be also very useful. However, here I will focus just on the Lisp functions.

The first thing we need is to create an object that contains our pseudo-random generator. The object is named random-number-generator and we can create it using make-random-number-generator. This function requires an argument indicating the type of the generator. This is one of the big pluses of using GSLL. We can create several objects with different generators and the list provided by GSL is large (there is a function all-random-number-generators but it only returns the pointers, not the Lisp constant symbols). But for the moment, let’s focus on the essentials and assume that we also want a Mersenne Twister generator. This type of generator is represented by the constant +mt19937+. For example, if we want to define *rng* as a global variable that contains our generator from GSLL we only need to do this :

GSL> (defparameter *rng* (make-random-number-generator +mt19937+))
*RNG*

The variable *rng* now holds our generator. Anyway, this just creates our generator with the default seed. The default seed is contained in the constant +default-seed+ and defaults to the value of zero. In the previous example, the call might have been written like (make-random-number-generator +mt19937+ +default-seed+). If we want to seed the generator with a different seed value, we only need to add the seed, e.g., 1024, since it is an optional argument:

GSL> (defparameter *rng* (make-random-number-generator +mt19937+ 1024))
*RNG*

Contrary to the standard of Common Lisp, this allows to seed the generators with known seeds in the same way as in other languages. This could be useful if you want to compare and reproduce results with programs coded in languages other than Lisp. Having said that, we need to keep in mind the randomness issue of a user-specified seed.

We can now create pseudo-random generators but we don’t know how to produce the numbers. The basic function is get-random-number which receives as an argument the generator to be used and returns an integer. Although simple to use, the function returns integers from the interval [min,max] which is generator dependent. If we don’t know these bounds we can use the auxiliary functions rng-min and rng-max to obtain them. Using the MT generator with the default seed:

GSL> (rng-min *rng*)
0
GSL> (rng-max *rng*)
4294967295
GSL> (get-random-number *rng*)
4293858116
GSL> (get-random-number *rng*)
699692587
GSL> (get-random-number *rng*)
1213834231
GSL> (get-random-number *rng*)
4068197670

We can’t specify an upper bound like in the standard random function but it’s easy to make a function using get-random-number where we can set the limits from which we want pseudo-random numbers. Lucky for us, there is another way to produce our numbers by using the function sample. The major difference from get-random-number is that sample can produce floating point numbers and integers. In the first case, sample can generate numbers between 0.0 and 1.0 (exclusive). There is also the possibility to include or not 0.0. In the second case, sample generates integers between 0 and an upper bound given by a parameter (:upperbound). With this in mind, to decide how we want sample to behave the control flag must be one of the following: uniform, uniform>0 and uniform-fixnum. Examples:

Generating numbers between [0.0, 1.0):

GSL> (sample *rng* 'uniform)
0.23165654274635017d0
GSL> (sample *rng* 'uniform)
0.48497361433692276d0

Generating numbers between (0.0, 1.0):

GSL> (sample *rng* 'uniform>0)
0.9574769565369934d0
GSL> (sample *rng* 'uniform>0)
0.7443053431343287d0

Generating integers between 0 (inclusive ) and 100 (exclusive):

GSL> (sample *rng* 'uniform-fixnum :upperbound 100)
54
GSL> (sample *rng* 'uniform-fixnum :upperbound 100)
73

These are the main operations with random number generators using GSLL. As seen, the main advantage is to be able to create different types of generators and having an easier way to control the seed. There are other functions that operate on generators but less interesting. We have two functions that copy the state of a given generator: copy-to-destination and copy-making-destination. In short, these functions make an exact copy of a generator. The first function copies a generator into a given destination with the same type of the source. The second function does not need a destination. Examples:

GSL> (defparameter *rng2* (make-random-number-generator +mt19937+ 1024))
*RNG2*
GSL> (copy-to-destination *rng2* *rng*)
; No value
GSL> (get-random-number *rng*)
2781812667
GSL> (get-random-number *rng2*)
2781812667
GSL> (defparameter *rng3* (copy-making-destination *rng*))
*RNG3*
GSL> (get-random-number *rng2*)
1825252241
GSL> (get-random-number *rng3*)
1825252241

The other remaining functions are name, rng-state, size and get-random-state. The first function returns a string with the name of the generator, rng-state a pointer to the generator state, size the size of the pointer and get-random-state return the complete state of a given generator specified as a vector of bytes. And this is it!

Before concluding this overview of using pseudo-random numbers generators in Lisp using GSLL, just a few thoughts. Using this library seems a good option especially if generating random numbers is important in our programs. The reasons are evident: several generators to use, seed control, and library maturity (GSL). I have no idea of how GSLL affects performance but that is something to be explored and studied.

Written by Jorge Tavares

September 5, 2009 at 20:02

Exploring pseudo-random numbers in Lisp

with one comment

For the past few days I have been exploring a little bit more the generation of pseudo-random numbers in Lisp. In the past I have done a lot of programming where this was something not to overlook (e.g., evolutionary algorithms). However, since it was mostly stuff to be used by me, I didn’t spend time in trying to find ways to make the random generator component more portable or easy to use. The standard Common Lisp functions on this aspect were more than enough. But I must say they never let me completely happy. The main reason I guess was that I couldn’t provide a seed, just like the other “popular languages” used at the lab, by friends and colleagues. Since now I am decided to implement a more general library for EC and related stuff, I decided to explore other options to use. And at the same time, to explore more of the Lisp world, mainly in terms of libraries.

But before that, let just review what Common Lisp has to offer us. If we don’t need anything very fancy or that requires special care, the standard functions and variables are more than enough. To generate a pseudo-random number we just need to call random with a limit. This limit can be a positive integer or a floating-point number:

CL-USER> (random 100)
78
CL-USER> (random 1.0)
0.24247074

The result is always a number of the same type between zero (inclusive) and the limit (exclusive). The resulting numbers follow approximately an uniform distribution. For example, if the limit is an integer, the possible results follow a 1/limit probability. The only other thing to know about random is that it has an optional argument. We could think it might be a way to set the seed for the random number generator but it is not quite that. The optional argument in random is a state, an object of type random-state. Within these objects we can find the information that maintains the state of the pseudo-number random generator, since it encodes its internal state. The default current state is kept in the global variable *random-state*. Each time random is called without an explicit state (like in the previous examples) it produces a side effect that changes *random-state*.

When we wish to control the generation process, we can proceed in different ways. For instance, it’s possible to make a copy of the default state and use it later to generate the same sequence of numbers. Other ways might include the creation of several objects of type random-state and use them to generate the numbers. In order to make copies or create new state objects we use the function make-random-state. This function has one optional argument. If the argument is an object of type random-state, it returns a fresh copy of that state. If it’s nil, it returns a copy of the current random state. Finally, if it’s t the it returns a new object.

To copy a random-state object, e.g., *random-state*:

CL-USER> (let ((state-copy (make-random-state nil)))
           (equalp state-copy *random-state*))
T

Some basic usage of make-random-state:

CL-USER> (let ((new-state (make-random-state t)))
           (= (random 100 new-state) (random 100)))
NIL
CL-USER> (let ((current (make-random-state nil)))
           (= (random 100 current) (random 100)))
T

The important aspect to notice here is that, when we call make-random-state with t, the new state is initialized randomly in some way, such as the time clock, process id, etc. As I mentioned, this is the part that I never quite like it since we cannot provide a seed number. As explained in Common Lisp The Language, 2nd Edition:

The reason for this is that the number of bits of state information in a random-state object may vary widely from one implementation to another, and there is no simple way to guarantee that any user-specified seed value will be “random enough.”

Because of this, the initialization of a new random-state object is left to the implementation. So, as previously said, the only way to execute a program, or function, that uses random many times in a reproducible way is to explicit control these random-state objects. For example, if we want to generate the same sequence of numbers we can do the following:

CL-USER> (let* ((state (make-random-state t))
		(copy  (make-random-state state)))
	   (list
	    (loop repeat 10 collect (random 100 state))
	    (loop repeat 10 collect (random 100 copy))))
((26 97 90 80 29 61 30 85 56 19) (26 97 90 80 29 61 30 85 56 19))

If in the second loop the call to random would use the same state or the default one, the generated sequence would be different. The not so great aspect of this process is when we wish to execute a program several times. In short, this implies that we need to save the random-state in a file using print. Each time the program is run, we read it, create a copy of the random-state object from the printed representation and initialize the generator with it.

Saving the random-state:

CL-USER> (with-open-file (stream "random-state.txt" :direction :output :if-exists :supersede)
	   (let ((state (make-random-state t)))
	     (print state stream)
	     (loop repeat 10 collect (random 100 state))))
(47 83 52 6 54 47 52 4 75 76)

Loading the saved state to replicate results:

CL-USER> (with-open-file (stream "random-state.txt" :direction :input)
	   (let ((state (read stream)))
	     (loop repeat 10 collect (random 100 state))))
(47 83 52 6 54 47 52 4 75 76)

Let’s imagine we want to run several runs, for example 30, of an evolutionary algorithm. To be able to replicate those 30 runs we need to create and save 30 random-state objects, use each one for a single run and every time we need to replicate the results we just load the 30 random-state objects. Simple as that!

The other important factor is the generator itself, mainly, what algorithm is used to generate the pseudo-random numbers. And that also implementation dependent. Most of the standard generators implemented are not great for the purposes of scientific work and researchers usually look for external libraries or implementations of good generators. Especially if we are talking about languages which are not Common Lisp. A little because of that, I must confess I didn’t pay to much attention to what algorithm Lisp implementations use. In EC computing (and probably most other areas) prefer to use the Mersenne Twister (MT) generator because it’s fast and produces good quality pseudo-random numbers. I was aware that CMUCL implemented it but for some strange reason, I assumed that SBCL didn’t (which is a weird assumption I must say). Anyway, I was glad to confirm that SBCL also uses MT! One of the reasons I started to look at GSL/GSLL was precisely because of the generators. GSL/GSLL has lots of generators to choose from and that is by itself a very good thing but it will be interesting to compare their MT and SBCL’s one. But that’s for another post.

Written by Jorge Tavares

September 2, 2009 at 19:17

Starting with scientific programming in Lisp

leave a comment »

Recently I’ve just clean up my computer which in short means a format and a new software installation. I also try to clean up my files and keep things in order but that part is harder to accomplish. And as a side effect I guess my MacBookPro is more prepared for the Snow Leopard upgrade (still need to purchase the box). Anyway, as part of this process and my research work, I’ve decided to explore more some options in terms of libraries and related stuff. The idea is not only to use full libraries/frameworks related to Evolutionary Computing but mainly, to find good scientific tools that might help me in developing the things I know that don exist. And so, the number one library I decided to take a look was the GNU Scientific Library (GSL).

GSL has a lot of stuff, some of which I will probably never need, but a quick look at the documentation revealed something that interest me very much: random number generators. Since I develop on Mac installing the library was pretty straightforward using MacPorts: port install gsl and it was done! A quick test showed that it was ok. Using the first example from GSL documentation:

#include
#include 

int main (void)
{
    double x = 5.0;
    double y = gsl_sf_bessel_J0 (x);
    printf ("J0(%g) = %.18e\n", x, y);
    return 0;
}


Compiling with:

gcc example.c -o example -lgsl -I/opt/local/include -L/opt/local/lib

and the output was J0(5) = -1.775967713143382642e-01. Since GSL is coded in pure C it’s very easy to include it in Objective-C programs if thats the case.

However, since I am a Lisp guy I needed to find if bindings to Common Lisp were available. And since nowadays you can find almost all you want for Lisp, it was not hard to stumble into GSLL (The GNU Scientific Library for Lisp). The library is very complete and allows an easy manipulation of the library functions, even in a interactive manner.

Installing GSSL on the Mac was a little bit more complicated than expected but thanks to the help of Liam Healy, it is now running fine. The main problem was due to the fact that the headers file of libraries on Mac OS are not always on standard places as other unixes systems and so, things might not run at the first time. And in my case the headers were in a different place than what it was assumed. But now if you install GSLL by using clbuild, most likely there won be problems. In essence, it’s just necessary to follow the instructions on the GSLL website in how to install it using clbuild and then:

CL-USER> (asdf:operate 'asdf:load-op :gsll)

To see a list of examples:

CL-USER> (gsll:examples)

And since I am interested in random number generators:

CL-USER> (gsll:examples 'random-number-generator)
((LET ((RNG (MAKE-RANDOM-NUMBER-GENERATOR +MT19937+ 0)))
   (LOOP FOR I FROM 0 TO 10
         COLLECT (SAMPLE RNG 'UNIFORM-FIXNUM :UPPERBOUND 1000)))
 (LET ((RNG (MAKE-RANDOM-NUMBER-GENERATOR *CMRG* 0)))
   (LOOP FOR I FROM 0 TO 10
         COLLECT (SAMPLE RNG 'UNIFORM))))

A few days ago I’ve started to code my own EC library in Lisp and this will be very useful since I will be able to use and control the random number generation process. To be able to use the Mersenne Twister generator is a big plus for me! Anyway, the exploration of GSL and GSLL has just begun!

Written by Jorge Tavares

August 30, 2009 at 18:08

Posted in Programming

Tagged with , , ,

A new beginning

leave a comment »

fctuc

This summer I have returned to Coimbra. Although I still had some more time at MIT, an offer was made to me that I couldn’t refuse and as such, I’ve terminated my postdoc to start a new position. The best of it is the freedom to pursue my interests and define my own line of research. In spite of this, I am still collaborating in some projects with MIT.

Written by Jorge Tavares

August 4, 2009 at 22:04

Posted in Blog

Tagged with

A bit quiet and recent changes

with one comment

My online presence, specifically this blog/homepage has been quiet for the past several months. This was mostly due to my postdoc at MIT and also a little laziness I must add. Anyway, I am slowly changing this. I am making this my main site on the web and started my cleaning and organizing a little bit this blog and how to coordinate the stuff from the other places where I have a presence, like the tumblr or the books blog. There’s still a lot to do but it will fall into place one of these days. I guess a big-bang style update would be the best but I don’t think that’s going to happen. The big news is that I have my own domain now! Finally http://jorgetavares.com became free and so it’s pointing now here! As well as some of the others more famous domain terminations. One more reason to give more life to this place!

Written by Jorge Tavares

June 22, 2009 at 0:38

Posted in Blog

Tagged with

Game over for Human Evolution?

leave a comment »

I’ve just read a news on Times online that reports a talk given by Steve Jones at the University College of London (UCL). Quoting the news:

Speaking today at a UCL lecture entitled “Human evolution is over” Professor Jones will argue that there were three components to evolution – natural selection, mutation and random change. “Quite unexpectedly, we have dropped the human mutation rate because of a change in reproductive patterns,” Professor Jones told The Times.

The rest of the news gives some more details, however, I don’t completely agree with the reasoning exposed. I would be more favourable to a slowing of the process, a change of direction or something like that. Evolution is a dynamic process and stating it is on halt or simply over is very reductive. Different variables come into play in different times and stages. Anyway, it’s an interesting news and topic for thought.

Written by Jorge Tavares

October 9, 2008 at 10:16

Posted in Science

Tagged with ,

Back from PPSN X

with 3 comments

Just got back from PPSN 2008 and I enjoyed it very much. It was my first time at this conference and I believe the poster only format, absence of parallel sessions and a summary presentation of the papers by the session chair works very well. You are given a quick overview of the papers, having a bit more of information on what you want to see and afterwards, the interactions between attendants and presenters is much stronger. Of course some posters are flooded with people whilst others not but that’s not important. I belive the conversations between researchers are always interesting.

Although there is a bias towards theory papers and Evolutionary Strategies, you have a good diversity of approaches and type of papers, from applications to new techniques for example. For me it is hard to pinpoint a best work from all that I’ve seen but I was impressed with Peter Merz’s new approach to very large TSP problems. 

Finally, I guess the organization and the venue of the conference were very good. No complaints here. I wish I can attend in 2010 too.

Written by Jorge Tavares

September 19, 2008 at 15:55

Bio-inspired Algorithms for the Vehicle Routing Problem

leave a comment »

And it seems our book is almost out! Springer already put it on its website and it’s announced in Amazon.com and UK. This book is part of the Studies in Computational Intelligence series and has some of the most recent advances of evolutionary approaches to VRP variants. I can hardly wait to see it on paper! :-)

Written by Jorge Tavares

August 14, 2008 at 19:28

CiteULike

leave a comment »

I’ve decided to take a more serious look at CiteULike as a way to keep a list of my publications and, most important, to start a reference library. Why? Well, first it’s easy to keep your publications and have some additional services attached: tags, comments, easy export, links, etc. Second, having a livrary of articles that we like, want to read, etc, is useful in the sense that you can keep a record of your research literature. Moreover, if you start a group within your research group and/or collaborators, it might be a useful tool. And then, since it’s a social network site you could find interesting references that you probably wouldn’t see them (or maybe not). For now, I just added some of my own publications but I intend to explore and use more this site.

Written by Jorge Tavares

July 16, 2008 at 17:20