Posts Tagged ‘Lisp’
Sorting algorithms used in the CL implementations
Which sorting algorithm should one implement when developing a program? The best answer is probably none. Use the sort provided by your system/library/etc. Unless you know your input data has some special properties that you can take advantage of, the provided sort should be enough for your needs and probably is more efficiently implemented.
However, I think it is important to know what sorting algorithm is implemented. If one knows the properties of the data, it is possible to understand if the provided sort can or will pose a problem. In the same way a programmer shouldn’t implement a sorting algorithm every time it needs to sort something, the programmer should also be aware of the limitations/advantages of the system sort. That way one can decide if a special sort is needed or not.
Common Lisp provides the functions sort and stable-sort. The HyperSpec describes their operation well but it does not define the sorting algorithm. That decision is left free to the implementations. In addition, both functions don’t necessarily share the same algorithm. The difference between the two is that the second function sorts in a way that guarantees stability, i.e., two elements that are equal remain in the same position after sorting is completed. The use of sort and stable-sort requires some care (see the section sort pitfalls) but lets focus on the algorithms and not on its usage.
What sorting algorithms do the major open source CL implementations actually implement? I was curious about it and went to check the source for ABCL, CCL, CLISP, CMUCL, ECL and SBCL. Not surprising, we find some differences between the implementations. What it was more unexpected to discover is that some implementations also use different sorting algorithms according to the sequence type. A quick survey of the findings is summarized in the following table (if anythings is incorrect, please tell me). The links for the source code are in the implementation name (careful, in CCL and SBCL there are two links).
| Implementation | sort | stable-sort |
|---|---|---|
| ABCL | merge sort (lists) / quicksort | merge sort |
| CCL | merge sort (lists) / quicksort | merge sort |
| CLISP | tree sort | tree sort |
| CMUCL | heapsort | merge sort |
| ECL | merge sort (lists) / quicksort | quicksort (strings + bit vectors) / merge sort |
| SBCL | merge sort (lists) / heapsort | merge sort |
In terms of the implementation of sort, quicksort is the most used algorithm, followed by heapsort. The choice for these algorithms is expected. Both have an average-case performance of O(nlgn) and heapsort guarantees a worst-case performace of O(nlgn) too. Quicksort has a worst-case performance of O(n2) but it can be optimized in several ways so that it also gives an expected worst-case performance of O(nlgn). However, it seems that the quicksort implementations are not completely optimized. In ECL (and ABCL) quicksort implements a partition scheme which deals better with duplicate elements (although is not the three-way partitioning) but it always picks as pivot the first element. CCL chooses the pivot with a median-of-3 method and always sorts the smaller partition to ensure a worst-case stack depth of O(lgn).
As for CLISP, I think it uses a tree sort but I am not entirely sure. The only source file I could find with a sort implementation was sort.d and it looks like it contains an implementation of tree sort with a self-balanced binary tree, which also gives this algorithm an average and worst-case performance of O(nlgn).
As expected, most of the implementations use merge sort to implement stable-sort since it is a stable sort with average and worst-case performance of O(nlgn). Apparently, all implementations are bottom-up merge sorts with the exception of CCL and ECL. Another interesting thing is that merge sort is also used for lists in sort, in most of the implementations. However, I found it surprising to find quicksort in the stable-sort column because it is not a stable algorithm. Since it is only used for strings and bit vectors, it is not really an issue. While reading the source code of the implementations, I realized that ABCL was using quicksort in stable-sort for all non-list sequences. This is a problem that exists in the current 1.0.1 release but I’ve sent a bug report with a quick fix to the maintainers. The next release should have stable-sort fixed.
This exploration of the sorting algorithms used in the open source implementations was very educational and interesting to me. I’ve learned what algorithms are actually used and enjoyed seing how they were implemented. Just spotting the issue in ABCL stable-sort made this review worthwhile. I think there is still room for improvement in some implementations but knowing now the strengths and weaknesses of the sorts in CL is already good enough. On a final note, I just wonder what are the algorithms used in ACL and LW.
ECLM 2011 Notes
This last weekend I was in Amsterdam to attend the European Common Lisp Meeting. This was my third participation in a organized Lisp meeting (after the first ZSLUG in Zurich and the ELS 2011 in Hamburg) and I am happy I’ve decided to go. I was only present at the meeting itself since going to the dinners and city tour would have been way out of my budget. Anyway, the ECLM was a nice venue. I enjoyed most of the talks and still had an opportunity to talk with fellow lispers. I enjoyed talking with Luís Oliveira and meeting Zach Beane.
The first talk was given by Nick Levine and it can be viewed in two parts. In the first one, he talked about his experiences of trying to write a CL book for O’Reilly. It was quite interesting to see how hard can it be to prepare a book, especially for a publisher who was (is?) not very lisp-friendly. The second part was mostly about the community, although presented with a rant on libraries. This is a topic that has been debated several times. Thanks to Quicklisp, the problem now is not installing libraries but finding them and knowing which ones are good. I am not sure if creating another site as suggested would be a good thing since resources are already scarce. Perhaps more thought must be made in how to improve the current ones. CLiki still seems to me the best starting point. Still, Nick Levine talk was good and entertaining. One of the best in the meeting.
The following talks were mostly about companies that use CL as their main programming language. Jack Harper talked about the company he recently started, Secure Outcomes, that produces a unique portable fingerprint scanner. I must say his talk was quite inspiring! He talked about how to get a startup running and the decisions that took him to choose Lisp as the main development language. In addition, he also explained why prefers Lispworks to any other implementation.
Next, it was the talk given by Dave Cooper. I must confess his talk was the weaker of the day mainly because he talked about two different subjects without any connection. He started talking about GDL, the main product from his company, Genworks. I’m sure GDL can be a great thing but I didn’t get much from his talk. About halftime, the talk suddenly changed to the Common Lisp Foundation. This was the interesting part of the talk since he explained the aims of CLF, the people behind it, etc. However, it was not clear how it will distinguish itself from ALU in terms of operation (in terms of purpose, CLF just focus on Common Lisp while ALU in all Lisp dialects) and this was the main concern that was expressed during the questions time. After presenting CLF, and since there was still some time left for the next presenter, he went back to GDL.
Afterwards, it was the turn of Luke Gorrie to present his lisp-hacker startup Teclo Networks. His talk was an expanded/updated version of the one given in Zurich. Still, it was also quite interesting. He started by telling how a group of hackers with a Lisp and/or Erlang background got together to improve the mobile TCP/IP communications. Then, he showed us how TCP badly misbehaves in a mobile network and how their product, Sambal, can give 10% to 27% improvements. Another interesting point of the talk was that CL is used as their main development language. In short, it is used to develop and study all their algorithms. They have a TCP/IP stack fully implemented in CL! Moreover, all their analysis and maintenance tools are also all in CL. However, in the actual product boxes they have reimplemented the algorithms in C. The reason: extreme pragmatism. Luke concluded by hinting that the sales of their product is going very well!
In the afternoon the talks started with Paul Miller from Xanalys. The talk was dedicated to Link Explorer, a windows desktop tool to analyze data. The application is quite impressive and was developed using just CL. Paul also gave us a demonstration of the tool as well as some notes on future development.
The best and most awaited talk, Quicklisp, technically and socially, was given by Zach Beane. The talk focused on several aspects of Quicklisp. Zach started by giving an overview of the famous library problem of CL, the solutions that existed before QL, explaining their advantages and disadvantages. Also, and very important, what people were actually using and what difficulties they were facing. In a survey he did, most CL programmers were installing libraries by hand, including Zach! Then he proceed to how Quicklisp was developed, some technical issues, what is the role of Quicklisp and what is the reception after one year. The talk focused then on the social impact of Quicklisp in the community. One of the things that makes Zach happy it’s the number of emails he gets saying that people are back to using CL and contributing more to the community (i.e., making libraries available) because of QL. Finally, some indications of what is to come. My perception is that the possibility to enable hacking as it was possible with clbuild is one of the most exciting future features for Quicklisp. Zach’s talk was excellent from all points of view!
The last talk of the day was by Hans Hübner. This was my second favorite presentation. Although the topic, code style and conventions, can start some heated discussions, I must say that I agree with almost everything Hans Hübner mentioned. However, like everything, some common sense is always necessary. One of the main points was that lispers should not use constructions which are not part of the standard language when the standard provide options, just because you want to save some typing. It is more important for another programmer to understand faster what is written than forcing him to look for the definition of the unusual constructs. The if*, bind were examples given. Hans also talked abut the 80-column rule, style guides, etc. In the end, it always depends on the project, the people, etc, but code style is important and should not be ignored.
The meeting ended with several lightning talks. The most interesting bits were: Marco Antoniotti announced ELS 2012, to be held in Zadar, Croatia, around April-May; Christophe Rhodes talked again about swankr, a swank and slime for R; the announcement of ABCL 1.0.0 by Erik Huelsmann.
Some words on the organization. Organizing a meeting of this kind is not easy and Edi Weitz and Arthur Lemmens must be congratulated for making a great event. Not all was perfect but everything went smoothly. I wish that it continues to happen in the coming years!
NXT 2.0
Yesterday I finally dedicated some time to assembly my newest toy: a Lego Mindstorm NXT 2.0! I just made the basic drive rover with some sounds. I need now to explore more the capabilities of the NXT. I need to see what other options exist in terms of programming the brick.
As before, the internal language it’s a bit limited. I played with the original Mindstorm a few years ago. Already back then, I used for a few experiments a set of Lisp macros to generate some commands for the rover. There is XS in Lisp but I’m not sure if that is what I want. Something more like lejOS would be nicer (but in Common Lisp). Anyway, regardless of that, a mindstorm is always a cool toy :-)
Lisp meeting in Munich
Yesterday took place the Munich Lisp User Group meeting. The group is a bit irregular in making them, and taken this into account, I think it was successful (there was one planned in November but it didn’t happen; in the meantime, I went to Zürich to attend one ;-)).
Anyway, yesterday there were 2 short talks and a small discussion at the end. Both talks were a bit improvised since the goal was really to have a meeting :-) I started to talk about my hopefully-soon-to-be-released library of pseudo-random numbers generators for Common Lisp, urandom. It was just a small overview of the motivations for doing it, the interface and showing some examples. Afterwards came a very interesting talk, about Lisp in robotics. It was given by Lorenz Mösenlechner, a PhD student at TUM. He showed us some examples of the CRAM, implemented on top of CL, which allows to write flexible and robust robot control programs. He also showed us some videos. Rosie, the pancake-maker was cool :-)
In the end we talked a little about Lisp and exchange some opinions on CL, Scheme, etc. Moreover, everyone was in favor of trying to do more regular meetings. So, the next one should happen in the first week of April (pick your days). Let’s see! :-)
Quicklisp or: How I learned to stop worrying and love installing Common Lisp libraries
Common Lisp is one of those languages that you either love or hate (for any definition). Unfortunately, the later group is rather quite large and the reasons for that dislike go from the simple parenthesis (I still wonder what these little fellas did to annoy so many people) to many other things. One of them is usually known as the library problem.
The library problem can be seen from many different angles: lack of libraries, lack of decent libraries (and really good libraries), easy of install, easy of use, the central repository, the batteries included, etc, etc, etc. I guess if you want, you can always find an argument to complaint about libraries. It really doesn’t take too much effort. And for the past few years, this library problem has been one of the main arguments against the use of Common Lisp by non-lisp programmers. Even by those ones who don’t use the parenthesis argument to dislike Lisp.
There is some truth in this library problem of course. The state is not pretty when compared to other languages. However, it might be not that bad as sometimes is painted but it’s there. The elephant is in the room! However, if you never experienced a library problem in any other language I guess you belong, give it or take it, to two main groups: you use a very specific set of libraries that happen to be very good in that language/environment/etc, or you really don’t use libraries a lot. In the end, I believe it always depends on what you actually need/use or want (e.g., the batteries included). Sometimes you have excellent libraries in language foo but not the equivalent in language bar (e.g., is there a numpy+matplotlib equivalent in Ruby?). And just because everything was done twice in Java, it doesn’t mean they are that great (“Been there, suffered that”). And finally, do you really need to invent a new language just because of this issue? Well, maybe yes, maybe not…
Although I’ve been programming in Common Lisp for quite some time (for the record, I am not a Jedi Lisp Knight), I never felt the need, or was forced, to use so many libraries like others do. Mainly because of the stuff where I use Lisp. But from time to time, you really need a library because you want that one functionality or simply because you wish to try/play with something. When the library you need is self-contained or with a very small dependency number, the library problem is essentially a non-problem. Ok, sort of. At least the installing/using side of the coin. The quality side is a complete different story. But when you require something more complex, things can get tricky. Really tricky. And you might feel tempted to join the dark side and start believing Common Lisp has a real library problem… which is not what you want.
What you want is a simple and easy way to look for libraries, install, update when necessary and, more important, just use them! So far, systems like, for example, clbuild, start nice but then something happens and your environment becomes messy. It can be just my own fault but it happens. And the manual way is a nightmare. In short, most of the time you worry about what to install, how to install, how to keep the system tight and clean, etc, etc. At some point, you start choosing your libraries by the number and type of dependencies they have. Which I am not fully sure if this is a good thing. And what happens when you really want to try a really cool library but it has a complicated library dependency structure? If you have time to spare, you will likely get lots of worries (and this is assuming you will be successful at the end). Otherwise, you will be frustrated for not trying it.
So, how do you solve this library problem? With Quicklisp. Unless you were recently living on the other side of the Lisp universe, Quicklisp is a system designed and implemented by Zach Beane which “aims to make it easy to get started with a rich set of community-developed Common Lisp libraries”. At first sight, it might look as just yet another attempt to solve the Common Lisp library problem, where so many failed and few had some mild success. Although it is in an early development stage and is not widely available/distributed, Zach is distributing it to anyone who wishes to try it. And I am glad I decided to give it a go.
Quicklisp is really what you always wished it existed for installing and managing Common Lisp libraries. You can see some of the details in the screencast but the quick version of Quicklisp’s advantages are: 1) easy to setup (you just need to load a single file, regardless of your lisp implementation); 2) to install a library is just typing one command and, 3) all the dependencies are taken care of. All the other operations are also very simple to execute. No more worries about how to install a library and dependencies. After a while, you even start installing things just for the fun of it! Yes, It’s really that simple.
I feel there will be a time for Common Lisp programmers: before and after Quicklisp. To be honest, I really hope Quicklisp gathers the momentum and adoption in the community. Even now I’ ve just read in the SBCL developers mailing list the proposal to include Quicklisp in SBCL when it’s properly released. Yes, the “half-baked, incomplete, unmaintained and tasteless” libraries problem will not be solved but at least will be “easy to install them” (in Zach’s own words). There is still a lot to be done but I am confident. And finally, we can all stop worrying in how to install a Common Lisp library (or some of them) and use our time to actually do some coding. And may the Force, I mean, Quicklisp be with you! :-)
