2010-12-20

The Scheme of Things/2

In one of my previous posts I compared Clojure with Python by implementing two identical versions of a program that solves problem 34 from Project Euler. My results showed that Clojure was three times faster than Python when type hinting (TH) was enabled (even without TH, Clojure was still faster than Python). My idea at the time was that any (semi-)interpreted language like Clojure would reach the same speed, provided it supported type hints.

In the last month I have run some more tests, and I must say that my opinion has changed: a number of Scheme/Common LISP implementations have proven to be even faster than Clojure, and this without type hints!

I am not going to provide my Common LISP and Scheme programs in full, as they are trivial conversions of the Clojure program I presented before; a minimal sketch of the Scheme version follows the table. These are the compilers/interpreters I used to write and run my test programs:

Name                 Version   Language      Timing
Ikarus               0.4 RC1   Scheme            14
Larceny              0.97      Scheme            15
Bigloo               3.5a      Scheme            16
Steel Bank CL        1.0.44    Common LISP       17
MZScheme (DrRacket)  5.0.1     Scheme            17
STklos               1.0       Scheme            43
CMU CL               20b       Common LISP       47
Clozure CL           1.5       Common LISP       62
Chicken              4.6.3     Scheme            93
Chicken              4.4.0     Scheme           102
CLISP                2.49      Common LISP      106
CPython (fast)       2.6.6     Python           135
CPython              2.6.6     Python           145
ECL                  10.4.1    Common LISP      169
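
For reference, here is a minimal sketch of what the Scheme conversion looks like (plain R5RS; a sketch, not the exact benchmarked program):

;; Sum of all numbers equal to the sum of the factorials of their
;; digits (Project Euler, problem 34). 7 * 9! is a safe upper bound.
(define fast-fact (vector 1 1 2 6 24 120 720 5040 40320 362880))

(define (digit-factorial-sum n)
  (let loop ((n n) (acc 0))
    (if (zero? n)
        acc
        (loop (quotient n 10)
              (+ acc (vector-ref fast-fact (remainder n 10)))))))

(define (solve max-n)
  (let loop ((num 10) (total 0))
    (if (>= num max-n)
        total
        (loop (+ num 1)
              (if (= num (digit-factorial-sum num))
                  (+ total num)
                  total)))))

(display (solve 2540160))
(newline)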

The "CPython (fast)" entry is a rewrite of my original Python program so that the body of solution does not call other Python functions:

sum ([num for num in xrange (10, max_n)
      if num == sum ([FAST_FACT[digit]
                      for digit in [int(x)
                                    for x in list (str (num))]])])

A few notes about these results:

  • There is more than one order of magnitude between the slowest implementation (ECL) and the fastest (Ikarus Scheme);
  • Compilers without TH (e.g. Ikarus, Larceny, Bigloo, SBCL) run much faster than Clojure with TH. I was really surprised!
  • Avoiding function calls makes CPython faster, but not by much;
  • The fastest program runs under Ikarus, a compiler in alpha stage whose development has been put on hiatus after the author (apparently) changed jobs. What a pity! (Note however that Marco Maggi has forked Ikarus into Vicare, which seems to be actively developed.)
  • While I had to change my Scheme program in order to make it runnable under the various Scheme compilers/interpreters (nothing sensational, just one or two lines at the beginning), my Common LISP program ran unchanged under SBCL, ECL and CCL. One point for Common LISP's portability!

Here is a bar chart of the results (click to zoom):

Update: following a suggestion by Water, I have also included CLISP in my benchmarks (and corrected its name), together with CMU CL (which was mentioned in one of the pages on CLISP's site as being faster than CLISP, and indeed it is). In both cases I first compiled the program and then timed the execution of the bytecode. I have also added MZScheme, which I left out of the first version of this post. The table and the chart have been updated accordingly.

2010-11-18

The Scheme of Things/1

In one of my previous posts I anticipated that I was going to study some functional programming languages in order to better understand how to write parallel programs for multi-core and multi-CPU architectures. During my (very slow!) quest I re-discovered an old language I used to be quite fond of: Scheme, a LISP-like language which dates back to the 1970s. In this post and the following ones I shall provide some information about my relationship with this wonderful small language, as well as some insights about how it might be useful for scientific computations.

Music Typesetting and Scheme

My first experience with Scheme dates back to about ten years ago. I am a passionate music lover, and I was looking for a good program to typeset music. I found the marvellous GNU Lilypond, which was well-suited for what I wanted to do at the time. I contributed a lot of pieces to the Mutopia project (mostly scores by Mozart and other classical composers, see e.g. Mozart's String Quartets from Op. 10), and at the same time I became quite good at extending Lilypond using its scripting language. Guess which language it was? Of course Scheme! Lilypond used GNU Guile, a Scheme interpreter sponsored by the Free Software Foundation (yes, the guys behind GNU Emacs and GCC). The intention of the Guile coders was to develop an interpreter that other projects could embed to provide a "plugin" language for extending large applications. (Unfortunately, the project did not succeed in gaining enough popularity and it has developed at quite a slow pace: just have a look at these dates and version numbers.)

Partial Differential Equations and Scheme

During 2001-2002 I worked on my master's thesis. The main objective of the work was to study the solution of the heat equation given some well-defined boundary conditions, in order to better understand some properties of the reference loads of the Planck/LFI instrument. One of the tools I developed for my study was a small C program which used the Crank-Nicolson implicit method to numerically solve the one-dimensional heat equation.
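
For reference, the Crank-Nicolson scheme for the one-dimensional heat equation $u_t = \alpha\, u_{xx}$ averages the explicit and the implicit finite-difference discretizations (this is the textbook form, not a formula taken from the thesis):

\frac{u_i^{n+1} - u_i^n}{\Delta t} =
  \frac{\alpha}{2\,\Delta x^2}
  \left[ \left(u_{i+1}^{n+1} - 2u_i^{n+1} + u_{i-1}^{n+1}\right)
       + \left(u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}\right) \right]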

Since the objective of the program was to simulate how the temperature of a body changes with time, the output was a rather large set of numbers: for each time step, there were tens of temperatures (sampled on a regular grid) that the program printed. Depending on the type of analysis I was doing, I had to throw away irrelevant temperatures and keep the ones I was interested in: at the beginning I was using GNU Awk for the job, but this was not an optimal solution. At some point I realized that what I needed was a way to attach "plugins" to my program: each plugin would be a small script driving the program and instructing it to print only the information needed by the ongoing analysis.

This of course called for an embedded interpreter. At the time the one I knew best was of course Guile (see above), so I decided to implement Guile bindings in my program. Everything worked like a charm: once the C program was properly debugged and produced reasonable results, I could write Scheme snippets to perform every analysis I wanted and attach them to the program without the need to recompile it. It was like a dream! (Actually, one of the appendices of my thesis contains the source code of some of those Scheme plugins.)
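
As an illustration, such a plugin might have looked like the following sketch (register-time-step-hook! and the vector of temperatures are hypothetical names standing in for the actual C bindings):

;; Hypothetical plugin: print the temperature of grid cell 5,
;; but only every tenth time step.
(register-time-step-hook!
 (lambda (step temperatures)
   (if (zero? (remainder step 10))
       (begin
         (display (vector-ref temperatures 5))
         (newline)))))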

Thermal Modeling and Scheme

After having defended my thesis, I continued studying the thermal performance of Planck. We used a number of closed-source thermal modeling tools based on Fortran 77 for the most complex analyses. Such tools required the user to create huge text files (called model files) containing details about how the object under study was to be discretized and what its characteristics (e.g. density, specific heat…) were at each point. Such files were interwoven with Fortran 77 code which calculated and extracted the information required for the analysis (much as the Scheme plugins did for my own program).

Managing such files was a pain, so I created a couple of programs to help me with the task. The first program, Thermal Scheme, looked for a specific signature in the contents of model files: any text between #{ and #} was interpreted as Scheme code, which was run; whatever it printed replaced the snippet in the model file. Therefore, given a text file like

Hello #{(display "world")#}!

Thermal Scheme would translate it into

Hello world!

Of course, Thermal Scheme provided a number of Scheme functions to easily define and discretize complex structures.
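
The expansion loop itself can be sketched in a few lines of Scheme (this assumes Guile, whose SRFI-6/SRFI-13 helpers provide open-input-string and string-contains; it is a sketch, not the actual implementation):

;; Expand every #{ ... #} snippet found in `text`, writing the result
;; to the current output port. For simplicity this assumes that every
;; #{ has a matching #} and that each snippet contains a single Scheme
;; expression; its output (e.g. produced by `display`) replaces the
;; snippet itself.
(define (expand-template text)
  (let loop ((start 0))
    (let ((open (string-contains text "#{" start)))
      (if open
          (let ((close (string-contains text "#}" (+ open 2))))
            (display (substring text start open))
            (eval (read (open-input-string
                         (substring text (+ open 2) close)))
                  (interaction-environment))
            (loop (+ close 2)))
          (display (substring text start (string-length text)))))))

Running (expand-template "Hello #{(display \"world\")#}!") prints exactly the "Hello world!" shown above.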

Thermal Scheme was also able to act as a "data provider": when a specific command-line flag was used, the program recorded every Scheme definition and built a text file containing a 3-D representation of the object. A companion program, Thermal Viewer, was able to connect to Thermal Scheme, read this text file and create an interactive 3-D view of the object using OpenGL.

Things Go Awry!

After a few months working with Thermal Scheme, I was supposed to move to the Rutherford Appleton Laboratory (Oxford, UK) to continue working on Planck thermal analyses. I even arranged a demo of the capabilities of Thermal Scheme! But then, in Spring 2003, the Italian Government obliged me to do ten months of civilian service, so I was forced to decline that job offer. Ten months later (May 2004) my duties were over and I got a temporary position at the National Institute of Astrophysics, where my first duty was to build a medium-sized software tool for Planck/LFI using RSI IDL (a Fortran-like language for data analysis). So I started studying IDL and completely forgot about Scheme… till a few weeks ago!

In one of my next posts I shall tell you how Scheme has evolved in the meantime, and what I discovered about the performance of some of the currently developed Scheme interpreters/compilers. Stay tuned!

2010-11-08

Authenticating NIS users with PAM - Quirks

Last Friday I got quite a headache trying to understand why my friend Simona's account was not allowing her to authenticate on the computers of our institute. The rule here is to manage all authentication using a NIS server, keeping the home directories distributed among a number of servers. It seemed that Simona was not able to authenticate from any of the computers used by our team, but the sysadmin had no clue why this was happening.

The error messages written by our Debian machine (in /var/log/auth.log) clearly indicated that the computer was able to recognize Simona as one of its users, but it did not accept her password. To understand why this was happening, I had to browse the documentation (and the source code!) of PAM. A really interesting read, but it did not explain why every user but Simona (including myself) was able to authenticate.

After hours of fruitless attempts, I realized something was wrong when I used ypmatch to compare Simona's entry in the NIS passwd map with mine: her encrypted password was much longer than mine. This hinted at the nature of the problem: a mismatch in the authentication mechanism.

Remote authentication works by encrypting password attempts and comparing them with the encrypted password stored on the server. (The comparison is not done on the plain passwords directly, as they would have to travel over the network, which would make the process insecure.) Of course, for the comparison to be meaningful, the encryption algorithm must be the same for the attempt and for the password stored on the server!

Some time ago the sysadmin upgraded the operating system of the NIS server, and this apparently changed the default encryption algorithm for new users to Blowfish instead of MD5. (Note that old users like me kept their old MD5-encrypted passwords. This also explains the difference in length noted above: Blowfish hashes start with $2a$ and are 60 characters long, while MD5 hashes start with $1$ and are about 34 characters long.) But our computers (running Debian Lenny 5.0) shipped the stock pam_unix module, which is only able to use MD5. Therefore, the comparison was made between the MD5-encrypted password entered by Simona (and encrypted by our computer) and the Blowfish-encrypted password stored in the NIS server: this obviously failed! It also explained why I was still able to authenticate even after the sysadmin's upgrade.

The solution was to install the Debian package libpam-unix2, which provides a PAM module that recognizes both MD5 and Blowfish, and to replace every occurrence of pam_unix.so with pam_unix2.so in the PAM configuration files (in /etc/pam.d/common-*). Phew!
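
For reference, the change amounts to editing lines like the following in each of the /etc/pam.d/common-* files (a sketch based on Lenny's defaults; the exact options may differ):

# /etc/pam.d/common-auth, before:
auth    required    pam_unix.so nullok_secure
# after:
auth    required    pam_unix2.so nullok_secure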

2010-11-02

Clojure and FP - First impressions

This blog entry reports on my progress with Clojure and Functional Programming (see my previous post).

What I do not like

The first thing one faces when trying to use Clojure is the difficulty of installing it. The downloads page only lists a few zip files, without saying anything about how to use them. (Sure, there is a Readme file, but I would have liked to know something more before downloading anything.) In comparison, the Downloads page of Scala is in much better shape, and a number of pre-built packages are provided. Moreover, Scala is included in Macports, while Clojure is not. Once the ZIP file has been uncompressed, the Readme file is extremely laconic: it just reports the commands required to run/build Clojure. Since the REPL has no editing facilities (i.e. you cannot use Backspace/Delete to modify what you have written, and there is no history), I would have expected some reference to jLine or rlwrap, as these can really save the day.

In general, Clojure shows its young age in a number of other spots. The documentation is quite sparse, and you often have to look at the source code of a function in order to understand its behavior. Many tools I regularly use for source code (e.g. Vim, Source-highlight and others) do not have native support for Clojure, although you can usually find something on the Web - I find VimClojure one of the best Vim plugins I have ever seen; its "rainbow parentheses" mode is incredibly useful!

What I like

Being a LISP dialect, Clojure can be extremely elegant. Here is some code I wrote to solve problem 34 from Project Euler:
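
(A minimal sketch along these lines, not the exact listing, and without the type hints discussed below:)

;; Sum of all numbers equal to the sum of the factorials of their
;; digits (Project Euler, problem 34).
(def fast-fact
  "Maps each digit 0-9 to its factorial."
  (into {} (map (fn [d] [d (reduce * 1 (range 1 (inc d)))])
                (range 10))))

(defn digits
  "Returns the decimal digits of n, via Java's toString."
  [n]
  (map #(Character/digit % 10) (.toString n)))

(defn solution
  "Sum of all numbers in [10, max-n) that equal the sum of the
  factorials of their digits."
  [max-n]
  (reduce +
          (for [num (range 10 max-n)
                :when (= num (reduce + (map fast-fact (digits num))))]
            num)))

(time (println (solution (* 7 (fast-fact 9)))))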

A few things worth noting:

  • Clojure implements a large number of high-level data structures, like dictionaries. I used one for fast-fact, a dictionary whose keys are the ten digits and whose values are the corresponding factorials.
  • It is really easy to call Java methods like toString: just prepend them with a dot. No need to import modules to do this!
  • Although Clojure is a dynamically typed language (like e.g. Python), it lets you specify the type of function parameters and let bindings (see the short sketch after this list). I have verified that this can dramatically increase the speed of the code (3x or 4x).
  • Like Python, Clojure supports docstrings. They can be accessed from the REPL through the doc function.
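
For instance (a sketch using Clojure 1.2-era syntax; not taken from the program above), a class hint on a function parameter and a primitive coercion in a let binding look like this:

;; ^String avoids reflection when calling the .length method:
(defn string-length [^String s]
  (.length s))

;; (int ...) coerces the local binding to a primitive int:
(let [n (int 42)]
  (* n n))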

Python vs. Clojure

Since I am an avid Python user, I immediately rewrote the program in Python (again, a minimal sketch rather than the exact listing):
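
# A sketch of the Python rewrite (not the exact original); it mirrors
# the Clojure version above.
import math

FAST_FACT = dict((d, math.factorial(d)) for d in range(10))

def digit_factorial_sum(num):
    """Sum of the factorials of the decimal digits of num."""
    return sum(FAST_FACT[int(x)] for x in str(num))

def solution(max_n):
    """Sum of all numbers in [10, max_n) that equal the sum of the
    factorials of their digits (Project Euler, problem 34)."""
    return sum(num for num in xrange(10, max_n)
               if num == digit_factorial_sum(num))

print solution(7 * math.factorial(9))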

I tried to follow the same logic used in the Clojure program, i.e. using a dictionary for the first ten factorials and getting the list of digits of a number by first converting it to a string (an alternative would be to repeatedly apply division and modulus by 10). Then, using the Clojure time function and the IPython %time command, I recorded the times used by each program. In order to get more reliable values, I repeated each measurement until the timing did not change significantly any more.

Results are shown in the plot on the right, and they look quite impressive: type hints allow Clojure to be more than three times faster than Python, and even without type hints Clojure is approximately 20% faster than Python.

Preliminary conclusions

Despite its young age and a few rough edges, my overall impression is extremely positive: Clojure is really nice to use, and the features it provides are extremely interesting. Its speed is also remarkable.

2010-10-19

Functional programming

Recently I had to work with an old numerical program written in Fortran 77. The program simulates the microwave radiation pattern of an emitter (like a horn or a waveguide). It requires several minutes to compute a radiation pattern for all but the simplest configurations, so I started wondering how the computation might be accelerated (even though, the software being closed-source, the only way to go would be to rewrite it from scratch).

One way to get a significant speedup is to take advantage of multi-core architectures, as this seems so far the only way to ensure that performance will keep scaling in the future. As is well known, parallelizing an algorithm is generally not a trivial task. Rather, it often involves rethinking the problem from the ground up.

As parallelizing C/C++/Fortran code requires using either OpenMP (easy, but not suitable for complex tasks) or MPI (many features, but difficult to use), I started investigating some "new" approaches to parallelization. Then I found… functional programming!

Functional programming (as opposed to imperative programming) is a way of writing programs that concentrates on functions working on immutable values. Immutability helps parallelization, as one does not need locks, mutexes or semaphores: this explains why, even though functional programming is quite an old concept, it has gained renewed interest in the last few years. Microsoft is pushing its F# language (based on the very nice family of Caml languages), and other languages have been designed with multicore CPUs in mind. E.g. this post compares Clojure (see below) with Python with an explicit note on concurrency/parallelism (something Python is not very good at, see also the follow-up).

In the last few weeks I started gathering some information about the most widely used functional languages. I started with OCaml, as it seems one of the oldest and most widely used. Although the INRIA compiler produces really fast code, some fundamental problems with the garbage collector prevent OCaml programs from taking advantage of multicore CPUs. Although there are attempts to solve this, OCaml does not seem ready for multicore yet. So I looked for another language to study. Haskell enjoys a very large community, but I do not really like it (even though it seems to support concurrency pretty well…), and I avoided Erlang too (no specific reason, maybe it is its syntax that scares me).

Among the most interesting functional languages of the new generation, Scala and Clojure have caught my attention. Both run on the Java Virtual Machine (which, according to your taste, might or might not be an advantage) and have a similar objective (exploiting functional programming for easy parallelization, see e.g. this page about Clojure), although they try to accomplish it using two different approaches: Scala is clearly derived from Java and ML, while Clojure is a simpler (and nicer) LISP. Both are moving targets (each seems to change quite fast from release to release), but they already have remarkable communities.

I am studying Clojure with great interest, as it reminds me of my old days with Scheme. Although it seems that Scala often beats Clojure in performance (see The Computer Language Benchmarks Game, but take their results with a grain of salt), I find the syntax of the latter easier to understand. More details to come with the next posts!

2009-06-02

C code for determining the operating system

In the last few days I have been busy trying to make our analysis program (written in a strange mixture of C, C++ and ITT IDL) work under Windows, Linux, Mac OS X and FreeBSD. I discovered that a severe bottleneck was the data transmission through a local socket. Under Linux and FreeBSD, sending ~500 packets of a few bytes each took ages to complete, while under Mac OS X the same task completed in a fraction of a second. This was really strange, considering that we were using local sockets, i.e. communications between two tasks on the same machine.
After a little digging with Google, I discovered that the standard buffering algorithm used by Linux and FreeBSD to send data through a socket (apparently Nagle's algorithm) can sometimes induce severe slowdowns. Luckily, there is a simple solution: call setsockopt and set TCP_NODELAY to 1. Doing this in our code gave a performance boost of several orders of magnitude.
Since I did not want to change things on those operating systems that had never given trouble (Mac OS X), I decided to put the call to setsockopt within an #ifdef…#endif block (see the sketch after the list below). But which macros must be checked to determine whether we are compiling under Windows, Linux, FreeBSD or Mac OS X? It turned out that this information is not easy to find on the internet.
By Googling and using the clever trick suggested in this Wikipedia page (i.e. running gcc -dM -E - < /dev/null to discover which macros are automatically defined by the preprocessor), I have been able to find the following macros:
  • FreeBSD defines __FreeBSD__ (at least FreeBSD 7.x);
  • Under Linux you must check either __linux or __linux__;
  • Windows defines _WIN32 (this was easy, as it is clearly documented in MSDN).
I have had no need so far to find this information for Mac OS X; the macro to check there appears to be __APPLE__ (see also the Predef tables mentioned in the update below).
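
Putting the pieces together, here is a sketch of how the setsockopt call can be guarded by these macros (a sketch, not our actual code):

/* Disable Nagle's algorithm on the connected socket `fd`, but only
 * on the platforms where the slowdown was observed. */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

static void disable_nagle(int fd)
{
#if defined(__linux__) || defined(__FreeBSD__)
    int flag = 1;
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof flag);
#else
    (void) fd;  /* Mac OS X et al. never showed the problem */
#endif
}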

Update: (2011/05/17) Predef has nice tables containing the preprocessor macros defined by many compilers and operating systems. Very useful!

2009-05-11

Remote "tar" backups via SSH

My first "real" entry to this blog is an easy method I have found this morning to perform a remote "tar" backup using SSH. It should be general enough to be applicable under any Unix system (I have tested it on Mac OS X 10.5 and Debian Lenny).
The problem I had was the following:
  1. I had a lot of data (~33 GB) on computer "foo" which I wanted to back up before formatting the hard disk and installing FreeBSD 7.2;
  2. There was however not enough room on computer "foo" to create the .tar file (just 13 GB free);
  3. Thanks to a colleague, I had access to another computer, "bar", with lots of free space (~750 GB).
I solved this problem with the following command:
tar cvzf - ./ | ssh bar "cat > backup.tar.gz"
The "tar" process will write the .tar.gz file containing data from the "./" directory into the pipe. The "ssh" sends whatever is written into the pipe to the "bar" computer, where a "cat" command will save the data into "backup.tar.gz". This results in the .tar.gz file being written directly on the remote computer.