In the past months I had the chance to have a deeper look at Haskell and Erlang (in a previous post about functional languages I dismissed them quickly). I have realized that Erlang is the best language when you need to run thousands of lightweight processes (e.g. serving web requests) but has little to offer to scientists. On the other side, Haskell has a lot of potential, as I found by porting my solution of Project Euler’s problem 34.
Let’s first recall my original purpose. I wanted to test how faster Scheme/LISP/Clojure were than Python (at the time my language of choice for quick development). I implemented a number of programs which solved problem 34 (finding the sum of all the numbers which are equal to the sum of the factorials of their digits, like 145 = 1! + 4! + 5!) and measured their performance. I found with great amazement that in some cases LISP-like languages were one order of magnitude faster than Python (14 seconds instead of 140).
This summer I decided to do the same test with OCaml and the Glasgow Haskell Compiler (GHC). First, here is my OCaml program:
The fast_fact
variable is an array of 10 elements containing the
factorials of all the digits. (Note that this implementation differs
slightly from my LISP solutions, where I used a dictionary.) The
digits
function returns a list of the digits in a number. Unlike my
LISP programs (where I converted the number into a string and then
converted its characters back to numbers), here I chose to extract the
digits by using a division by 10 in a loop: this should be faster, as
I do no longer need to allocate space for a string. The test_number
function determines if num
satisfies the requirement of problem 34.
The --
function returns a list containing the numbers from i
to
j
.
OCaml programs can be run either using the interpreter or compiled
using the optimizing compiler. I chose the latter and timed the
running time of the executable using the Unix time
program. On my
computer the program takes roughly 3 seconds to execute, which makes
it roughly 4 times faster than the fastest LISP solution I wrote (a
Scheme program compiled with the now-discontinued
Ikarus Scheme).
By navigating in the Project Euler’s website I found that one of the other users (“ix”) posted a Haskell solution that is incredibly compact:
I was struck in amazement by the compactness of the code. But the
greatest surprise came when I used GHC to compile it and measured the
running time: less than 0.2 seconds! These two lines of code are 4
orders of magnitude faster than my 24 lines of Python code. Four.
Orders. Of. Magnitude. That is unbelievable. And note that the code
does not pre-calculate the factorials for each digit (as I did in the
OCaml program, see the definition of fast_fact
).
The fact that the Haskell solution is one order of magnitude faster
than OCaml is probably due to the use of --
in problem34.ml
: this
means that OCaml needs to build a list containing all the numbers from
10 to 10,000,000 before applying test_number
. It is possible to
overcome this limitation by building an explicit cycle, but I wanted
something as much compact as possible. (The code would have been more
compact had OCaml provided something like the Haskell syntax ..
instead of making me define the --
operator.) On the contrary
Haskell, thanks to its
lazyness,
never builds the list [10..10000000]
in memory. (If you know Python,
it is like if OCaml were using range
while Haskell used xrange
.)
Considering that debugging one line of (Haskell) code is considerably easier than debugging 24 lines of (Python) code, I think I shall use Haskell more and more in my next mathematical/scientific projects.
P.S.: This time I used markdown instead of Org-mode to write this post. It produces less bloated HTML and it seems to be enough for posts of this kind. I also used gist to include syntax-highlighted snippets of code, which is much better than the code produced by Org-mode.
No comments:
Post a Comment