Almost one year after my last post, I am going to write something more about "exotic" languages. During the last year I was able to investigate the characteristics of a well-known but underused language: Ada.
I was interested in it because I found two references to it in the realm of astrophysics: the first is a spectral line synthesis code for magnetic stellar atmospheres, the second is a list of satellites that use software (partly) written in Ada. Also, I had always heard that Ada's type system is exceptionally safe, and I was truly interested in giving it a try.
So here is a list of features of Ada that caught my attention:
- It is a statically typed language (like C/C++/Fortran/Haskell, unlike Python/Ruby/Scheme). However, unlike C and (to a lesser degree) C++, its type system is strong. This means that the compiler enforces type correctness and will not silently convert e.g. floats to integers.
- Ada code looks verbose, but very readable. (As I understand, this was one of the driving requirements in the development of the language.)
- It has a number of interesting features over C and C++, like keyword arguments (called named parameters), nested functions, and some primitive type inference (e.g. you do not have to declare the type of a variable used in a `for` loop).
- It allows code to be split into packages (more on this later).
- Ada's versatile type system allows the programmer to make the compiler do dimensionality checks. So, e.g., it will signal an inconsistency for code like `if obj_speed < field_size`, if `obj_speed` and `field_size` have been properly declared. Apparently, this feature was considerably extended in the latest version of the language (Ada 2012).
- It has native support for multitasking. (As far as I know, Ada, Erlang and Go are the only non-academic languages that were designed from the ground up with this capability.)
- Ada is compiled to machine code: the reference open-source implementation is GNAT, which is part of GCC (the GNU Compiler Collection).
- Because GNAT is a tightly integrated component of GCC, it is extremely easy to develop bindings to C/C++ libraries (there is even a tool to do this automatically).
- Interestingly enough, GNAT is developed by AdaCore, a commercial company which seems to have quite a large user base. It develops both the open-source compiler and a commercial version.
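To make a few of these points concrete, here is a minimal sketch of my own (the names `Feature_Demo` and `Scale` are invented for the example) showing a nested function, named parameters, a `for` loop whose variable is never declared, and the explicit conversion that the strong type system demands:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Feature_Demo is

   --  Nested function: it is visible only inside Feature_Demo.
   function Scale (Value : Integer; Factor : Integer := 2) return Integer is
   begin
      return Value * Factor;
   end Scale;

   Total : Integer := 0;
   Ratio : constant Float := 2.7;

begin
   --  The loop variable I is not declared anywhere: its type is taken
   --  from the range 1 .. 3.
   for I in 1 .. 3 loop
      --  Named parameters, given in a different order than in the declaration.
      Total := Total + Scale (Factor => 10, Value => I);
   end loop;

   --  "Total := Total + Ratio;" would be rejected at compile time:
   --  the Float-to-Integer conversion must be spelled out.
   Total := Total + Integer (Ratio);

   Put_Line ("Total =" & Integer'Image (Total));
end Feature_Demo;
```

With GNAT installed, something like `gnatmake feature_demo.adb` should be enough to build it, provided the file is named after the procedure.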
Is Ada outdated?
Before digging into Ada, I had the idea that it was an outdated language with virtually no users today. But I was wrong on both fronts:
- Ada was born more or less in the same years as C++: work on Ada began in 1976, while Stroustrup's "C with classes" toy language dates back to 1979.
- There are a lot of Ada users. Only, Ada programmers do not seem to work in the contexts I'm used to.
Is Ada verbose?
I do not like verbosity in general. I have always been amazed by the conciseness of languages like Haskell: in my opinion, the shorter a program is, the quicker you're able to find problems in it. And, undoubtedly, Ada is quite verbose. A lot of grammar constructs are not strictly necessary for the compiler to understand what the code should do: e.g. the `is` at the end of a procedure/function declaration.
However, I must admit that more than once I have found source code so condensed that it was difficult to understand how it worked, or why it was not working. Readability is probably as important as conciseness. Ada's designers put a lot of effort into making the language easy to read, even for people who have never learned Ada.
Types, types and types
Ada's strongest advantage is probably its versatile type system. You can define "subtypes" which optionally limit the range of values of a primitive type (e.g. an integer which can hold values between 18 and 28): Ada will automatically add bounds-checking code. (If you do not limit the range, your subtype works like a `typedef` in C.)
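As a rough illustration (the names `Room_Temperature` and `Counter` are made up for this sketch), a range-limited subtype and an unconstrained one could look like this:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Subtype_Demo is
   --  Hypothetical subtype: an Integer restricted to 18 .. 28.
   subtype Room_Temperature is Integer range 18 .. 28;

   --  Without a range, the subtype is just another name for Integer.
   subtype Counter is Integer;

   T : Room_Temperature := 20;
   N : constant Counter := 5;
begin
   --  Subtypes of the same type mix freely; 25 is within 18 .. 28.
   T := T + N;
   Put_Line ("T =" & Integer'Image (T));

   --  30 is out of range: the check added by the compiler
   --  raises Constraint_Error at run time.
   T := T + N;
   Put_Line ("never reached");
exception
   when Constraint_Error =>
      Put_Line ("Constraint_Error: value outside 18 .. 28");
end Subtype_Demo;
```

The compiler may already warn about the second assignment when it can prove that the value is out of range, but the language only guarantees the run-time check.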
However, the most interesting feature is the ability to define new types from primitives. In this case you are not allowed to mix the new type with its primitive, unless you explicitly tell the compiler to allow it. So, you can e.g. create three new types `distance`, `time`, and `speed` from the primitive type `float`: you will not be allowed to combine (e.g. add/multiply) variables of different types, unless you manually override the compiler's checks or redefine operators (as in C++). This makes it possible to check the consistency of measurement units in the code. (Such a feature is possible in C++, at the expense however of writing a lot of boilerplate code: basically, you have to define a class which wraps a `float` and manually implement every operation you need on it; Ada, on the contrary, already knows how to sum two `distance` variables, as they behave the same as `float`s.)
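Here is a small sketch of the idea, written by me under the assumption that distances are in metres and times in seconds; the only mixed-type operation allowed is the division I define explicitly:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Units_Demo is
   type Distance is new Float;   --  metres (assumed unit)
   type Time     is new Float;   --  seconds (assumed unit)
   type Speed    is new Float;   --  metres per second

   --  The only operation between different types that we choose to allow.
   function "/" (D : Distance; T : Time) return Speed is
   begin
      return Speed (Float (D) / Float (T));
   end "/";

   Lap_1   : constant Distance := 400.0;
   Lap_2   : constant Distance := 350.0;
   Elapsed : constant Time     := 100.0;

   Total : constant Distance := Lap_1 + Lap_2;     --  "+" inherited from Float
   V     : constant Speed    := Total / Elapsed;   --  user-defined "/"
begin
   --  Each of these declarations would be rejected by the compiler:
   --     Wrong_Sum   : constant Distance := Lap_1 + Elapsed;
   --     Wrong_Speed : constant Speed    := Lap_1;
   Put_Line ("Speed =" & Speed'Image (V));
end Units_Demo;
```

Note that derived types inherit all the operators of `Float`, so `Lap_1 * Lap_2` still compiles and yields a `Distance`; catching that kind of mistake takes more work than this sketch shows.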
Packages
I am not happy about how C/C++ programs are split into multiple files. You usually separate the classes and functions into different `.cpp` files, and in each of them you include a `.h` file which provides the class/function declarations. However, the compiler is unaware of the difference between `.h` and `.cpp` files: the difference is relevant only to the C preprocessor.
Compare this with Ada packages. You have to write two files, as in C++: one with extension `.ads` (the specification file, analogous to `.h` files) and one with extension `.adb` (the implementation file, analogous to `.cpp` files). However, the Ada language allows you to specify which parts of the package are exported and which are meant to be private. This is similar to the concept of private/public methods in C++ classes, but it works at file level. It is very similar to units in Turbo Pascal, and it is much more effective than the C/C++ approach. (Also, it allows the GNAT compiler to recompile outdated dependencies without the need for a Makefile.)
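A minimal sketch of such a package follows; the package `Counters` and its contents are invented for illustration. GNAT expects each unit in its own file, so the block below is really three files, named in the comments:

```ada
--  counters.ads: the specification (what clients can see)
package Counters is
   type Counter is private;   --  exported name, hidden representation

   procedure Increment (C : in out Counter);
   function Value (C : Counter) return Natural;
private
   type Counter is record
      Count : Natural := 0;
   end record;
end Counters;

--  counters.adb: the implementation
package body Counters is

   --  Declared only in the body, so it is invisible outside the package.
   Step : constant Natural := 1;

   procedure Increment (C : in out Counter) is
   begin
      C.Count := C.Count + Step;
   end Increment;

   function Value (C : Counter) return Natural is
   begin
      return C.Count;
   end Value;

end Counters;

--  main.adb: a client of the package
with Ada.Text_IO;
with Counters;

procedure Main is
   C : Counters.Counter;
begin
   Counters.Increment (C);
   --  "C.Count := 5;" would be rejected: the fields of Counter are private.
   Ada.Text_IO.Put_Line (Natural'Image (Counters.Value (C)));
end Main;
```

Running `gnatmake main.adb` should then compile `Counters` and `Main` in the right order, recompiling only what is out of date.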
So, how fast is it?
I was particularly intrigued by the fact that the GNAT Ada compiler is integrated into GCC. This allows Ada code to be optimized by the same machinery GCC employs for C/C++/Fortran/ObjC code, and it can in principle guarantee the same performance. I was however puzzled by the results of the Computer Language Benchmarks Game, which clearly showed that on average C++ code required less memory and in some cases much less time to run (here is the GNAT/g++ comparison, and here the GNAT/gcc comparison, which is even worse for GNAT).
So I looked into the reasons for this difference. I picked the k-nucleotide example, for which the fastest C code is available at this link. I found that, even if Ada is considerably slower than C, it still ranks third. In my opinion, this indicates that the C program has been dramatically optimized, not that Ada is inefficient per se.
As a side note, I found that C/C++ programs in the Shootout are sometimes so optimized that it is difficult to understand what they are doing: look at this implementation of mandelbrot, especially the function `calcrow`: do you know what `__builtin_ia32_cmplepd` and `__builtin_ia32_movmskpd` are supposed to do without reading the comments? Given the large number of C/C++ programmers, I bet that it's easier to find a relatively large subset of them who are good at low-level optimization and assembly coding: this might explain why C/C++ Shootout programs often perform better than Fortran and Ada ones.