(My language is "better" than your language!?)
The "Phonecode" benchmark for comparing programming languages
In the context of a controlled experiment on a different issue, I have recently
obtained several dozen different implementations of the same program written in
Java, C, or C++.
A comparison of these programs found quite interesting results (see "Comparing
Java vs. C/C++ efficiency differences to inter-personal differences",
Communications of the ACM 42(10):109-112, October 1999): Although the
differences in memory consumption and runtime between Java and C/C++ were quite
large, the differences between the individual implementations within each
language were even larger.
I believe it would be tremendously interesting to see corresponding results
for many more languages, in particular scripting languages, because all
benchmarks I have seen so far rely on but a single implementation (per language)
of each program.
Hence, the purpose of this website is to collect many implementations of this
same program in scripting languages, in order to compare these languages with each
other and with the ones mentioned above. The languages in question are
Perl, Python, Tcl, and Rexx.
The properties of interest for the comparison are
- programming effort
- program length
- program readability/modularization/maintainability
- elegance of the solution
- memory consumption
- run time consumption
- correctness/robustness
Interested?
If you are interested in participating in this study, please
create your own implementation of the Phonecode program (as described
below) and send it to me by email.
I will collect programs until December 18, 1999. After that date, I
will evaluate all programs and send you the results. The effort involved
in implementing phonecode depends on how many mistakes you make along the way. In
the previous experiment, very good programmers typically finished in about 3 to
4 hours, and average ones typically took about 6 to 12 hours. If anything went badly
wrong, it took much longer, of course; the original experiment saw times over 20
hours for about 10 percent of the participants. On the other hand, the problem
should be much easier to do in a scripting language compared to Java/C/C++, so
you can expect much less effort than indicated above.
Still interested?
Great! The procedure is as follows:
- Read the task description for the "phonecode" benchmark. This describes what
the program should do.
- Download
  - the small test dictionary test.w,
  - the small test input file test.t,
  - the corresponding correct results test.out,
  - the real dictionary woerter2,
  - a 1000-input file z1000.t,
  - the corresponding correct results z1000.out,
  - or all of the above together in a single zip file.
- Fetch this program header, fill it in, convert it to the appropriate comment
syntax for your language, and use it as the basis of your program file.
- Implement the program, using only a single file. (Make sure you measure
the time you take separately for design, coding, and testing/debugging.) Once it
is running, test it using test.w, test.t, and test.out only, until it works
for this data. Then and only then start testing it using woerter2,
z1000.t, and z1000.out. This restriction is necessary because a similar
ordering was imposed on the subjects of the original experiment as well;
in any case, using the large data earlier would not help.
- A note on testing:
  - Make sure your program works correctly. When fed with woerter2
    and z1000.t, it must produce the contents of z1000.out
    (except for the ordering of the outputs). To compare your actual output to
    z1000.out, sort both and compare line by line (using diff,
    for example).
  - If you find any differences but are convinced that your program is
    correct and z1000.out is wrong with respect to the task
    description, then re-read the task description very carefully. Many people
    misunderstand one particular point. (I absolutely guarantee that
    z1000.out is appropriate for the given requirements.) If (and
    only if!) you still don't find your problem after re-reading the requirements
    very carefully, then read this hint.
- Submit your program by email to prechelt@ira.uka.de, using
Subject: phonecode submission and preferably inserting your
program as plain text (but watch out so that your email software does not
insert additional line breaks!).
- Thank you!
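The order-independent comparison described in the note on testing (sort both outputs, then compare line by line) can be sketched as a small helper. This is only a checking aid, not a phonecode solution, and it rests on several assumptions: it is written for present-day Python 3 rather than the Python 1.5.2 named in the constraints, the function names compare_outputs and compare_files are hypothetical, and the latin-1 encoding is chosen only because it can decode any byte, not because the data files are known to use it.

```python
from collections import Counter


def compare_outputs(actual_lines, expected_lines):
    """Compare two sequences of output lines, ignoring their order.

    Returns a pair (missing, unexpected): lines that are expected but
    absent from the actual output, and lines that are present in the
    actual output but not expected. Duplicate lines are counted, so a
    line that appears too often or too rarely is reported as well.
    """
    actual = Counter(line.rstrip("\r\n") for line in actual_lines)
    expected = Counter(line.rstrip("\r\n") for line in expected_lines)
    missing = sorted((expected - actual).elements())
    unexpected = sorted((actual - expected).elements())
    return missing, unexpected


def compare_files(actual_path, expected_path):
    """Apply compare_outputs to two files on disk.

    Assumption: latin-1 is used only so that every byte decodes;
    the real encoding of the data files is not stated here.
    """
    with open(actual_path, encoding="latin-1") as actual_file, \
         open(expected_path, encoding="latin-1") as expected_file:
        return compare_outputs(actual_file, expected_file)
```

For example, if your program's output has been saved in a file called my.out (a hypothetical name), then compare_files("my.out", "z1000.out") returns two empty lists exactly when the two files contain the same lines in any order.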
Constraints
- Please make sure your program runs on Perl 5.003, Python 1.5.2, Tcl 8.0.2, or Rexx as of
Regina 0.08g, respectively.
It will be executed on a Solaris platform (SunOS 5.7), running on a Sun
Ultra-II, but should be platform-independent.
- Please use only a single source program file, not several files, and give
that file the name phonecode.xx (where xx is whatever suffix is common for
your programming language).
- Please do not over-optimize your program. Deliver your first reasonable
solution.
- Please be honest about the work time that you report; there is no point in
cheating.
- Please design and implement the solution alone. If you cooperate with
somebody else, the comparison will be distorted.
Note that this web site will close down on December 18, 1999.
Lutz Prechelt, prechelt@ira.uka.de,
Last modified: Thu Nov 18 12:54:06 MET 1999