(My language is "better" than your language!?)
The "Phonecode" benchmark for comparing programming languages
In the context of a 
controlled experiment on a different issue, I have recently obtained several 
dozen different implementations of the same program written in Java, C, or C++. 
A comparison of these programs found quite interesting results (see "Comparing 
Java vs. C/C++ efficiency differences to inter-personal differences", 
Communications of the ACM 42(10):109-112, October 1999): Although the 
differences in memory consumption and runtime between Java and C/C++ were quite 
large, the differences between the individual implementations within each 
language were even larger.
I believe it would be tremendously interesting to see corresponding results 
for many more languages, in particular scripting languages, because all 
benchmarks I have seen so far rely on but a single implementation (per language) 
of each program.
Hence, the purpose of this website is to collect many implementations of this 
same program in scripting languages, in order to compare these languages with 
each other and with the ones mentioned above. The languages in question are 
Perl, Python, Tcl, and Rexx.
The properties of interest for the comparison are 
- programming effort 
- program length 
- program readability/modularization/maintainability 
- elegance of the solution 
- memory consumption 
- run time consumption 
- correctness/robustness 
Interested?
If you are interested in participating in this study, please 
create your own implementation of the Phonecode program (as described 
below) and send it to me by email.
I will collect programs until December 18, 1999. After that date, I 
will evaluate all programs and send you the results. The effort involved 
in implementing phonecode depends on how many mistakes you make along the way. 
In the previous experiment, very good programmers typically finished in about 
3 to 4 hours, and average ones typically took about 6 to 12 hours. If anything 
went badly wrong, it took much longer, of course; the original experiment saw 
times of over 20 hours for about 10 percent of the participants. On the other 
hand, the problem should be much easier to solve in a scripting language than 
in Java/C/C++, so you can expect much less effort than indicated above. 
Still interested?
Great! The procedure is as follows: 
- Read the task description for the "phonecode" benchmark. This describes what 
  the program should do. 
- Download 
    - the small test dictionary test.w, 
    - the small test input file test.t, 
    - the corresponding correct results test.out, 
    - the real dictionary woerter2, 
    - a 1000-input file z1000.t, 
    - the corresponding correct results z1000.out, 
    - or all of the above together in a single zip file. 
- Fetch this program header, fill it in, convert it to the appropriate comment 
  syntax for your language, and use it as the basis of your program file. 
- Implement the program, using only a single file. (Make sure you measure the 
  time you take separately for design, coding, and testing/debugging.) Once it 
  runs, test it using only test.w, test.t, and test.out until it works for 
  this data. Then, and only then, start testing it with woerter2, z1000.t, and 
  z1000.out. This restriction is necessary because a similar ordering was 
  imposed on the subjects of the original experiment as well; in any case, 
  using the large data earlier would not help.
- A note on testing: 
    - Make sure your program works correctly. When fed with woerter2 and 
      z1000.t, it must produce the contents of z1000.out (except for the 
      ordering of the outputs). To compare your actual output to z1000.out, 
      sort both and compare line by line (using diff, for example). 
    - If you find any differences but are convinced that your program is 
      correct and z1000.out is wrong with respect to the task description, 
      then re-read the task description very carefully. Many people 
      misunderstand one particular point. (I absolutely guarantee that 
      z1000.out is appropriate for the given requirements.) If (and only if!) 
      you still don't find your problem after re-reading the requirements very 
      carefully, then read this hint.
- Submit your program by email to prechelt@ira.uka.de, using 
  "Subject: phonecode submission" and preferably inserting your program as 
  plain text (but make sure that your email software does not insert 
  additional line breaks!).
- Thank you! 
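The order-insensitive comparison described in the note on testing can also be done with a small script instead of sort and diff. The following is only a minimal sketch in present-day Python 3 (not the Python 1.5.2 named in the constraints); the function names and the output file name `my.out` are assumptions made for illustration and are not part of the benchmark materials. Files are read as raw bytes so that no assumption about the dictionary's character encoding is needed.

```python
from collections import Counter


def read_lines(path):
    """Read a file as raw bytes and return its lines without line endings."""
    with open(path, "rb") as f:
        return [line.rstrip(b"\r\n") for line in f]


def diff_outputs(actual, expected):
    """Compare two outputs while ignoring the order of their lines.

    Returns (missing, unexpected): lines that occur more often in
    `expected` than in `actual`, and vice versa, each in sorted order.
    Both lists are empty exactly when the sorted outputs are identical.
    """
    actual_counts = Counter(actual)
    expected_counts = Counter(expected)
    missing = sorted((expected_counts - actual_counts).elements())
    unexpected = sorted((actual_counts - expected_counts).elements())
    return missing, unexpected


def compare_files(actual_path, expected_path):
    """Print the differences between two output files; return True if none."""
    missing, unexpected = diff_outputs(read_lines(actual_path),
                                       read_lines(expected_path))
    # latin-1 is used only for display; it can decode any byte sequence.
    for line in missing:
        print("missing:   ", line.decode("latin-1"))
    for line in unexpected:
        print("unexpected:", line.decode("latin-1"))
    return not missing and not unexpected


# Hypothetical usage, assuming your program's output was saved as my.out:
#     compare_files("my.out", "z1000.out")
```

Counting lines rather than sorting and diffing gives the same verdict, and it reports duplicated or missing result lines separately, which may make a mismatch easier to trace.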
Constraints
- Please make sure your program runs on Perl 5.003, Python 1.5.2, Tcl 8.0.2, 
  or Rexx as of Regina 0.08g, respectively. It will be executed on a Solaris 
  platform (SunOS 5.7), running on a Sun Ultra-II, but should be 
  platform-independent. 
- Please use only a single source program file, not several files, and give 
  that file the name phonecode.xx (where xx is whatever suffix is common for 
  your programming language). 
- Please do not over-optimize your program. Deliver your first reasonable 
  solution. 
- Please be honest about the work time that you report; there is no point in 
  cheating. 
- Please design and implement the solution alone. If you cooperate with 
  somebody else, the comparison will be distorted. 
  
Note that this web site will close down on December 18, 1999. 
Lutz Prechelt, prechelt@ira.uka.de, 
Last modified: Thu Nov 18 12:54:06 MET 1999