What makes cover-up preferable to error handling

October 2nd, 2009

There was a Forth tutorial – which I now fail to find – that literally had a "crash course" right at the beginning, where you were shown how to crash a Forth interpreter. Not much of a challenge – `echo 0 @ | pforth` does the trick for me – but I liked the way of presentation: "now we've learned how to crash, no need to return to that in the future".

So, let's have a Python & Perl crash course – do something illegal and see what happens. We'll start with my favorite felony – out-of-bounds array access:

python -c 'a=(1,2,3); print "5th:",a[5]'
perl -e '@a=(1,2,3); print "5th: $a[5]\n"'

The output:

5th:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
IndexError: tuple index out of range
5th:

Python in fact crashed, telling us it didn't like the index.

Perl was more kind and rewarded our out-of-bounds index with what looks like the empty string. Being the kind of evildoer who's only further provoked by the gentle reactions of a do-gooder, what shall we do to further harass it? Well, it looks like anything makes a good index (and I mean anything: if @a=(11,22), then $a["1"]==22 and $a["xxx"]==11). But perhaps some things don't make good arrays.
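
To see the coercion at work, here's a demo of my own in the same one-liner style ("xxx" numifies to 0 and "1" to 1, hence the elements we get back):

perl -e '@a=(11,22); print $a["1"], " ", $a["xxx"], "\n"'

This prints "22 11" – anything really does make an index.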

python -c 'x=5; print "2nd:",x[2]'
perl -e '$x=5; print "2nd: $x[2]\n"'

Output:

2nd:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: 'int' object is unsubscriptable
2nd:

Python gives us its familiar elaborate complaints, while Perl gives us its familiar laconic empty string. Its kindness and flexibility are such that in a numeric context, it would helpfully give us the number 0 – the empty string is what we get in a string context.
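
Here's a demo of my own showing both contexts at once ($x[2]+1 forces numeric context, the interpolation forces string context):

perl -e '$x=5; print $x[2]+1, " [$x[2]]\n"'

This prints "1 []".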

Is there any way to exceed the limits of Perl's patience? What about nonsensical operations – I dunno, concatenating a hash/dictionary/map/whatever you call it and a string?

python -c 'map={1:2,3:4}; print "map+abc:",map+"abc"'
perl -e '%map=(1,2,3,4); print "map+abc: ",%map . "abc\n"'

Output:

map+abc:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict' and 'str'
map+abc: 2/8abc

Python doesn't like our operands but Perl retains its non-judgmental frame of mind. A Perl detractor could point out that silently converting %map to "2/8" (used hash buckets/allocated buckets) is patently insane. A Perl aficionado could point out that Perl seems to be following Python's motto "Explicit is better than implicit" better than Python itself. In Python you can't tell the type of map at the point of usage; the Perl code clearly states it's a hash with %, and moreover . specifically means string concatenation (as opposed to +). So arguably you get what you asked for. Well, the one thing that is not debatable is that we still can't crash Perl.

OK, so Perl is happy with indexes which aren't and it is happy with arrays which aren't, and generally with variables of some type which aren't. What about variables that simply aren't?

python -c 'print "y:",y'
perl -e 'print "y: $y\n"'

Output:

y:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name 'y' is not defined
y:

NameError vs that hallmark of tolerance, the empty string, a helpful default value for a variable never defined.

By the way, I think this is how $x[5] evaluates to an empty string when $x isn't an array: $x[5] is unrelated to the scalar variable $x – it looks for the array variable @x, which lives in another namespace. There's no @x, so you get an empty array, which has no 5th element, so you get "". I think I understand it all, except for one thing: is there any way at all to disturb the divine serenity of this particular programming language?
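
A demo of my own making the namespaces visible – $x, @x and %x are three unrelated variables living side by side:

perl -e '$x="scalar"; @x=(10,20,30); %x=(key=>"val"); print "$x $x[1] $x{key}\n"'

This prints "scalar 20 val".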

python -c 'a=0; print 1/a'
perl -e '$a=0; print 1/$a'

This finally manages to produce:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero
Illegal division by zero at -e line 1.

The second message about "illegal" division by zero (is there any other kind?) comes from our no longer tolerant friend, making me wonder. What is so special about division by zero? Why not be consistent with one's generally calm demeanor and return something useful like "" or 0? It would be perfectly reasonable – I did it myself, or more accurately, asked to have a hardware divider return zero in these cases, because there wasn't what hardware calls exception handling (having the processor jump to a handler in the middle of an instruction stream). We lived happily ever after, so what's wrong with 0?

But the real question is, what explains the stunning difference between Perl's and Python's character? Is it philosophical, "There's More Than One Way To Do It (TMTOWTDI)" vs "There should be one – and preferably only one – obvious way to do it; Although that way may not be obvious at first unless you're Dutch" (actual Perl and Python mottos, respectively)? The latter approach encourages you to classify program behavior as "erroneous", where the former tends to instead assume you're knowingly doing something clever in Yet Another Possible Way, right?

Modernism vs Postmodernism, maybe, as outlined by Perl's author in "Perl, the first postmodern computer language"? "Perl is like the perfect butler. Whatever you ask Perl to do, it says `Very good, sir,' or `Very good, madam.' ... Contrast that with the Modern idea of how a computer should behave. It's really rather patronizing: `I'm sorry Dave. I can't allow you to do that.'" The latter can be illustrated by Python's way of answering the wish of its many users to use braces rather than indentation for scoping:

>>> from __future__ import braces
SyntaxError: not a chance

So, "Very good, sir" vs "I can't allow you to do that". Makes sense with Python vs Perl, but what about, say, Lisp vs C++?

Lisp definitely has a "There's More Than One Way To Do It" motive in its culture. Look how many control flow facilities it has compared to Python – and on top of that people write flow macros, and generally "if you don't like the way built-in stuff works, you can fix it with macros", you know the drill. Guessing the user's intent in ambiguous situations? Perl says that it "uses the DWIM (that's "Do What I Mean") principle" for parsing, borrowing a term from Lisp environments. And yet:

(print (let ((x (make-array 3))) (aref x 5)))
*** - AREF: index 5 for #(NIL NIL NIL) is out of range
(print (let ((x 5)) (aref x 2)))
*** - AREF: argument 5 is not an array
(print (let ((m (make-hash-table))) (concatenate 'string m "def")))
*** - CONCATENATE: #S(HASH-TABLE :TEST FASTHASH-EQL) is not a SEQUENCE
(print y)
*** - EVAL: variable Y has no value
(print (/ 1 0))
*** - division by zero

5 out of 5, just like Python. Contrast that with C++ which definitely has a Bondage and Discipline culture, what with all the lengthy compiler error messages. Actually C++ would score 4 out of 5 on this test, but the test is a poor fit for statically typed languages. A more appropriate way to evaluate C++'s error handling approach would be to focus on errors only detectable at run time. The following message from The UNIX-HATERS Handbook records the reaction of a lisper upon his encounter with this approach:

Date: Mon, 8 Apr 91 11:29:56 PDT
From: Daniel Weise
To: UNIX-HATERS
Subject: From their cradle to our grave.

One reason why Unix programs are so fragile and unrobust is that C coders are trained from infancy to make them that way. For example, one of the first complete programs in Stroustrup's C++ book (the one after the "hello world" program, which, by the way, compiles into a 300K image), is a program that performs inch-to-centimeter and centimeter-to-inch conversion. The user indicates the unit of the input by appending "i" for inches and "c" for centimeters. Here is the outline of the program, written in true Unix and C style:

#include <stream.h>

main() {
  [declarations]
  cin >> x >> ch;
    ;; A design abortion.
    ;; This reads x, then reads ch.
  if (ch == 'i') [handle "i" case]
  else if (ch == 'c') [handle "c" case]
  else in = cm = 0;
    ;; That's right, don't report an error.
    ;; Just do something arbitrary.
[perform conversion] }

Thirteen pages later (page 31), an example is given that implements arrays with indexes that range from n to m, instead of the usual 0 to m. If the programmer gives an invalid index, the program just blithely returns the first element of the array. Unix brain death forever!

You could say that the sarcasm in the Lisp-style comments proudly intermixed with C++ code is uncalled for, since example programs are just that – example programs. As to the dreaded out-of-bounds array access cited in the second example, well, C++ doesn't handle that to avoid run time overhead.

But the cited program didn't just ignore the range problem the way C would – it went to the trouble of checking the index and then helpfully returned the 0th element the way Perl would. Probably as part of its illustration of how in C++ you can have custom array types which aren't like C arrays. But why Postmodern Perl arrays, in a generally Disciplined language?

Well, it was 1991 and C++ exceptions were very young, likely younger than the cited example programs. (They've since aged, but haven't improved to the point where most library users would be happy to have to catch them – hence many library writers don't throw them.)

Likewise, Perl had all the features used in the examples above before it had exceptions – or more accurately before it had them under the spotlight, if I understand this correctly. (In Perl you handle exceptions by putting code in an eval { ... } block and then calls to the die() function jump to the end of that block, saving die's argument to $@ – instead of, well, dying. I think Perl had this relatively early; however it seems to only have become idiomatic after Perl 5's OO support and an option to send exception objects to $@, with people using die strictly for dying prior to that.) Perhaps Perl's helpful interpretations of obvious nonsense like $a["xxx"] aren't that helpful after all, but what would you rather have it do – die()?
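
For reference, here's the mechanism spelled out – a minimal sketch of mine, with a made-up file name:

eval {
    open my $fh, '<', 'no-such-file' or die "can't open: $!\n";
    # code after a die() in the block is never reached
};
if ($@) {
    print "caught: $@";    # $@ holds die's argument
}

Pass a blessed object to die() instead of a string and you get the exception classes mentioned above.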

AFAIK Python had exceptions under the spotlight from the beginning – although, similarly to Perl, it had exception strings before it had exception classes. And in fact it does its best to adhere to its "Errors should never pass silently" philosophy, the few deviations that come to mind having to do with Boolean contexts – the falsehood of empty strings, None and 0, together with None!=False, 0==False, 1==True, 2!=True and similar gateways to pornographic programming.

Lisp has conditions and restarts which blow exceptions out of the water, hence its willingness to report errors isn't surprising. However, it gained these features in the 80s; what did previous dialects do? ERRORSET, which is similar to a try/catch block, appears to predate the current error handling system, but it doesn't seem to have been there from the very beginning, either. I'm not familiar with the Lisp fossil record, but there's a function for indexing lists called NTH which returns NIL given an out-of-bounds index. Lists definitely predate arrays, so I assume NTH likewise predates AREF, which complains given a bad index. Perhaps NTH doesn't complain about bad indexes since it also predates ERRORSET and any other form of exception handling?

The pattern seems to be: if the language has exceptions, most of its features and libraries handle errors. If it doesn't, they don't; errors are covered up.

(Although I won't be surprised if I'm wrong about the Lisp history part, because Lisp is generally much more thoughtful than I am – just look at all the trouble Interlisp, by now an ancient dialect, went to in order to figure out whether a user wants to get an opportunity to fix an error manually or would rather have the program silently return to the topmost ERRORSET.)

awk and early PHP lack exceptions and are happy with out-of-bounds array access. Java and Ruby have exceptions and you'll get one upon such access. It isn't just the culture. Or is it? Perl is PHP's father and awk is PHP's grandfather. *sh and make, which, like Perl, produce an empty string from $nosuchvar, aren't good examples, either – sh is Perl's mother and make is Perl's sister. Is it really the Unix lineage that is at fault, as suggested by a dated message to a late mailing list?

Here's JavaScript, an offspring of Lisp:

>>> [1,2,3][5]+5
NaN
>>> [1,2,3][5]+"abc"
"undefinedabc"

I think this definitely rivals Perl. Apparently it's not the lineage that is the problem – and JavaScript didn't have exceptions during its first 2 years.

The thing is, errors are exceedingly gnarly to handle without exceptions. Unless you know what to do at the point where the error is detected, and you almost never know what to do at the point where the error is detected, you need to propagate a description of the error up the call chain. The higher a function sits up the call chain, the more kinds of errors it will have to propagate upwards.
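
Here's roughly what that looks like – a sketch of mine with made-up names, where every layer tests, annotates and forwards, and where teaching parse_config a new failure mode means revisiting every caller up the chain:

sub parse_config {
    open my $fh, '<', $_[0] or return (undef, "open $_[0]: $!");
    # ...parse...
    return ({port => 8080}, undef);
}

sub init_server {
    my ($conf, $err) = parse_config("app.conf");
    return (undef, "init_server: $err") if defined $err;    # forward, adding context
    # ...sockets, threads – each with more errors to forward...
    return ($conf, undef);
}

my ($server, $err) = init_server();
print STDERR "fatal: $err\n" if defined $err;    # someone, somewhere, finally decides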

(With a C++ background it can look like the big problem doesn't come from function calls but from operators and expressions which syntactically have nowhere to send their complaints. But a dynamic language could have those expressions evaluate to a special Error object just as easily as it can produce "" or "undefined". What this wouldn't solve is the need to clutter control flow somewhere down the road when deciding which control path to take or what should go to files or windows, now that every variable can have the value Error tainting all computations in its path.)
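
To make that parenthetical concrete, here's a toy of mine – an Error value which, NaN-style, taints every computation in its path (not an actual Perl feature, just a sketch of that hypothetical design):

package Error;
use overload '+'  => sub { $_[0] },      # arithmetic involving an Error yields the Error
             '.'  => sub { $_[0] },      # and so does concatenation
             '""' => sub { '<error>' };  # what you see if you print one anyway
sub new { bless {}, shift }

package main;
my $x = (Error->new() + 5) . "abc";      # still the Error, quietly tainting the result
print ref($x) eq 'Error' ? "tainted\n" : "$x\n";    # every consumer must check

The checks at the points where control flow or I/O decisions are made are exactly the clutter we wanted to avoid.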

Different errors carry different meta-data with them – propagating error codes alone ought to be punishable by death. What good is a "no such file" error if I don't know which file it is? What good is a "network layer malfunction" error if I can't tell that in fact it's a "no such file" error at the network layer because a higher level lumps all error codes from a lower level into one code? (It always lumps them together since the layers have different lifecycles and the higher level can't be bothered to update its error code list every time the lower level does.)
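
The lumping, spelled out in a sketch of mine (layer names made up): by the time the error reaches someone who could report it usefully, the file name – and the fact that it was a file problem at all – is gone:

use constant ERR_NETWORK => -1;    # the higher layer's single code for everything below

sub transport_open {    # the lower layer knows everything...
    open my $fh, '<', $_[0] or return "no such file: $_[0]";
    return undef;
}

sub network_open {      # ...the higher layer reports "network layer malfunction"
    return ERR_NETWORK if defined transport_open($_[0]);
    return 0;
}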

Different meta-data means, importantly for static languages, different types of output objects from callee functions depending on the error, which can only be handled in an ugly fashion. But even if there's no such problem, you have to test every function call for errors. If a function didn't originally return error information but now does, you have to change all points of call.

Nobody ever does it.

The fact that apparently all programming languages without exception handling cover up errors at the language level is, to me, a kind of upper bound – consider who designs programming languages.

A designer of a successful programming language, whatever you think of that language, is definitely an above-average programmer – actually, I think one that can safely be assumed to occupy the top tenth no matter how you rate programming ability (and really, how do you?). Moreover, the language designer is also in the top tenth if programmers are sorted by the extent of importance they assign to their work and by their motivation to get things right – because the problems are fundamental, because of the ego trip, because of everything.

And still, once decision or chance leads them into a situation where the language has no exceptions, they allow themselves to have errors covered up throughout their cherished system – despite the obvious pain it causes them, as is evident from the need for rationalizations, ranging from run time overhead (as if you couldn't have an explicit unsafe subset for that – see C#) to the Postmodern Butler argument ("Fuck you plenty? Very good, sir!").

What does this leave us to expect from average programs? The ones written in the order of execution, top to bottom, with the same intellectual effort that goes into driving from point A to point B? The ones not exposed to the kind of massive testing/review that could exterminate bugs in a flophouse?

This is why nothing inspires my trust in a program like a tendency to occasionally wet its pants with an error call stack. Leaving some exceptions uncaught proves the programmers are somewhat sloppy but I wouldn't guess otherwise anyway. What's more important is that because of exceptions along the road, the program most likely won't make it to the code where shooting me in the foot is implemented. And if a program never spills its guts this way, especially if it's the kind of program with lots of I/O going on, I'm certain that under the hood, it's busy silencing desperate screams: where there are no exceptions, there must be cover-up.