Error codes vs exceptions: critical code vs typical code

September 24th, 2012

Error codes or exceptions – which is better? Here's my answer:

They have the same worst case – a human error can lead to a complete disaster.
Exceptions are far safer for most code.
Error codes are far safer for well-reviewed, critical code.

(As you can see from 2 and 3, I believe that most code is not critical and/or poorly reviewed; I think most people will agree on that one.)

Worst case: disaster

Here's a disaster with error codes (based on an example from a critique of Go's error handling):

seal_presidential_bunker()
trigger_doomsday_device()

If we fail to seal_presidential_bunker, we still trigger_doomsday_device, because the programmer forgot to check the error code. Human error has lead to a disaster.

(The original article doesn't specify exactly what the disaster is. One problem is that the presidential staff is not safe. Another problem is that the doomsday device got triggered – which wouldn't happen if an exception were thrown and left uncaught. Which of the two problems is the bigger part of the disaster depends on your worldview.)

Here's a disaster with exceptions.

open_the_gate()
wait_for_our_men_to_come_in()
close_the_gate()

If wait_for_our_men_to_come_in throws an exception, then we'll never close_the_gate, and the enemy will sneak in. Again – human error, disaster.

So in theory, exceptions and error codes are equally bad.

Exceptions are safer for most code

Most code doesn't trigger doomsday devices, nor deals with lethal enemies at the gates. When most code messes up, garbage appears on the screen or in log files, and a programmer shows up to debug the problem.

With exceptions, it's easier for the programmer to figure out why this garbage appeared, because the failure occurs closer to the point of the error.

f=open_users_file()
print_users_list(f)

If open_users_file() throws an exception, then the programmer will see a "No such file or directory" with a call stack and think, "why couldn't this idiot [possibly, me] bother to check if the file is there?" Then he fixes the bug and all is well again.

If open_users_file() returns an invalid file object (similarly to, for example, C++'s ifstream), then print_users_list (which doesn't check errors, either) might print an empty user list. The error might then become "No such user", or "Permission denied", etc. The program will fail further from the point of error – the file opening code – and you'll need to go back and figure out where the error is.

For production code, failing early isn't necessarily better. Failing early is what leaves the gate open for the enemies in the above example. Failing early due to a floating point error – instead of trying further just in case – was reportedly the root cause of the explosion of Ariane 5, costing $.5G.

But for most code, which:

doesn't lead to such large damages
is written rather hastily
is expected to have bugs
has programmers constantly attending to it and fixing those bugs

...for most code, failing early is better simply because it always makes debugging easier – even if it doesn't make the impact of the error smaller.

Error codes have another horrible anti-debugging quality: loss of information. Even if the program fails early with error codes, you usually only get the code of the topmost layer without all the details from below.

With an exception, you get a call stack, and an error string from the bottom layer. With a perror(), you get just an error string from the bottom layer ("No such file or directory" – which file? Who wants it?). With error codes, you get something like "User list management error" – the fact that it was a file opening error gets "swallowed" by layers of code converting low-level error codes to high-level ones.

It's possible to collect an "error code call stack" with all the information, but it's almost never done. Whereas an exception does it automatically for the laziest of programmers. Another win for whoever gets to debug the code.

Error codes are safer for well-reviewed code

Code reviews are generally easier with error codes than exceptions. Error codes mean that you must carefully look at function calls to see if the programmer handled the possible errors. Exceptions mean that you must imagine what happens if an exception is thrown anywhere in the flow.

Making sure that the opening of every gate is exception-safe – that the gate gets closed when an exception is thrown – is hard. C++ has RAII for the gate-closing (and Python has with and C# has using), and Java has checked exceptions for the exception-hunting.

But even if you have both and then some, it still seems hard. A program has a lot of intermediate states, and some of them don't make sense. An exception can leave you in this intermediate state. And it's not easy to wrap every entrance into an intermediate state using whatever exception-safety-wrapper your language gives you.

So I think it makes sense for Go – a language for writing critical production code – to shun exceptions. We use C++ with -fno-exceptions for serious production code and I think it's equally sensible.

It just doesn't make sense to write most of your code that way. In most of my code, I want to always fail early to make debugging easier, seeing the full context of the error, and I want that to happen without putting much thought into error handling.

And this is why I think exceptions should be embraced by all lazy programmers writing low-quality code like myself.