Error handling prelude

"From then on, when anything went wrong with a computer, we said it had bugs in it."

- Grace Hopper

Writing programs that behave well under expected conditions is a good start. The real challenge begins when a program encounters unexpected situations. Proper error handling is an important but often overlooked practice in software development. Errors, in general, fall into three categories:

  • Recoverable errors that are expected to happen due to the user and the environment interacting with the program, for example, a file not found error or a number parse error.
  • Non-recoverable errors that violate the contracts or invariants of the program, for example, index out of bounds or divide by zero.
  • Fatal errors that abort the program immediately, for example, running out of memory or a stack overflow.

Programming in the real world often entails dealing with errors. Examples include malicious input to a web application, connection failures in network clients, filesystem corruption, and integer overflow errors in numerical applications. Without any error handling, the program simply crashes or is aborted by the OS when it hits an unexpected situation. Most of the time, this is not the behavior we want our programs to exhibit in unexpected situations. Consider, for example, a real-time stream processing service that stops receiving messages from its clients because it failed to parse a malformed message sent by one of them. If we have no way to handle this, our service will abort every time a parsing error occurs. This is not good from a usability perspective and is definitely not a characteristic of a robust network application. The ideal way for the service to handle this situation is to catch the error, act upon it, pass the error log to a log-aggregation service for later analysis, and continue receiving messages from other clients. That's where a recoverable way of handling errors comes into the picture, and it is often the practical way to model error handling. In this case, the language's error handling constructs enable programmers to intercept errors and act on them, which saves the program from being aborted.
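The recoverable behavior described above can be sketched in Rust as follows. This is a minimal illustration under stated assumptions, not a real service: the message format (one integer per message), the names parse_message and process_stream, and the in-memory error log standing in for a log-aggregation service are all hypothetical.

```rust
use std::num::ParseIntError;

/// Hypothetical message format: each message is a single integer reading.
fn parse_message(raw: &str) -> Result<i64, ParseIntError> {
    raw.trim().parse::<i64>()
}

/// Processes every message, recording malformed ones instead of aborting.
/// Returns the parsed readings and the collected error log lines.
fn process_stream(messages: &[&str]) -> (Vec<i64>, Vec<String>) {
    let mut readings = Vec::new();
    let mut error_log = Vec::new();
    for raw in messages {
        match parse_message(raw) {
            Ok(value) => readings.push(value),
            // Stand-in for forwarding the error to a log-aggregation service.
            Err(e) => error_log.push(format!("malformed message {:?}: {}", raw, e)),
        }
    }
    (readings, error_log)
}

fn main() {
    let incoming = ["10", "oops", " 25 ", ""];
    let (readings, error_log) = process_stream(&incoming);
    println!("readings: {:?}", readings);
    for line in &error_log {
        eprintln!("{}", line);
    }
}
```

The malformed messages end up in the error log, while the well-formed ones before and after them are still processed.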

Two paradigms that are quite popular when approaching error handling are return codes and exceptions. The C language embraces the return code model. This is a very simple form of error handling, where functions use their return values to signify whether an operation succeeded or failed. A lot of C functions return -1 or NULL in the event of an error. For errors when invoking system calls, C sets the global errno variable upon failure. But, being a global variable, nothing stops us from modifying errno from anywhere in the program. It's then up to the programmer to check for this error value and handle it. Often, this approach gets really cryptic and error-prone, and it is not a very flexible solution. The compiler does not warn us if we forget to check the return value either; catching that mistake usually takes a static analysis tool.
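To keep the examples in one language, the return-code convention can be mimicked in Rust purely for illustration. This is deliberately non-idiomatic Rust, and find_index is a hypothetical function, not part of any library. It shows how a sentinel value such as -1 can slip through unchecked.

```rust
/// Hypothetical C-style API: returns the index of `needle`, or -1 on failure.
fn find_index(haystack: &[i32], needle: i32) -> i32 {
    match haystack.iter().position(|&item| item == needle) {
        Some(i) => i as i32,
        None => -1, // sentinel value meaning "not found"
    }
}

fn main() {
    let data = [3, 7, 11];

    // Correct usage: the caller remembers to test for the sentinel.
    let idx = find_index(&data, 7);
    if idx == -1 {
        eprintln!("7 not found");
    } else {
        println!("7 is at index {}", idx);
    }

    // Easy mistake: nothing forces the check, so -1 flows on as if it were data.
    let unchecked = find_index(&data, 42);
    println!("next index would be {}", unchecked + 1);
}
```

Because the error shares a type with the valid results, the second call compiles and runs without complaint even though its result is meaningless.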

Another approach to handling errors is via exceptions. Higher-level programming languages such as Java and C# use this form of error handling. In this paradigm, code that might fail should be wrapped in a try {} block, and any failure within the try {} block must be caught in a catch {} block (ideally, with the catch block immediately after the try block). But exceptions also have their downsides. Throwing an exception is expensive, as the program has to unwind the stack, find the appropriate exception handler, and run the associated code. To avoid this overhead, programmers often adopt a defensive coding style: they check up front for the conditions that would cause an exception, and only then proceed. Also, the implementation of exceptions is flawed in many languages, because it allows careless programmers to swallow exceptions with a catch-all block on a base exception class such as Throwable in Java. If they just log and ignore the exception, the program may be left in an inconsistent state. Furthermore, in these languages there is no way for a programmer to know by looking at the code whether a method could throw an exception, unless the method uses checked exceptions. This makes it hard to write safe code, and programmers often need to rely on the documentation of methods (if it exists at all) to figure out whether they could throw an exception.

Rust, on the other hand, embraces type-based error handling, which is seen in functional languages such as OCaml and Haskell, and which at the same time resembles C's model of returning error codes. But in Rust, the return values are proper error types and can be user-defined. The language's type system mandates handling error states at compile time. If you know Haskell, this is quite similar to its Maybe and Either types; Rust just has different names for them: Option and Result, which are used for recoverable errors. For non-recoverable errors, there's a mechanism called panic, which is a fail-hard error handling strategy. It is advisable to use it only as a last resort, when there is a bug or a violation of an invariant in the program.
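As a first taste of these types, here is a minimal sketch. Option and Result come from Rust's standard library; the ConfigError type and the find_value and read_port functions are hypothetical names invented for this illustration.

```rust
use std::fmt;

/// Hypothetical user-defined error type for this sketch.
#[derive(Debug, PartialEq)]
enum ConfigError {
    MissingKey(String),
    InvalidPort(String),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(key) => write!(f, "missing key: {}", key),
            ConfigError::InvalidPort(raw) => write!(f, "invalid port: {}", raw),
        }
    }
}

impl std::error::Error for ConfigError {}

/// Option models a value that may be absent; absence alone is not an error.
fn find_value<'a>(pairs: &[(&'a str, &'a str)], key: &str) -> Option<&'a str> {
    pairs.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
}

/// Result models an operation that may fail with a typed error.
fn read_port(pairs: &[(&str, &str)]) -> Result<u16, ConfigError> {
    // Turn "absent" into a concrete error, and return early with `?` if so.
    let raw = find_value(pairs, "port")
        .ok_or_else(|| ConfigError::MissingKey("port".to_string()))?;
    raw.parse::<u16>()
        .map_err(|_| ConfigError::InvalidPort(raw.to_string()))
}

fn main() {
    let config = [("host", "localhost"), ("port", "8080")];
    // The caller has to go through the Result to reach the port value.
    match read_port(&config) {
        Ok(port) => println!("listening on port {}", port),
        Err(e) => eprintln!("configuration error: {}", e),
    }
}
```

The port value is only reachable by going through the Result, so the failure case cannot be overlooked the way a sentinel return code can. Calling unwrap() on an Err value instead would trigger a panic, which is the fail-hard path described above.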

Why did Rust choose this form of error handling? Well, as we have already said, exceptions and their associated stack unwinding have an overhead. This goes against Rust's central philosophy of avoiding hidden runtime costs. Secondly, exception-style error handling, as it is typically implemented, allows errors to be ignored via catch-all exception handlers. This creates the potential for program state inconsistency, which goes against Rust's safety tenet.

With the prelude out of the way, let's dig into some recoverable error handling strategies!

