Python – Are exceptions for flow control best practice in Python

exceptions, programming-practices, python

I'm reading "Learning Python" and have come across the following:

User-defined exceptions can also signal nonerror conditions. For
instance, a search routine can be coded to raise an exception when a
match is found instead of returning a status flag for the caller to
interpret. In the following, the try/except/else exception handler
does the work of an if/else return-value tester:

class Found(Exception): pass

def searcher():
    if ...success...:
        raise Found()            # Raise exceptions instead of returning flags
    else:
        return

Because Python is dynamically typed and polymorphic to the core,
exceptions, rather than sentinel return values, are the generally
preferred way to signal such conditions.

I've seen this sort of thing discussed multiple times on various forums, and
references to Python using StopIteration to end loops, but I can't find much
in the official style guides (PEP 8 has one offhand reference to exceptions
for flow control) or statements from developers. Is there anything official
that states this is best practice for Python?

This question ("Are exceptions as control flow considered a serious antipattern? If so, Why?") also has several commenters stating that this style is Pythonic. What is this based on?

TIA

Best Answer

The general consensus “don't use exceptions!” mostly comes from other languages, and even there it is sometimes outdated.

In C++, throwing an exception is very costly due to “stack unwinding”. Every local variable declaration is like a with statement in Python, and the object in that variable may run destructors. These destructors are executed when an exception is thrown, but also when returning from a function. This “RAII idiom” is an integral language feature and is super important to write robust, correct code – so RAII versus cheap exceptions was a tradeoff that C++ decided towards RAII.
In early C++, a lot of code was not written in an exception-safe manner: unless you actually use RAII, it is easy to leak memory and other resources. So throwing exceptions would render that code incorrect. Avoiding exceptions altogether is no longer a reasonable position, since even the C++ standard library uses them: you can't pretend exceptions don't exist. However, exceptions are still an issue when combining C code with C++.
In Java, every exception has an associated stack trace. The stack trace is very valuable when debugging errors, but is wasted effort when the exception is never printed, e.g. because it was only used for control flow.

So in those languages exceptions are “too expensive” to be used as control flow. In Python this is less of an issue and exceptions are a lot cheaper. Additionally, the Python language already suffers from some overhead that makes the cost of exceptions hard to notice compared to other control flow constructs: e.g. checking whether a dict entry exists with an explicit membership test (if key in the_dict: ...) is generally about as fast as simply accessing the entry (the_dict[key]) and catching the KeyError if it is missing. Some integral language features (e.g. generators) are designed in terms of exceptions.
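As a minimal sketch of the two lookup styles mentioned above (the dictionary contents and function names are made up for illustration), both give the same result. Which one is faster tends to depend on how often the key is missing, so it is worth measuring with timeit if it matters. The last function shows the iterator protocol, where StopIteration signals a non-error condition.

```python
the_dict = {"alice": 1, "bob": 2}  # hypothetical data


def lookup_with_check(key):
    # "Look before you leap": explicit membership test first.
    if key in the_dict:
        return the_dict[key]
    return None


def lookup_with_except(key):
    # "Easier to ask forgiveness": access the entry and handle KeyError.
    try:
        return the_dict[key]
    except KeyError:
        return None


def first_item(iterable):
    # Iterators signal exhaustion by raising StopIteration,
    # an exception used for an ordinary, non-error condition.
    iterator = iter(iterable)
    try:
        return next(iterator)
    except StopIteration:
        return None
```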

So while there is no technical reason to specifically avoid exceptions in Python, there is still the question whether you should use them instead of return values. The design-level problems with exceptions are:

They are not at all obvious. You can't easily look at a function and see which exceptions it may throw, so you don't always know what to catch. A return value tends to be better defined.
Exceptions are non-local control flow, which complicates your code. When you throw an exception, you don't know where control flow will resume. For errors that can't be handled immediately this is probably a good idea, but when you are merely notifying your caller of a condition it is entirely unnecessary.

Python culture is generally slanted in favour of exceptions, but it's easy to go overboard. Imagine a list_contains(the_list, item) function that checks whether the list contains an element equal to item. If the result is communicated via exceptions, that is absolutely annoying, because we have to call it like this:

try:
  list_contains(invited_guests, person_at_door)
except Found:
  print("Oh, hello {}!".format(person_at_door))
except NotFound:
  print("Who are you?")

Returning a bool would be much clearer:

if list_contains(invited_guests, person_at_door):
  print("Oh, hello {}!".format(person_at_door))
else:
  print("Who are you?")

If the function is already supposed to return a value, then returning a special value for special conditions is rather error-prone, because people will forget to check this value (that's probably the cause of 1/3 of the problems in C). An exception is usually more correct.

A good example is a pos = find_string(haystack, needle) function that searches for the first occurrence of the needle string in the haystack string, and returns the start position. But what if the haystack string does not contain the needle string?

The solution chosen by C, and mimicked by Python, is to return a special value. In C this is a null pointer, in Python this is -1. This will lead to surprising results when the position is used as a string index without checking, especially as -1 is a valid index in Python. In C, your NULL pointer will at least give you a segfault.
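The pitfall can be seen with Python's built-in str.find, which returns -1 on failure. The strings below are arbitrary examples. For comparison, str.index performs the same search but raises ValueError when nothing is found.

```python
haystack = "hello world"  # arbitrary example strings
needle = "xyz"

# str.find returns -1 when there is no match.
pos = haystack.find(needle)

# -1 is a valid index, so the unchecked result silently
# selects the last character instead of failing.
unchecked = haystack[pos]

# str.index does the same search but raises ValueError on failure.
try:
    haystack.index(needle)
    index_raised = False
except ValueError:
    index_raised = True
```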

In PHP, a special value of a different type is returned: the boolean FALSE instead of an integer. As it turns out this isn't actually any better due to the implicit conversion rules of the language (but note that in Python as well booleans can be used as ints!). Functions that do not return a consistent type are generally considered very confusing.

A more robust variant would have been to throw an exception when the string can't be found, which makes sure that during normal control flow it is impossible to accidentally use the special value in place of an ordinary value:

 try:
   pos = find_string(haystack, needle)
   do_something_with(pos)
 except NotFound:
   ...
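A runnable sketch of this variant might look as follows. The find_string and NotFound names come from the snippet above, but their implementation here (a thin wrapper around str.find) and the describe helper are assumptions made for illustration.

```python
class NotFound(Exception):
    """Raised when the needle does not occur in the haystack."""


def find_string(haystack, needle):
    # Hypothetical implementation: convert the -1 sentinel
    # of str.find into an exception.
    pos = haystack.find(needle)
    if pos == -1:
        raise NotFound(needle)
    return pos


def describe(haystack, needle):
    # The position can only be used on the success path;
    # a failed search jumps straight to the except clause.
    try:
        pos = find_string(haystack, needle)
        return "found at {}".format(pos)
    except NotFound:
        return "not found"
```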

Alternatively, the function can always return a type that can't be used directly but must first be unwrapped, e.g. a result-bool tuple where the boolean indicates whether the search succeeded and the result is usable. Then:

pos, ok = find_string(haystack, needle)
if not ok:
  ...
do_something_with(pos)

This forces you to handle problems immediately, but it gets annoying very quickly. It also prevents you from chaining functions easily. Every function call now needs three lines of code. Go is a language that thinks this nuisance is worth the safety.
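For illustration, the tuple-returning variant could be sketched like this. The implementation and the text_after caller are hypothetical. Returning None rather than -1 as the position on failure is a design choice of this sketch: accidentally using it as an index then raises a TypeError instead of silently working.

```python
def find_string(haystack, needle):
    # Hypothetical helper: returns (position, ok).
    # The position is only meaningful when ok is True.
    pos = haystack.find(needle)
    if pos == -1:
        return None, False
    return pos, True


def text_after(haystack, needle):
    # The caller has to unpack the tuple and check the flag
    # before the position can be used.
    pos, ok = find_string(haystack, needle)
    if not ok:
        return None
    return haystack[pos + len(needle):]
```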

So to summarize, exceptions are not entirely without problems and can definitely be overused, especially when they replace a “normal” return value. But when used to signal special conditions (not necessarily just errors), then exceptions can help you to develop APIs that are clean, intuitive, easy to use, and difficult to misuse.

Related Solutions

C# – Exceptions, error codes and discriminated unions

but crashing your client's software is still not a good thing

It most certainly is a good thing.

You want anything that leaves the system in an undefined state to stop the system because an undefined system can do nasty things like corrupt data, format the hard drive, and send the president threatening emails. If you cannot recover and put the system back into a defined state then crashing is the responsible thing to do. It's exactly why we build systems that crash rather than quietly tear themselves apart. Now sure, we all want a stable system that never crashes but we only really want that when the system stays in a defined predictable safe state.

I've heard that exceptions as control flow are considered a serious antipattern

That's absolutely true but it's often misunderstood. When they invented the exception system they were afraid they were breaking structured programming. Structured programming is why we have for, while, until, break, and continue when all we need, to do all of that, is goto.

Dijkstra taught us that using goto informally (that is, jumping around wherever you like) makes reading code a nightmare. When they gave us the exception system they were afraid they were reinventing goto. So they told us not to "use it for flow control" hoping we'd understand. Unfortunately, many of us didn't.

Strangely, we don't often abuse exceptions to create spaghetti code as we used to with goto. The advice itself seems to have caused more trouble.

Fundamentally exceptions are about rejecting an assumption. When you ask that a file be saved you assume that the file can and will be saved. The exception you get when it can't might be about the name being illegal, the HD being full, or because a rat has gnawed through your data cable. You can handle all those errors differently, you can handle them all the same way, or you can let them halt the system. There is a happy path in your code where your assumptions must hold true. One way or another exceptions take you off that happy path. Strictly speaking, yeah that's a kind of "flow control" but that's not what they were warning you about. They were talking about nonsense like this:

"Exceptions should be exceptional". This little tautology was born because the exception system designers need time to build stack traces. Compared to jumping around, this is slow. It eats CPU time. But if you're about to log and halt the system or at least halt the current time intensive processing before starting the next one then you have some time to kill. If people start using exceptions "for flow control" those assumptions about time all go out the window. So "Exceptions should be exceptional" was really given to us as a performance consideration.

Far more important than that is not confusing us. How long did it take you to spot the infinite loop in the code above?

DO NOT return error codes.

...is fine advice when you're in a code base that doesn't typically use error codes. Why? Because no one's going to remember to save the return value and check your error codes. It's still a fine convention when you're in C.

OneOf

You're using yet another convention. That's fine so long as you're setting the convention and not simply fighting another one. It's confusing to have two error conventions in the same code base. If somehow you've gotten rid of all code that uses the other convention then go ahead.

I like the convention myself. One of the best explanations of it I found here^*:

But much as I like it I'm still not going to mix it with the other conventions. Pick one and stick with it.¹

¹ By which I mean don't make me think about more than one convention at the same time.

Subsequent thoughts:

From this discussion what I'm taking away currently is as follows:

If you expect the immediate caller to catch and handle the exception most of the time and continue its work, perhaps through another path, it probably should be part of the return type. Optional or OneOf can be useful here.

If you expect the immediate caller to not catch the exception most of the time, throw an exception, to save the silliness of manually passing it up the stack.

If you're not sure what the immediate caller is going to do, maybe provide both, like Parse and TryParse.

It's really not this simple. One of the fundamental things you need to understand is what a zero is.

How many days are left in May? 0 (because it's not May. It's June already).

Exceptions are a way to reject an assumption, but they are not the only way. If you use exceptions to reject the assumption you leave the happy path. But if you choose values to send down the happy path that signal that things are not as simple as was assumed, then you can stay on that path so long as it can deal with those values. Sometimes 0 is already used to mean something, so you have to find another value to map your assumption-rejecting idea onto. You may recognize this idea from its use in good old algebra. Monads can help with that but it doesn't always have to be a monad.

For example²:

IList<int> ParseAllTheInts(String s) { ... }

Can you think of any good reason this must be designed so that it deliberately throws anything ever? Guess what you get when no int can be parsed? I don't even need to tell you.

That's a sign of a good name. Sorry but TryParse is not my idea of a good name.
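For readers following along in Python, the same idea can be sketched as below. This is a rough Python rendering of the ParseAllTheInts signature above; the snake_case name and the regular expression used to pick out integers are assumptions of this sketch, not part of the original.

```python
import re


def parse_all_the_ints(s):
    # Return every integer found in s, in order of appearance.
    # When nothing can be parsed the result is simply an empty list,
    # so callers can stay on the happy path without a try block.
    return [int(token) for token in re.findall(r"[-+]?\d+", s)]


# Code that consumes the result works unchanged for the empty case.
total = sum(parse_all_the_ints("no numbers here"))
```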

We often avoid throwing an exception on getting nothing when the answer could be more than one thing at the same time, but for some reason, if the answer is either one thing or nothing, we get obsessed with insisting that it give us one thing or throw:

IList<Point> Intersection(Line a, Line b) { ... }

Do parallel lines really need to cause an exception here? Is it really that bad if this list will never contain more than one point?

Maybe semantically you just can't take that. If so, it's a pity. But Maybe Monads, that don't have an arbitrary size like List does, will make you feel better about it.

Maybe<Point> Intersection(Line a, Line b) { ... }

The Monads are little special purpose collections that are meant to be used in specific ways that avoid needing to test them. We're supposed to find ways of dealing with them regardless of what they contain. That way the happy path stays simple. If you crack open and test every Monad you touch you're using them wrong.

I know, it's weird. But it's a new tool (well, to us). So give it some time. Hammers make more sense when you stop using them on screws.

If you'll indulge me, I'd like to address this comment:

How come none of the answers clarifies that the Either monad is not an error code, and nor is OneOf? They are fundamentally different, and the question consequently seems to be based on a misunderstanding. (Though in a modified form it’s still a valid question.) – Konrad Rudolph Jun 4 '18 at 14:08

This is absolutely true. Monads are much closer to collections than exceptions, flags, or error codes. They do make fine containers for such things when used wisely.

Why Using Exceptions for Validation is Bad in C#

The first issue I have with using exceptions for validation is that I would typically expect the outcome of the validation to potentially churn up multiple errors with the same data.

Exceptions are useful for non-Happy Path scenarios when some code fails and the best course of action is to unwind the stack as a consequence, passing the error much further up to something which can handle it gracefully and recover or otherwise leave the system in a good state.

Stack unwinding makes exceptions generally unsuitable for validation as I would expect the whole validation process to continue until all checks have been run after any error with the data is found, so that the user is able to have a complete report on the issues in their data from a single POST/PUT request.

Update: As pointed out in the comments, it is also technically possible to treat an exception as a return value by bundling all the validation issues into a single exception and throwing that at the end. However, I wouldn't recommend or advocate this either.

Validation is a user-facing functional feature of a system, and is about identifying issues with data, rather than errors in the behaviour of the system. A validation error being spotted still implies that the system is behaving correctly, whereas an exception should be used to indicate that the system or one of its dependencies is not behaving correctly.
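The approach described here, running every check and reporting all problems at once, can be sketched in a language-neutral way. The example below uses Python rather than C#, and the field names and rules are invented purely for illustration.

```python
def validate_user(data):
    # Hypothetical rules. Every check runs, and all problems are
    # collected into a list instead of raising on the first failure.
    errors = []

    if not data.get("name"):
        errors.append("name is required")

    age = data.get("age")
    if not isinstance(age, int) or age < 0:
        errors.append("age must be a non-negative integer")

    email = data.get("email")
    if not isinstance(email, str) or "@" not in email:
        errors.append("email must contain '@'")

    return errors
```

An empty list means the data passed validation; a non-empty list can be returned to the user as a complete report from a single request.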

C# / ASP.NET Specific stuff:

In the specific case of ASP.NET Core, a possible solution could be to use the Custom Model Validation described here: https://docs.microsoft.com/en-us/aspnet/core/mvc/models/validation?view=aspnetcore-2.2

Another possible alternative could be to use FluentValidation: https://fluentvalidation.net/aspnet
