Python – Type checking as opposed to multiple functions

coding-styledynamic-typingpython

In statically-typed languages such as Java, code such as the following is common (not much of a realistic example, I know):

public String flip(String text) {
    String result = "";
    for (int i = text.length() - 1; i >= 0; i++) {
        result += text.charAt(i);
    } 
    return result;
}

public int flip(int value) {
    return (-1) * value;
}

However in dynamic languages, there's no way to dispatch a specific method based on a parameter's type. So for example in Python 3, we'd have two options on how to do this. First is to check the type inside the function.

def flip(value):
    if isinstance(value, str):
        return value[::-1]
    elif isinstance(value, int):
        return (-1) * value
    else:
        raise TypeError('Expected str or int.')

Second option is to have multiple functions:

def flip_int(value):
    return value * (-1)

def flip_str(text):
    return text[::-1]

(In dynamic languages such as Python we should try to employ duck-typing, but I'm talking about cases where that's not possible, such as the example above).

Which option should I prefer, and why? Also, is there a 'best-practice' for this in Python?

I'm asking especially about Python, but this is relevant I believe to any dynamic language.

Best Answer

Including the type name inside the function is vastly preferable because it allows the programmer to specify precisely what they intend. It also makes your code a lot easier to test. If you do want implicit dispatch depending on the type, you can always offer an optional wrapper. So I would write your code as:

def flip_int(x):
  return -x

def flip_str(x):
  return x[::-1]

def flip(x):
  if isinstance(x, int): return flip_int(x)
  if isinstance(x, str): return flip_str(x)
  raise TypError("flip(x) requires x to be int or str")

Incidentally, you have effectively re-invented method dispatch. This run-time dispatch over function arguments is slightly different from “traditional” method dispatch in that the dispatch is handled by the function that is called, not by the value that the function is called on. In its generalization for more than one argument, this mechanism is known as multi-methods. The rest of this answer discusses argument-based method dispatch in the context of Java and Python.

Your Java example is exhibits function overloading, also known as ad-hoc polymorphism. In most languages, ad-hoc polymorphism is part of static dispatch, i.e. the overload is resolved at compile-time. When your declare two functions

String flip(String);
int flip(int);

the compiler recognizes them as two distinct functions that happen to share a name. So the compiler would store it as something like

flip = flip_String | flip_int;
String flip_String(String);
int flip_int(int);

When the compiler then parses a function call flip(x), it will first resolve the static type of x. Because flip is overloaded, it will try to see which overloads actually match. How that works depends on the exact rules of the language. E.g. some languages allow implicit conversions, so the types might not match exactly. In type systems with subtyping, a type will also match all its base types. This can cause multiple overloads to be legal call targets, so many languages have some kind of ranking method. E.g. calls that don't require conversions are preferred. With subtyping, the most specific type is preferred.

So finally, if the term x has type String, the overload flip_String would be chosen.

Of course, the static nature of this dispatch restricts its expressiveness. E.g. assume that class A inherits from B. We now have two functions void f(A) and void f(B). Given B b = new A(); f(b), the overload chosen will always be f(B) since that's the static type of the variable b, even if its runtime value is of the more specific type A.

Everything[1] that can be done statically in ahead of time compilation can also be done dynamically during runtime. While there might not be a static type in dynamic languages, the dispatch mechanism now has access to the precise dynamic type of a value. This means that runtime overload resolution can be more expressive than static resolution[1]: we can dispatch on the dynamic type of values. The mechanism is identical to the static version of the dispatch: the dispatch mechanism needs a set of available dispatch targets. When the overloaded function is called, the correct overload is selected.

[1]: Some languages allow return-type polymorphism, which is admittedly a bit difficult to do in dynamic languages unless the language is also lazy and supports symbolic execution.

This dynamic overload resolution is related to method dispatch. When we have a class A that is a subtype of class B and x is an instance of either of those classes, then x.foo() can do something entirely different depending on what its dynamic type is. The visitor pattern can be used to generalize this single dispatch to double dispatch, where a call x.visit(y) is resolved based on the dynamic type of both x and y. This can be generalized to multimethods, where a function is dispatched on the dynamic type of all its arguments.

Unsurprisingly, Python already has a package for multimethods:

from multimethods import multimethod

@multimethod(int)
def flip(x):
  return -x

@multimethod(str)
def flip(x):
  return x[::-1]

print(flip(42))
print(flip("foo"))

Some languages use multimethods as their primary dispatch mechanism, e.g. the Common Lisp Object System, or the Julia language. However, most cases of polymorphism can be either resolved statically, or simplified to single dispatch. Therefore, built-in language support for multimethods tends to have negligible performance impact when compared with multimethods that are added to an existing language via metaprogramming.

Implementing multi-methods can be non-trivial, because two separate overloads might look like equally good candidates. Again, assuming that A is a subtype of B:

void foo(A, B);
void foo(B, A);
A x; A y;
foo(x, y);

Both candidates would match equally well. There are various approaches to resolve this. One solution might be to find the overall “best” match using some kind of subtyping distance, but that tends to be hard to understand for programmers. It is more reasonable to look at all arguments from left to right, and order the overloads if one candidate is a better match for this single argument. E.g. in this example, the first argument is of type A, so foo(A, B) would be preferred over foo(B, A). The second argument matches both of these cases and does not change the existing order.

The mentioned Python module ignores these difficulties, and instead dispatches to the first match, no matter how well it actually matches. This requires a user to put the more specific cases before the more general case. The module's main dispatch logic boils down to this:

def __call__(self, *args):
  for types, function in self.typemap:
    if len(args) == len(types) and all(isinstance(a, t) for a, t in zip(args, types)):
      return function(*args)
  raise ValueError("multimethod: no matching method found")

You can see the full source yourself under r0k3/multimethods on GitHub, or install the module via pip install multimethods.

Related Topic