Deep Python #2: ... and Master of None

Python patterns to work with None and avoid surprises

Jul 23, 2024

One thing I have never liked about Python is how cumbersome it is to work around None, particularly when we want to chain calls. There are good parts of course, like the ability to mark a type as Optional[T] (or equivalently T | None), so that a static type checker can remind us to do something about it:

def read_file(filename: pathlib.Path, columns: list[int] | None):
.
.
.

But this is only where the story begins. Indeed, I find that many Python developers are still not comfortable with static type checking tools like mypy. And annotating types without automatic checks can be misleading and cause hard-to-find bugs.

Another problem is that None is very widespread in Python because of language design choices. For instance, it is really not recommended to use mutable objects as default values for function arguments, as they can be modified between calls, leading to surprising behavior. Then, if we want a mutable object as a default argument, like a list or a dict, the correct pattern is to use None as default and then replace None with the actual object inside the function:

def read_file(filename: pathlib.Path, columns: list[int] | None = None):
    if columns is None:   # "if not columns" would also replace an empty list passed by the caller
        columns = []
.
.
.
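To make the pitfall concrete, here is a minimal sketch of how a mutable default is shared between calls, and how the None default avoids that. The function names append_bad and append_good are made up for this illustration.

```python
from __future__ import annotations


# Hypothetical helpers, for illustration only.
def append_bad(item: int, bucket: list[int] = []) -> list[int]:
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` mutates the same object.
    bucket.append(item)
    return bucket


def append_good(item: int, bucket: list[int] | None = None) -> list[int]:
    # A fresh list is created on every call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_bad = append_bad(1)
second_bad = append_bad(2)    # same list object as first_bad
first_good = append_good(1)
second_good = append_good(2)  # a new list each time

print(second_bad)   # [1, 2]
print(second_good)  # [2]
```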

The explicit None check is a bit verbose, but it can be shortened to a one-liner by using the short-circuiting “power” of logical operators:

def read_file(filename: pathlib.Path, columns: list[int] | None = None):
    columns = columns or []

In this example, columns is a falsey value if it is None, thus the right-hand side of “or” is evaluated and the final result is [], which gets assigned to columns. But when columns is a non-empty list, it is truthy and the right-hand side of “or” won’t be evaluated at all. It is close to a short form of the ternary operator below, with one difference: “or” also replaces an empty list passed by the caller, whereas the ternary replaces only None:

def read_file(filename: pathlib.Path, columns: list[int] | None):
    columns = columns if columns is not None else []
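The difference between the two forms only shows up when the caller passes an empty list. The following sketch, with the hypothetical helper names with_or and with_is_none, checks which list object comes back in each case. This matters mostly when the function is expected to fill the caller's list in place.

```python
from __future__ import annotations


# Hypothetical helpers, for illustration only.
def with_or(columns: list[int] | None = None) -> list[int]:
    # Any falsy argument (None, but also []) is replaced by a new list.
    return columns or []


def with_is_none(columns: list[int] | None = None) -> list[int]:
    # Only None is replaced; an empty list from the caller is kept.
    return columns if columns is not None else []


mine: list[int] = []
print(with_or(mine) is mine)       # False
print(with_is_none(mine) is mine)  # True
```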

The same pattern can be used with the “and” operator, but in that case the right-hand side will be evaluated if and only if the left-hand side is truthy:

from dataclasses import dataclass


@dataclass
class X:
  x: int
  y: list[str]


def do_something(maybe_x: X | None):
  my_x = maybe_x and maybe_x.x  # type: int | None

.
.
.

The above code assigns maybe_x.x to my_x when maybe_x is an actual X and not None. When it is None, the and operator short-circuits (its result is already determined by the falsey left-hand side) and maybe_x.x is not evaluated, so the expression evaluates to None, exactly like maybe_x. Note that without short-circuiting we would try to access None.x, raising an AttributeError because None has no such attribute.

A problem here is that my_x can still be None, as every static type checker would tell us. We want to force it to be not None, so that we don’t have to care about this issue for the rest of the function, and get useful suggestions from auto-completion.
None values can be removed by means of a default value, as we observed above, a technique called null coalescing. We apply it also here to enforce the type int.

def do_something(maybe_x: X | None, defval: int = 5):
  my_x = (maybe_x and maybe_x.x) or defval # type: int

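If the result of the and expression is falsey, then the or operator makes sure that my_x will be assigned defval, which is an int, exactly like maybe_x.x. Note that this also covers a legitimate maybe_x.x of 0, which is falsey and is therefore replaced by defval as well.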
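One thing to keep in mind with this pattern is that “or” reacts to any falsey value, not only to None. The sketch below compares it with an explicit None check; the function names x_with_or and x_with_is_none are invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class X:
    x: int
    y: list[str]


# Hypothetical helpers, for illustration only.
def x_with_or(maybe_x: X | None, defval: int = 5) -> int:
    # Falls back to defval for None, but also for x == 0.
    return (maybe_x and maybe_x.x) or defval


def x_with_is_none(maybe_x: X | None, defval: int = 5) -> int:
    # Falls back to defval only when maybe_x is None.
    return maybe_x.x if maybe_x is not None else defval


print(x_with_or(None))                 # 5
print(x_with_or(X(x=7, y=[])))         # 7
print(x_with_or(X(x=0, y=[])))         # 5
print(x_with_is_none(X(x=0, y=[])))    # 0
```

Whether replacing 0 with the default is acceptable depends on the meaning of the field, so the choice between the two forms is a judgment call.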

This pattern works, is idiomatic and gives us a variable with a well-defined type. However, it can also become quite verbose. Let us suppose that we want to extract from the class X above a value from the list y:

@dataclass
class X:
  x: int
  y: list[str]


def get_first_y(maybe_x: X | None, defval: str = ""):
    return (maybe_x and maybe_x.y and maybe_x.y[0]) or defval

The default value is needed again because the first two expressions (maybe_x and maybe_x.y) can both return falsey values that are not str: None and an empty list.
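To see what the and-chain yields on its own, the sketch below evaluates it without the trailing “or defval”. The helper name chain is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class X:
    x: int
    y: list[str]


# Hypothetical helper: the and-chain alone, without the final "or defval".
def chain(maybe_x: X | None):
    return maybe_x and maybe_x.y and maybe_x.y[0]


print(repr(chain(None)))              # None
print(repr(chain(X(x=1, y=[]))))      # []
print(repr(chain(X(x=1, y=["a"]))))   # 'a'
```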

But why do we need to type so much repetition?

A glimpse of the outside world

Many other languages include a safe-navigation operator to handle such cases easily. For instance, C# has the ?. and ?[] operators (Kotlin offers ?. as well) to traverse nested attributes while writing less code:

var a = maybe_x?.y?[0]

and a will be null if any intermediate result is null, or the result of indexing y[0] if everything works as expected. Of course, the result can still be null, so we would need to use null coalescing again to get rid of it.

I think it would be nice in some cases to have such conciseness also in Python, and the good news is that you can actually do it, with some tradeoffs.

Dunder methods to the rescue!

Python’s dunder methods are those with double underscore before and after their name (dunder == double under). They are sometimes called “magic methods” because they are called implicitly with some syntax.

For the next example we shall use the dunder methods __getattr__ and __getitem__. The first is called when getting an attribute value (x.y) that the normal attribute lookup does not find; the second is called when indexing an object (v[i]).
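As a quick illustration of when each method fires, here is a minimal sketch with a made-up class Demo. Note that __getattr__ is a fallback: it is not consulted for attributes that already exist on the instance.

```python
# Hypothetical class, for illustration only.
class Demo:
    def __init__(self) -> None:
        self.present = "found normally"

    def __getattr__(self, name: str) -> str:
        # Reached only when regular attribute lookup has failed.
        return f"fallback for {name}"

    def __getitem__(self, key: int) -> int:
        # Reached for demo[key].
        return key * 2


demo = Demo()
print(demo.present)  # found normally
print(demo.missing)  # fallback for missing
print(demo[21])      # 42
```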

We can get seamless safe navigation on selected objects by wrapping them into the following class, which you can find in this repo as an example:

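from typing import Generic, TypeVar

T = TypeVar("T")


class Nullable(Generic[T]):
    def __init__(self, value: T):
        self.value = value

    def __getattr__(self, item):
        # Called only for names that are not found on Nullable itself.
        try:
            # getattr looks up the attribute named by `item`;
            # `self.value.item` would look for one literally called "item".
            return Nullable(getattr(self.value, item))
        except AttributeError:
            return Nullable(None)

    def __getitem__(self, item):
        try:
            return Nullable(self.value[item])
        except (TypeError, KeyError, IndexError):
            # TypeError: the value is None or not subscriptable;
            # KeyError / IndexError: missing key or out-of-range index.
            return Nullable(None)

    def get(self) -> T | None:
        return self.value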

Objects of this class wrap other objects and delegate their calls to __getattr__ and __getitem__ to the wrapped object’s own methods, but in a None-safe way. If the underlying lookup succeeds, its result is wrapped again into Nullable and returned, so that we can continue the chain; if it fails, we get back a Nullable wrapping None.
When we are done traversing, we need to extract the final value using the get method.
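A usage sketch may help here. The block below restates the wrapper in a self-contained form, assuming attribute lookup goes through getattr and that IndexError is also treated as a miss (the version in the repo may differ), and applies it to the dataclass X from before. The function name first_y is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


class Nullable(Generic[T]):
    def __init__(self, value: T):
        self.value = value

    def __getattr__(self, item):
        # Assumption: attributes are looked up by name with getattr.
        try:
            return Nullable(getattr(self.value, item))
        except AttributeError:
            return Nullable(None)

    def __getitem__(self, item):
        # Assumption: an out-of-range index counts as a miss too.
        try:
            return Nullable(self.value[item])
        except (TypeError, KeyError, IndexError):
            return Nullable(None)

    def get(self) -> T | None:
        return self.value


@dataclass
class X:
    x: int
    y: list[str]


# Hypothetical helper, for illustration only.
def first_y(maybe_x: X | None, defval: str = "") -> str:
    # Wrap once, navigate freely, unwrap at the end, then coalesce.
    return Nullable(maybe_x).y[0].get() or defval


print(first_y(X(x=1, y=["a", "b"])))  # a
print(repr(first_y(X(x=1, y=[]))))    # ''
print(repr(first_y(None)))            # ''
```

The chain itself never raises for a missing link, so the only place where None has to be handled is the final get().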

This is a simple solution that shows the power of Python: we are basically able to change the operator semantics without any complex metaprogramming. It comes with some costs, though. First, we need to wrap objects to activate this functionality and unwrap them to get back the underlying value; for instance, we cannot directly call methods on a Nullable object. Second, we lose type information: the two dunder methods carry no return annotation, so a type checker treats their results as Any, since we don’t know the structure of the wrapped object. Many developers may not consider this a problem, but at the very least it means losing IDE auto-completion. Lastly, changing operator semantics is always risky, since a newcomer reading the code does not expect the new behavior. You probably don’t want to slow down your colleagues or your future self, but if you are sure it is worth it because you have to deal with a lot of Nones in your codebase, then definitely go for it!

Easier to Ask for Forgiveness than Permission

A redditor pointed out, in his own way, that we may use another common Python pattern. Perform the action and then catch any exceptions that arise:

def get_first_y(maybe_x: X | None, defval: str = "") -> str:
    try:
        return maybe_x.y[0]
    except (AttributeError, IndexError):
        return defval

which is definitely a valid choice, particularly if you need to do it only once. It comes with its own drawbacks nonetheless. The first is that it is not going to pass mypy checks: mypy recognizes that maybe_x can be None, which has no attribute y. Some may infer that mypy conflicts with other Python design choices, but this is not the place to discuss it. A second drawback is the verbosity: what can be done in a concise one-liner takes four lines here. And finally there are the exceptions to catch: we need to list all the exceptions that we may get and want to handle here. Too few and the code will crash anyway; too many and other bugs may slip through unnoticed.
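The last point is easy to underestimate. In the sketch below, a deliberately misspelled attribute (ys instead of y, in a hypothetical function get_first_y_typo) is silently absorbed by the same handler that was meant for the None case:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class X:
    x: int
    y: list[str]


# Hypothetical function with an intentional bug, for illustration only.
def get_first_y_typo(maybe_x: X | None, defval: str = "") -> str:
    try:
        # Bug: the attribute is called "y", not "ys".
        return maybe_x.ys[0]
    except (AttributeError, IndexError):
        # The handler meant for None also swallows the typo.
        return defval


print(repr(get_first_y_typo(X(x=1, y=["a"]))))  # ''
```

A static type checker would flag the misspelled attribute, which is one more reason to run one even when relying on exceptions at runtime.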

This pattern certainly has its place, but it is not “the one and only” pythonic way to do it.

Conclusion

Evaluating tradeoffs is a developer’s call to make and should be assessed case by case. In this short article I wanted to highlight some alternatives for working with None, one of the unexpected weak points of Python syntax.

Hopefully this helps with addressing the None issue, which is not usually one of the developers’ favorite topics.

South Software