Delay Computation in Python with Thunks
Achieve laziness by wrapping computation into functions
Laziness in programming languages is the ability to postpone the computation of a value until the moment that value is actually needed. We talked about laziness in the context of sequences, but today I want to discuss its usefulness for single values.
First-class Values
In programming languages, first-class values are those that can be passed as arguments and returned by functions. In Python, they are numbers (5 or 3.14), objects (like strings or instances of user-defined classes), and functions. This is very similar to most mainstream languages, although functions as first-class values were out of the mainstream for a long time. For instance, C can pass functions around only through function pointers, which are more cumbersome to use, while Java forced its users to write classes with a single method in place of functions whenever they needed to pass a single function around.
Functions as first-class values were mostly present in functional programming languages, such as Lisp and OCaml, and since the 90s also in JavaScript.
Some syntax is not first class, and this is usually the case for keywords: if, else, class, try, and except (catch in many other languages) are all examples of syntax that is not first class. Indeed, it would make no sense to write something like
a = f(try, return)
or
def g(a, b, c):
    return except
It is worth noting that not only can these keywords not be used wherever other values can, but their behavior also cannot be reproduced by means of ordinary functions. For instance, if we want to reproduce a simple if-else in Python, we may think of writing something like
def my_if_else(condition, then_action, else_action):
    return (condition and then_action) or else_action
my_if_else(1 > 2, print("Greater"), print("lower"))
but the two print calls are executed before the function is called, so both messages are printed before the condition is even checked.
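The eager evaluation of arguments can be checked directly. The following is a minimal sketch, where record and choose are hypothetical helper names introduced only for this illustration: each argument expression leaves a trace in a list, so we can see that both ran even though only one value is returned.

```python
calls = []  # trace of which argument expressions were evaluated


def record(label):
    """Hypothetical helper: note that it ran, then return its label."""
    calls.append(label)
    return label


def choose(condition, then_value, else_value):
    """Hypothetical eager selector: both values are already computed on entry."""
    return then_value if condition else else_value


result = choose(1 > 2, record("then"), record("else"))

# Only one value comes back, but both argument expressions were evaluated,
# left to right, before choose() even started running.
assert result == "else"
assert calls == ["then", "else"]
```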
Can we do better than this?
Thunks
In computer science, a thunk is a function used to inject computation into another function, usually with the goal of delaying the computation itself. There is nothing magical about it: instead of executing a line of code, we put it in the body of a function with no arguments. When the function (the thunk) is called, the corresponding code is executed, exactly where it is needed. Let us see it in action with the previous code:
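def my_if_else(condition, then_action, else_action):
    # Select one thunk by index (False -> 0, True -> 1) and call only that one
    return (else_action, then_action)[bool(condition)]()
my_if_else(1 > 2, lambda: print("Greater"), lambda: print("lower"))
In this code, notice that we no longer use then_action and else_action as values but call them as functions, and we correspondingly wrap the code in lambdas to make them functions of zero arguments. We also select the thunk by indexing instead of writing (condition and then_action()) or else_action(): that form would call else_action as well whenever then_action() returns a falsy value, and print returns None.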
Now, the two print calls are simply written in the bodies of two distinct functions, and they are executed only if and when the corresponding function is actually called. In our my_if_else function, only one of then_action and else_action is called, so we achieved our goal.
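To convince ourselves that only the selected branch runs, we can make the thunks record their own execution. This is a small sketch under stated assumptions: tracked is a hypothetical helper, and my_if_else is redefined here so that the block runs on its own (picking the thunk by indexing is just one possible way to write it).

```python
calls = []  # trace of which thunks were actually executed


def my_if_else(condition, then_action, else_action):
    # Select one thunk by index (False -> 0, True -> 1) and call only that one.
    return (else_action, then_action)[bool(condition)]()


def tracked(label):
    """Hypothetical helper: build a thunk that records when it is called."""
    def thunk():
        calls.append(label)
        return label
    return thunk


result = my_if_else(1 > 2, tracked("then"), tracked("else"))

# Building the thunks executed nothing; only the chosen one ran.
assert result == "else"
assert calls == ["else"]
```

The same check with a true condition leaves only "then" in the trace, which is the behavior we expect from a real if-else.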
(Why) Should I care?
Thunks are not a pattern we see very often, but there are practical situations where they are needed, in particular when an existing API requires a function of zero arguments as an input.
One example is dataclasses.field, which is used to configure the attributes of data classes. In particular, there are two keyword arguments to define a default value: default and default_factory. default expects a value, which is simply assigned to the attribute when no user-provided value exists. On the other hand, default_factory expects a thunk, which is called when the user does not provide a value. From the official documentation:
from dataclasses import dataclass, field
@dataclass
class C:
    mylist: list[int] = field(default_factory=list)
c = C()
c.mylist += [1, 2, 3]
c.mylist is default-initialized to [] because list() is called during initialization, constructing an empty list.
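Because the factory is called once for every instance that needs a default, each object gets its own fresh list. The following sketch illustrates this with a hypothetical Basket class (the class and attribute names are made up for the example):

```python
from dataclasses import dataclass, field


@dataclass
class Basket:  # hypothetical class, for illustration only
    # list is used as a thunk: it is called with no arguments per instance
    items: list[int] = field(default_factory=list)


first = Basket()
second = Basket()
first.items.append(42)

# Each instance received the result of a separate list() call,
# so mutating one default does not affect the other.
assert first.items == [42]
assert second.items == []
assert first.items is not second.items
```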
Because of the way Python is designed, a class is itself callable and acts as a factory for its instances, so we can pass to default_factory any class that can be called with no arguments, and a default-initialized object of that class will be created at construction time. But what if we do not want a default-initialized value? We can go back to the previous pattern and make a thunk out of an initialization:
@dataclass
class C:
    mylist: list[int] = field(default_factory=lambda: [1, 1, 2, 3, 5])
c = C()
assert c.mylist == [1, 1, 2, 3, 5]
and mylist is initialized to a list with the first 5 Fibonacci numbers. Note that the same idea would not work without the lambda, because a list object is not callable:
@dataclass
class C:
    mylist: list[int] = field(default_factory=[1, 1, 2, 3, 5])
c = C()
> TypeError: 'list' object is not callable
The limitation of this approach is that our function will always perform the same action, since it does not take any arguments. But the fact that in Python we can also call classes and objects allows us to partially work around it:
class CallCounter:
    def __init__(self):
        self.counts = 0
    def __call__(self):
        self.counts += 1
        print(f"Called {self.counts} time{'s' if self.counts > 1 else ''}")
def f(thunk):
    print("Going to perform some externally-defined computation")
    thunk()
    print("Now, doing it again, maybe it does the same or maybe not")
    thunk()
if __name__ == '__main__':
    f(CallCounter())
> Going to perform some externally-defined computation
Called 1 time
Now, doing it again, maybe it does the same or maybe not
Called 2 times
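A callable object is not the only way around the no-arguments limitation. When the arguments are known in advance, the standard library's functools.partial can bind them and give back a callable that needs no further arguments. The sketch below is only an illustration of that idea; greet and run_twice are hypothetical names that do not come from any library.

```python
from functools import partial


def greet(greeting, name):
    """Hypothetical function that needs two arguments."""
    return f"{greeting}, {name}!"


def run_twice(thunk):
    """Hypothetical consumer: it only knows how to call a zero-argument thunk."""
    return [thunk(), thunk()]


# Bind both arguments now; the resulting object is callable with no arguments.
hello = partial(greet, "Hello", "world")
results = run_twice(hello)

assert results == ["Hello, world!", "Hello, world!"]
```

Compared with a lambda, partial fixes the argument values at the moment it is created, which avoids surprises when the surrounding variables change later.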
Do not forget that we can also return thunks and obtain the same effect in reverse: we call a function and obtain some computation that we can use later. For instance, we can easily get class factories:
reader_factories = {"text": TextReader, "audio": AudioReader}
data_reader = reader_factories["audio"]()  # We need to invoke here
with data_reader(filename) as reader:
    .
    .
    .
Conclusion
Functions with no arguments represent a general pattern that can be used in diverse situations, such as delaying computation (or deciding whether to perform it at all!) or providing a uniform interface for code injection. It is useful to play with examples of it to explore its nitty-gritty and get a feeling for what is possible and what is not. The expressive power of Python enables the uniform usage of functions, class factories, and callable objects, which can result in surprising possibilities.

