Python Type Annotation and why we need it

Since PEP 484 was proposed in 2014 and Python 3.5 was released in 2015, Python has been evolving from a purely duck-typed language into one that can include type annotations in its code. However, type annotations are still not used as widely as they should be. This post explains why we need type annotations, how to use them properly, and a strategy for adding them to a codebase and a development workflow.

Why we need type annotations

The first and most important question is: why do we need them? Why add types to a duck-typed language like Python? Why give up the freedom to assign a value of any type to any variable? Consider a function make_request(url, data, headers) written without any annotations.
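A minimal sketch of what such an unannotated function might look like. The post only fixes the name and the three arguments; the body, built on the standard-library urllib and assuming form-encoded data, is an illustration only.

```python
from urllib import parse, request


def make_request(url, data, headers):
    # Nothing in this signature says whether data is text, bytes, or a dict.
    # Assumption for this sketch: data is a mapping sent as a form-encoded body.
    body = parse.urlencode(data).encode("utf-8")
    req = request.Request(url, data=body, headers=headers)
    with request.urlopen(req) as response:
        return response.status, response.read().decode("utf-8")
```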

At first glance, this function seems obvious, right? make_request is a typical function for making an HTTP request, which receives url, data, and headers as its arguments. Now imagine that six months from now someone asks, “What does data look like?” It could be text. It could be a dictionary. We cannot tell from the code alone what data should be. Now consider the same make_request function with type annotations: url is a str, data and headers are each a Dict[str, str], and the return type is Tuple[int, str]. (Don’t worry about the syntax for now; we will cover it later.)
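The annotated version could be sketched as follows. The types come from the description in this post; the urllib-based body is again an assumption made only so that the example is complete.

```python
from typing import Dict, Tuple
from urllib import parse, request


def make_request(url: str,
                 data: Dict[str, str],
                 headers: Dict[str, str]) -> Tuple[int, str]:
    # The signature now documents that data and headers map strings to strings
    # and that the result is a (status code, response text) pair.
    body = parse.urlencode(data).encode("utf-8")
    req = request.Request(url, data=body, headers=headers)
    with request.urlopen(req) as response:
        return response.status, response.read().decode("utf-8")
```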

As you can see, the problematic data must be a dictionary whose keys and values are all strings, and the same goes for headers. We also know that make_request returns a tuple of an int and a string, in that order.

This is just a simple example with three arguments and two return values. Imagine a more sophisticated function that receives more arguments. By adding just a few type annotations, we get the following in return:

  • Code that is easier to maintain, because we explicitly state the types of a function’s arguments, which helps other developers, including ourselves, understand the function easily
  • Easier code review, because the reviewer doesn’t need to guess the types used by the function they are looking at
  • Code that is easier to debug, for the same reason: we don’t need to guess the types used by the function we’re working with
  • Validated assumptions: when we write a complex chain of calls across multiple functions, we can lose track of what we send and receive. Type annotations together with a type checker help us verify the flow of the function calls we’re working with.
  • Reduced cognitive load: instead of remembering what we need to send and receive, we write down the types and free our brain to think about other things

What does a type annotation look like?

A basic function signature looks slightly different from what we are used to writing. After each argument we add : T, where T is the type of that argument, and after the parameter list we add -> R, where R is the type of the return value. The types can range from built-in types such as str, int, bool, and float to collection types such as List, Dict, and Tuple. Python 2 doesn’t support this new syntax, but it can still benefit from type annotations written as comments.
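A small sketch of both forms, using a hypothetical repeat function: the Python 3 annotation syntax, and the comment-based form from PEP 484 that also works in Python 2.

```python
def repeat(word: str, times: int) -> str:
    # "word: str" and "times: int" annotate the arguments;
    # "-> str" annotates the return value.
    return word * times


def repeat_py2(word, times):
    # type: (str, int) -> str
    # Python 2 compatible form: the same types written as a type comment.
    return word * times
```

Both functions behave identically at run time; only the way the types are written differs.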

Examples of writing type annotations

We have already covered the simple usage of type annotations, but there are more ways to state a type, which we will cover in this section.

For laziness

This style uses the Any type. I recommend not doing this in real life unless we have no other choice, because it is no different from leaving out the type annotation. Adding Any does not make the code easier to understand, and the type checker gains nothing from it.
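For illustration, a sketch of the lazy style with a hypothetical first_item function. The type checker accepts any call to it and learns nothing about the result.

```python
from typing import Any


def first_item(items: Any) -> Any:
    # Any is compatible with every type, so the checker cannot tell us
    # whether items supports indexing or what the result will be.
    return items[0]
```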

Collection type

This style of annotation is error-prone at first, because on Python versions before 3.9, subscripting the built-in list, tuple, dict, and so on (for example list[str]) raises a run-time error. In that case, we need to use List, Tuple, Dict, and the other collection types from the typing module, which are generic aliases of the built-in collections.

A small note on Tuple: the ... (ellipsis) syntax, as in Tuple[str, ...], means a tuple of any length whose elements are all strings, so we don’t need to state the type of every position. The same syntax also works in the variable annotations added in Python 3.6 (PEP 526).
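The collection types described above can be sketched like this. The function names are hypothetical and chosen only to show List, Dict, a fixed-size Tuple, and a variable-length Tuple side by side.

```python
from typing import Dict, List, Tuple


def count_words(words: List[str]) -> Dict[str, int]:
    counts: Dict[str, int] = {}  # variable annotation (Python 3.6+)
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def split_name(full_name: str) -> Tuple[str, str]:
    # Exactly two positions, both strings.
    first, _, last = full_name.partition(" ")
    return first, last


def join_all(parts: Tuple[str, ...]) -> str:
    # Any number of positions, all strings.
    return "-".join(parts)
```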

For a possible None value

Sometimes an argument we receive can be None. In this case, we state that the argument can be either some type or None via the Optional type. Another way to state a possible None is s: str = None, although PEP 484 no longer recommends this implicit form and recent versions of Mypy reject it by default.
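A minimal sketch of the explicit Optional form, using a hypothetical greet function:

```python
from typing import Optional


def greet(name: Optional[str] = None) -> str:
    # Optional[str] means "str or None"; the None case must be handled
    # before name can be used as a string.
    if name is None:
        return "Hello, stranger"
    return "Hello, " + name
```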

For arguments that can be multiple types

In this case, the argument s could be an integer or a string. Instead of choosing just one of these types, we can state that it is a Union type, which means it can be either a string or an integer. This style is used a lot when annotating a dictionary, because it helps describe the types of the dictionary’s values in detail.
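A short sketch of both uses: a Union argument, and a dictionary whose values are a Union. The names to_text, Settings, and describe are hypothetical.

```python
from typing import Dict, Union


def to_text(s: Union[int, str]) -> str:
    # s may be an int or a str; str() handles both.
    return str(s)


# A dictionary whose values can be either integers or strings.
Settings = Dict[str, Union[int, str]]


def describe(settings: Settings) -> str:
    return ", ".join(
        f"{key}={to_text(value)}" for key, value in sorted(settings.items())
    )
```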

For deriving a new type from another type

Sometimes built-in types alone are not enough to explain the arguments; for example, an integer could be any number. Deriving a new type from an existing one, for example with NewType, lets us give the type a more specific meaning than its base type has.
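One way to do this is typing.NewType. The sketch below assumes a hypothetical UserId type built on int, so that a plain integer cannot be passed by accident where a user ID is expected.

```python
from typing import NewType

# UserId is an int at run time, but a distinct type for the type checker.
UserId = NewType("UserId", int)


def load_user_name(user_id: UserId) -> str:
    return "user-" + str(user_id)


name = load_user_name(UserId(42))
# load_user_name(42) still runs, but a type checker reports it,
# because a plain int is not a UserId.
```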

For a class itself, not an object instantiated from the class

Sometimes we need the type of a class itself, not of an object created by that class. In this case, we wrap the class with Type[]. It looks really useful, but I haven’t had a chance to use it even once.
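A minimal sketch of Type[], using hypothetical Animal and Dog classes. The factory function receives a class, not an instance.

```python
from typing import Type


class Animal:
    def speak(self) -> str:
        return "..."


class Dog(Animal):
    def speak(self) -> str:
        return "Woof"


def make_animal(animal_class: Type[Animal]) -> Animal:
    # animal_class is Animal or one of its subclasses, so it can be called
    # to create an instance.
    return animal_class()
```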

In fact, there are many more examples of using type annotations, and I recommend looking at the Mypy syntax cheat sheet (Python 3). It includes every example above plus styles that I didn’t cover, e.g. variable annotations.

Type inference

Consider a function f that receives an argument l, which is a list of strings. Inside f, a variable s is assigned an item from the list l. When Mypy reads this function signature, it can work out that s is a string without us stating a type for s.
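A sketch of such a function; the body is an assumption, since the description only fixes the names f, l, and s.

```python
from typing import List


def f(l: List[str]) -> str:
    # s has no annotation of its own. Because l is a List[str],
    # Mypy infers that s is a str and accepts the call to upper().
    s = l[0]
    return s.upper()
```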

The second example narrows the scope of inference. Consider a function g with an argument message that can be either a string or None. If we check the type of message via reveal_type at the start of the function, it can still be either a string or None. But the type checker understands conditions as well: if we call reveal_type on message after a check that rules out None, the possible type of message is reduced to just string.
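A sketch of this narrowing, with a hypothetical body for g. The reveal_type calls are wrapped in TYPE_CHECKING blocks, which is an addition here so that the file still runs: Mypy evaluates those blocks, while the Python interpreter skips them. The exact wording of Mypy's output varies between versions.

```python
from typing import TYPE_CHECKING, Optional


def g(message: Optional[str] = None) -> str:
    if TYPE_CHECKING:
        # Mypy reports something like Union[builtins.str, None] here.
        reveal_type(message)
    if message is None:
        return "(no message)"
    if TYPE_CHECKING:
        # After the None check, Mypy reports builtins.str.
        reveal_type(message)
    return message.upper()
```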

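reveal_type is a special function that Mypy understands: while type checking, Mypy displays the inferred type of the expression passed to it. The bare name is not defined when the program actually runs, so it must be removed before executing the code. Mypy displays the type in a message like this.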

(Figure: Mypy revealing a type)

Casting to a type

Imagine that we have tried every way to state a type via built-ins, Union, and Optional (but not Any) and the type check still fails. We can then tell the type checker to treat a value as a given type through a process called cast, using typing.cast.
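A minimal sketch of typing.cast, assuming a hypothetical configuration dictionary whose values are only known to be objects:

```python
from typing import Dict, cast


def next_port(config: Dict[str, object]) -> int:
    # config["port"] is typed as object, so "port + 1" would not type check.
    # cast tells the checker to treat the value as an int. It performs no
    # conversion or validation at run time.
    port = cast(int, config["port"])
    return port + 1
```

Because cast is not checked at run time, a wrong cast hides a bug instead of revealing it, so it is best kept as a last resort.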

Type Checker

After adding type annotations to our codebase, we can use them to prove that data is sent and received between functions correctly. In a statically typed language this would be the compiler’s job, but in Python we use a type checker called Mypy.

(Figure: Mypy reporting a type error)

Mypy is a static type checker that Jukka Lehtosalo has been developing since 2012, before the type annotation standard PEP 484 was even published; Guido van Rossum later joined its development. To use it, we just run mypy followed by the Python file that we want to check. Mypy reads the type annotations and checks how they are used. There are many more options in Mypy, which you can find in the official documentation.
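To see why a separate checker is needed, here is a small sketch with a hypothetical double function. Python itself does not enforce annotations, so the wrong call runs without complaint; running Mypy on the file reports it with a message along the lines of: Argument 1 to "double" has incompatible type "str"; expected "int".

```python
def double(n: int) -> int:
    return n * 2


# The annotation says int, but Python does not check it at run time:
# the string is simply repeated. Only the type checker flags this call.
result = double("21")
```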

Personally, I don’t use Mypy directly but through a package called flake8-mypy, because our codebases normally install flake8 as a dependency. Adding flake8-mypy to our dependencies lets flake8 run a type check for us via Mypy. My Vim is normally set up with ale to execute flake8 asynchronously, which means I check types almost in real time.

(Figure: NeoVim when flake8-mypy raises a warning)
(Figure: a PyCharm message is slightly different from Mypy’s, but for the most part it is the same)

PyCharm users don’t need to install any plugin, because PyCharm includes its own type checker, which does not depend on Mypy. For other text editors, if the editor can already lint via flake8, it may need no extra configuration: installing flake8-mypy as a dependency could make it just work.

Strategy for adding type annotations

Don’t think, just add

We have already covered how to add type annotations in many cases, and we already know their benefits. So simply adding more and more of them to our codebase is the simplest way to do it. This method has a downside: if we never run a type checker, we get no benefit from it at all. That means a wrong type signature could be wrong in a lot of places, fixing it could require dramatic changes, and in the end we would just ignore the annotations.

Gradual Typing

This technique is used by Instagram. The idea is to find a common, basic function that does not call other functions and add type annotations there first. This is very important, because if we don’t start with a commonly used function, we will have to fix annotations back and forth until they are right. After that, we run the type checker, which reports wrong types in the functions that call this common function. The rule is:

Fix only the functions that the type checker reports; treat every other function as receiving and returning Any.

This way, our codebase gradually changes from having no type annotations to being covered by strong typing. The process takes time, but in the end it reduces a lot of TypeError and NoneType exceptions.
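The starting point of this process can be sketched with two hypothetical functions: a leaf function that calls nothing else and is annotated first, and a caller that is left unannotated for now. By default, Mypy does not check the bodies of unannotated functions and treats their arguments and return values as Any.

```python
def normalize_username(raw: str) -> str:
    # Leaf function: it calls no other project function, so it is annotated first.
    return raw.strip().lower()


def build_greeting(user):
    # Not annotated yet: the checker treats user and the result as Any.
    # It gets annotated later, once the checker reports a problem here
    # or the functions it depends on are covered.
    return "Hello, " + normalize_username(user["name"])
```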

Another thing we should do is add a type checker like Mypy to our Continuous Integration workflow, to make sure the type annotations we introduce stay correct over time.

Automatically generating types using MonkeyType

Adding types to a large codebase can be really frustrating, because there are so many places to annotate. Luckily, Instagram created a tool called MonkeyType, which records the types passed to and returned from functions at run time, where the run time can be an actual run or a unit test, and stores them. It then generates types from the recorded run and can patch the functions themselves, so we don’t need to add annotations one by one. This way we can add a lot of type annotations at once and grow coverage very quickly. The downside is that if we record a unit test run in which types are mocked, we need to fix those types one by one. We can learn more about MonkeyType in its documentation.

Where to go next?

Type annotation is still a new topic in Python, so resources are still few. I would recommend a few places where we can study more about type annotations:

  • If we want to learn more about how to write type annotations for each use case, the best place I recommend is the official Mypy documentation at http://mypy.readthedocs.io/en/latest/index.html, where you can skip to the cheat sheet, which covers most use cases for adding type annotations.
  • I personally prefer and recommend studying the PEPs related to type annotations: PEP 484, PEP 526, PEP 544, and PEP 563
  • This post took inspiration and some examples from a talk at PyCascades 2018, which I really recommend. The talk covers the basics and some issues encountered when adding types to the Instagram codebase

Pythonista @ProntoTools ♥ Python, Django, Vim and Star Trek 🖖