Python "compile-time" type checking

(August 2012)

CAUTION: Obsolete. What I describe below was a nice hack, back in 2012.

But nowadays, we have something much better: Just use mypy.
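
For the curious, here is a minimal sketch of what that looks like, assuming Python 3 type annotations and a separately installed mypy. The function names are made up for illustration. Running mypy over a file like this should flag the call inside bad_caller as an incompatible argument type, without executing anything:

```python
def foo(arg: int) -> int:
    """Return arg + 1; the annotation declares that arg must be an int."""
    return arg + 1


def bad_caller() -> int:
    # Hypothetical mistaken call: a str is passed where an int is declared.
    # A static checker such as mypy is expected to reject this line.
    # If it were actually executed, Python would raise TypeError.
    return foo("asfda")  # type error by design, never invoked below


if __name__ == "__main__":
    print(foo(1))  # prints 2
```

The point is that the mistake is reported before the script ever runs, which is exactly what the hack below tries to approximate.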

I love Python. I really, really do. From its extensive libraries to its crystal-clear syntax, it's simply my favourite language. And I speak quite a few.

That being said, I do appreciate compile-time safety. I love it that C++ or OCaml compilers will "bark" - during compile time, not run-time - when a string is passed to a function expecting an integer. Python unfortunately will just crash and burn at runtime.

To compensate, I use tools - I've added flake8 support to my VIM settings, so I can catch many errors by simply pressing F7 while I edit my code. PyChecker is also a mandatory part of my release process. But neither of them can catch this:

def foo(arg):
    print arg+1

if __name__ == "__main__":
    foo("asfda")

...which crashes at run-time:

  File "./c.py", line 2, in foo
    print arg+1
TypeError: cannot concatenate 'str' and 'int' objects

And the problem recently surfaced in my work: users of my code at the European Space Agency are writing test scenarios by calling Python functions that I've written (more accurately, Python functions that my code-generators have created)... and are therefore exposed to the inherent risks of dynamic typing. They are supposed to pass hardcoded strings or integers to a series of calls to my functions - but if they mess up, they'll get an exception at run-time, potentially minutes or hours after the test scenario began executing...

And just like that, it hit me...

...in the case of my particular problem, I can handle this!

import ast

def foo(arg):
    print arg

def aCall():
    foo("123")

def anotherCall():
    foo(1234)

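if __name__ == "__main__":
    # Parse this very file and inspect every call to foo()
    a = ast.parse(open(__file__).read())
    for e in ast.walk(a):
        if isinstance(e, ast.Call):
            try:
                if e.func.id == 'foo':
                    print "At line", e.lineno, "you are calling 'foo' with: ", type(e.args[0])
            except (AttributeError, IndexError):
                # Not a plain-name call (e.g. obj.method()), or no positional arguments
                pass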

...which gives:

bash$ python ./test.py
At line 7 you are calling 'foo' with:  <class '_ast.Str'>
At line 10 you are calling 'foo' with:  <class '_ast.Num'>

Python allows the script to parse its own "guts", via the ast module. You can therefore "parse" your own callers, and check at "compile-time" (actually, script startup-time) that they indeed called you with the types you are expecting - in my case, hardcoded strings or integers. Any such errors will be detected immediately upon startup - and therefore such crashes will NOT happen during execution of the test script.
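
To make the idea concrete, here is a minimal sketch of such a startup check. It is written for Python 3.8 or later, where literals show up as ast.Constant rather than the _ast.Str / _ast.Num seen in the Python 2 output above. The names check_calls and EXPECTED_FIRST_ARG are made up for illustration, and the rule "foo wants a str" is an assumption, not something taken from the real code generators:

```python
import ast

# Hypothetical table: function name -> expected type of its first argument.
EXPECTED_FIRST_ARG = {"foo": str}


def check_calls(source, expected=EXPECTED_FIRST_ARG):
    """Return sorted (lineno, message) pairs for suspicious calls in source.

    A call is reported when its first positional argument is a literal of
    the wrong type, or is not a literal at all (and so cannot be verified).
    """
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Only plain-name calls such as foo(...); obj.method(...) is skipped.
        if not isinstance(node.func, ast.Name) or node.func.id not in expected:
            continue
        name = node.func.id
        wanted = expected[name]
        if not node.args:
            problems.append((node.lineno, "%s() has no positional argument" % name))
            continue
        arg = node.args[0]
        if not isinstance(arg, ast.Constant):
            problems.append((node.lineno, "%s() argument is not a literal" % name))
        elif type(arg.value) is not wanted:
            problems.append((node.lineno, "%s() expects %s, got %s" % (
                name, wanted.__name__, type(arg.value).__name__)))
    # ast.walk() is breadth-first, so sort to report in line order.
    problems.sort()
    return problems


if __name__ == "__main__":
    sample = 'foo("123")\nfoo(1234)\n'
    for lineno, message in check_calls(sample):
        print("line %d: %s" % (lineno, message))
    # prints: line 2: foo() expects str, got int
```

In a real script you would feed it open(__file__).read() and call sys.exit() with a non-zero status if the list is not empty, so the test scenario never starts. Note that this sketch is deliberately strict: a negative number such as -1 parses as a unary operation, not a literal, so it would be reported as unverifiable.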

This is not bullet-proof, of course - e.g. what if the user code calling my functions actually passes a variable that contains a string? It immediately becomes much more difficult to figure out what type the argument is:


def yetAnotherCall():
    myVar = 1234
    foo(myVar)

...yields:

bash$ python ./test.py
At line 14 you are calling 'foo' with:  <class '_ast.Name'>
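
For the simplest form of this case, a crude heuristic can still recover the type. The sketch below (Python 3.8+, all names hypothetical) resolves a variable only when the enclosing top-level function binds it exactly once, from a literal, and it is not a parameter. It ignores control flow, globals and ordering entirely, so treat it as an illustration of the difficulty rather than a solution:

```python
import ast
from collections import Counter


def resolve_first_arg_types(source, func_name="foo"):
    """Return sorted (lineno, type_name_or_None) pairs for calls to
    func_name made inside the top-level functions of source.

    None means "could not determine the type from literals alone".
    """
    results = []
    for scope in ast.parse(source).body:
        if not isinstance(scope, ast.FunctionDef):
            continue  # module-level calls are ignored in this sketch
        params = set()     # parameter names: their values come from callers
        stores = Counter()  # how many times each name is bound
        literals = {}       # name -> literal node from "name = <literal>"
        calls = []
        for node in ast.walk(scope):
            if isinstance(node, ast.arg):
                params.add(node.arg)
            elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                stores[node.id] += 1
            elif isinstance(node, ast.Assign) and len(node.targets) == 1:
                target = node.targets[0]
                if isinstance(target, ast.Name) and isinstance(node.value, ast.Constant):
                    literals[target.id] = node.value
            elif isinstance(node, ast.Call):
                if (isinstance(node.func, ast.Name)
                        and node.func.id == func_name and node.args):
                    calls.append(node)
        for call in calls:
            arg = call.args[0]
            if isinstance(arg, ast.Name):
                name = arg.id
                trusted = (name in literals and stores[name] == 1
                           and name not in params)
                arg = literals[name] if trusted else None
            kind = type(arg.value).__name__ if isinstance(arg, ast.Constant) else None
            results.append((call.lineno, kind))
    return sorted(results, key=lambda pair: pair[0])


if __name__ == "__main__":
    sample = "def yetAnotherCall():\n    myVar = 1234\n    foo(myVar)\n"
    print(resolve_first_arg_types(sample))  # [(3, 'int')]
```

Anything beyond this (reassignment, values coming from parameters or other function calls) yields None here, which is where real type inference would have to take over.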

Maybe PyPy or Shed Skin can help - since they both try to "compile" Python code to statically typed languages (any suggestions most welcome).

But regardless of the failings of this method, and even though it doesn't solve all potential forms of the problem... in special cases like mine, where auto-generated Python code is supposed to be called with hardcoded values from its users, the ast module can indeed save the day, by detecting such usage errors at "compile-time".

Keep that in mind when coding your scripts.

Updated: Sat Oct 8 11:41:25 2022