My two posts on adding optional static typing to Python have been widely misunderstood, and spurred some flames from what I'll call the NIMPY (Not In My PYthon) crowd. In this post I'm describing a scaled-down proposal with run-time semantics based on interfaces and adaptation.
Let's go back to the basics. A function defined like this:
def foo(x: t1, y: t2) -> t3: ...body...
is more or less equivalent to this:
    def foo__(x, y): # original function
        ...body...

    def foo(x, y): # wrapper function
        x = adapt(x, t1)
        y = adapt(y, t2)
        r = foo__(x, y)
        return adapt(r, t3)
Here t1, t2 and t3 are expressions that are evaluated once, at function definition time (i.e. at the same time as argument default values). Also, the wrapper doesn't really access the original by name; more likely it is an object that has a reference to the original function somehow.
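The wrapper semantics above can be emulated today with a plain decorator. Here is a minimal sketch, assuming a toy adapt() that passes instances through and otherwise attempts conversion (a real adapt() would follow PEP 246; the decorator name "typed" is invented for illustration):

```python
def adapt(value, t):
    # Toy adaptation: pass instances through, otherwise try conversion.
    # (A real adapt() would follow PEP 246.)
    if isinstance(value, t):
        return value
    return t(value)

def typed(t1, t2, t3):
    # Decorator reproducing the proposed wrapper: arguments are adapted
    # on the way in, the result on the way out.
    def decorate(func):
        def wrapper(x, y):
            x = adapt(x, t1)
            y = adapt(y, t2)
            r = func(x, y)
            return adapt(r, t3)
        return wrapper
    return decorate

@typed(int, int, float)
def foo(x, y):
    return x + y

print(foo("2", 3))  # "2" adapted to int, result adapted to float -> 5.0
```

Note that t1, t2 and t3 are evaluated once, when typed(...) is called, matching the definition-time evaluation described above.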
The types are standard Python expressions; there's no separate syntax for type expressions. A type can be combined with a default value:
def foo(x: int = 42): pass
The default value gets adapted to the given type at function declaration time.
A function has a __signature__ attribute from which the names, types, and default values of the arguments can be introspected, as well as the return type (and the types for *args and **kwds, if specified).
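As a rough stand-in for the proposed __signature__ attribute, today's inspect module can already expose names, annotations, and defaults in a similar shape:

```python
import inspect

def foo(x: int, y: str = "hi", *args, **kwds) -> list:
    return [x, y]

sig = inspect.signature(foo)  # plays the role of the proposed __signature__
for name, param in sig.parameters.items():
    print(name, param.annotation, param.default)
print("returns:", sig.return_annotation)
```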
Until the parser has been modified to accept this syntax, we can experiment with decorators like this:
    @arguments(t1, t2)
    @returns(t3)
    def foo(x, y):
        ...body...
but that notation has little to recommend it in the long run.
It makes some sense to allow attribute declarations like this:
    class C:
        x: t1
which would create a property x that calls adapt(value, t1) upon assignment.
This is syntactic sugar for something we can write today:
x = typedAttribute(t1)
The syntax can be combined with a default value:
x : t1 = default
which is equivalent to:

x = typedAttribute(t1, default)
The implementation of typedAttribute() is left as an exercise for the reader. This is only for classes, and only defines instance variables. (A mutable default will suck just like it always did.)
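One way to do that exercise is with a descriptor: values are adapted on assignment and stored per instance under a private name. The adapt() here is again a toy stand-in, and the storage scheme is only illustrative:

```python
def adapt(value, t):
    # Toy adapt(): pass instances through, otherwise attempt conversion.
    return value if isinstance(value, t) else t(value)

class typedAttribute(object):
    # Descriptor sketch: adapt on assignment, store under a private name.
    counter = 0
    def __init__(self, t, default=None):
        self.t = t
        self.name = '_typed%d' % typedAttribute.counter
        typedAttribute.counter += 1
        if default is not None:
            default = adapt(default, t)
        self.default = default
    def __get__(self, obj, cls=None):
        if obj is None:
            return self
        return getattr(obj, self.name, self.default)
    def __set__(self, obj, value):
        setattr(obj, self.name, adapt(value, self.t))

class C:
    x = typedAttribute(int, 42)

c = C()
c.x = "7"        # adapted to int on assignment
print(c.x)       # -> 7
```

(And indeed, a mutable default stored on the descriptor would be shared between instances, just as with default arguments.)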
This is the only other place where I still think that new syntax is needed. My syntax proposal is still:
    interface I1(I2, I3):
        def foo(a: t1, b: t2) -> t3:
            "docstring"

    class C(I1): # implements I1
        def foo(a, b):
            return a+b
The metaclass gives C.foo the __signature__ attribute and adaptation wrappers gleaned from the interface at run time. You don't have to specify argument and return types in the interface declaration; if the type is absent adapt() is not called.
The interfaces don't show up in the __bases__ attribute of the class; rather, they show up in a new __implements__ attribute (the metaclass can tell the difference between an interface and a class).
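To give a flavor of what that machinery might look like, here is a much-simplified sketch using a class decorator instead of a metaclass, with plain classes plus annotations standing in for the proposed interface syntax. The names (Interface, implements, _wrap) are invented for illustration, and only positional arguments are handled:

```python
import inspect

def adapt(value, t):
    # Toy adapt(): if no type was declared, skip adaptation entirely.
    return value if t is None or isinstance(value, t) else t(value)

class Interface:
    # A plain class stands in for the proposed 'interface' syntax.
    pass

def _wrap(func, ann):
    # Wrap an implementation with adaptation gleaned from the interface
    # method's annotations (positional arguments only, for brevity).
    argnames = list(inspect.signature(func).parameters)
    def wrapper(*args):
        adapted = [adapt(a, ann.get(n)) for a, n in zip(args, argnames)]
        return adapt(func(*adapted), ann.get('return'))
    return wrapper

def implements(*interfaces):
    # Sketch of what the proposed metaclass would do: record
    # __implements__ and wrap matching methods for adaptation.
    def decorate(cls):
        cls.__implements__ = interfaces
        for iface in interfaces:
            for name, decl in vars(iface).items():
                if callable(decl) and name in vars(cls):
                    setattr(cls, name,
                            _wrap(getattr(cls, name), decl.__annotations__))
        return cls
    return decorate

class I1(Interface):
    def foo(self, a: int, b: int) -> int: "docstring"

@implements(I1)
class C:
    def foo(self, a, b):
        return a + b

c = C()
print(c.foo("2", 3))  # "2" adapted to int via I1's declaration -> 5
```

Since 'self' carries no annotation in I1, adapt() is not called for it, matching the rule that an absent type means no adaptation.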
This proposal is even simpler than PEP 245 (Python Interface Syntax); I don't think the "implements" keyword proposed there is needed.
This has received an inordinate amount of attention in the discussion forum, but I'm not very impressed with the resulting designs. There are basically two styles of proposals: statement-based and expression-based.
I think the expression-based proposals are too limited: they don't handle guards involving multiple arguments very well, and the proposed overloading of type expressions and boolean guards feels error-prone (what if I make my guard 'True' while awaiting inspiration for something better?). Also, there are clear use cases for guards that (in Python) can only be expressed using multiple statements.
But the statement-based designs are pretty cumbersome too, and I expect that in practice these will be used only in large projects. At the moment I am leaning towards not defining any new syntax for these, but instead using a decorator until we've got more usage experience. Here's a strawman proposal:
    def _pre_foo(self, a, b):
        # The pre-condition has the same signature as the function
        assert a > 0
        assert b > a

    def _post_foo(self, rv, a, b):
        # The signature inserts the return value in front!
        assert rv > b

    @dbc(_pre_foo, _post_foo) # design-by-contract decorator
    def foo(self, a, b):
        return a+b
In this example, _pre_foo and _post_foo are just names I picked; they are associated with the foo method by the @dbc decorator.
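One plausible implementation of such a @dbc decorator (handling positional arguments only, for brevity) might look like this:

```python
def dbc(precondition, postcondition):
    # Run the pre-condition on the arguments, call the method, then run
    # the post-condition with the return value inserted in front.
    def decorate(func):
        def wrapper(self, *args):
            precondition(self, *args)
            rv = func(self, *args)
            postcondition(self, rv, *args)
            return rv
        return wrapper
    return decorate

def _pre_foo(self, a, b):
    assert a > 0
    assert b > a

def _post_foo(self, rv, a, b):
    assert rv > b

class Num:
    @dbc(_pre_foo, _post_foo)
    def foo(self, a, b):
        return a + b

n = Num()
print(n.foo(1, 2))  # both contracts hold -> 3
```

Calling n.foo(2, 1) would trip the pre-condition (b > a fails) and raise AssertionError before the body runs.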
An alternative proposal could use an implicit binding based on naming conventions; then pre- and post-conditions could automatically be inherited, but the metaclass has to do more work.
But if you really want my opinion, I think these should not become a part of standard Python just yet -- I'd rather see others experiment with the ideas sketched here, write a PEP, and then we can talk about standardization.
I'm dropping the advanced and untried ideas for now, such as overloaded methods, parameterized types, variable declarations, and 'where' clauses. I'm also dropping things like unions and cartesian products, and explicit references to duck typing (the adapt() function can default to duck typing). Most of these (except for 'where' clauses) can be added back later without introducing new syntax when people feel the need, but right now they just act as red flags for the NIMPY (Not In My PYthon) crowd.
Most importantly, I'm dropping any direct connection to compile-time type checking or generating more efficient code. The adaptation wrappers will slow things down -- a price some people will gladly pay for the flexibility offered by adaptation and better run-time error checking. I expect that interface declarations will be helpful to PyChecker-like static bug finders and to optimizers using type inferencing, but these will have to deal with pretty much the entire range of dynamic usage that's possible in Python, or they will have to explicitly say that certain programming styles are not supported.
PEP 246 (Object Adaptation) has lots of good things to say about adaptation that I won't repeat.
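To give a flavor of what adaptation involves, here is a much-simplified, PEP-246-inspired sketch: a registry maps (type, protocol) pairs to adapter functions, and instances of the protocol pass through unchanged. The registration API here is invented for illustration; PEP 246 itself specifies a richer protocol involving __conform__ and __adapt__:

```python
_adapters = {}

def register_adapter(from_type, protocol, adapter):
    # Record how to adapt instances of from_type to the given protocol.
    _adapters[(from_type, protocol)] = adapter

def adapt(obj, protocol):
    # Instances of the protocol pass through unchanged; otherwise look
    # for a registered adapter, and fail loudly if there is none.
    if isinstance(obj, protocol):
        return obj
    adapter = _adapters.get((type(obj), protocol))
    if adapter is not None:
        return adapter(obj)
    raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))

# Example: adapt a string of digits to int.
register_adapter(str, int, int)
print(adapt("42", int))   # -> 42
print(adapt(7, int))      # already an int, passed through -> 7
```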
When deferring to adapt() for all our type checking needs, we could give built-in types like int and list a suitably wide meaning. For example:
def foo(a: int, b: list) -> list: return b[:a]
This should accept a long value for a, because (presumably) adapt(x, int) returns x when x is a long; and it should accept any sequence object for b. But what if I have a sequence object that doesn't implement sort()? That method isn't used here, but it's defined by the built-in list type, so won't the default duck adaptation to list fail here?
There are a few interesting ideas here (e.g. Eiffel conformance), but in practice we'll likely end up declaring a bunch of standard interfaces that finally define carefully what it means to be an integer, sequence, mapping, or file-like object (etc.), and we'll be writing things like this instead:
def foo(a: integer, b: sequence) -> sequence: return b[:a]
But most of the time, of course, you'll still be writing just this:

    def foo(a, b):
        return b[:a]
Guido van Rossum is the creator of Python, one of the major programming languages on and off the web. The Python community refers to him as the BDFL (Benevolent Dictator For Life), a title straight from a Monty Python skit. He moved from the Netherlands to the USA in 1995, where he met his wife. Until July 2003 they lived in the northern Virginia suburbs of Washington, DC with their son Orlijn, who was born in 2001. They then moved to Silicon Valley where Guido now works for Google (spending 50% of his time on Python!).