The Artima Developer Community

Contracts in Python
A Conversation with Guido van Rossum, Part IV
by Bill Venners
February 3, 2003
Summary
Python creator Guido van Rossum talks with Bill Venners about the nature of contracts in a runtime typed programming language such as Python.

Guido van Rossum is the author of Python, an interpreted, interactive, object-oriented programming language. In the late 1980s, Van Rossum began work on Python at the National Research Institute for Mathematics and Computer Science in the Netherlands, or Centrum voor Wiskunde en Informatica (CWI) as it is known in Dutch. Since then, Python has become very popular among developers, who are attracted to its clean syntax and reputation for productivity.

In this interview, which is being published in six weekly installments, Van Rossum gives insights into Python's design goals, the source of Python programmer productivity, the implications of weak typing, and more.

In this installment, Van Rossum discusses the nature of contracts in a runtime typed programming language such as Python.

Methods without Contracts

Bill Venners: One promise of object-oriented programming is that a strong separation of interface and implementation facilitates change. Because client code links just to the interface, I can change method implementations and private data. I can make those kinds of changes without breaking existing client code, because that code is not coupled to the method implementations or private data. Client code is just coupled to the method signatures that make up the interface. Separating interface and implementation is really just a way of enabling change by minimizing coupling.

But this promise of easier change requires that the various people who define the interface, implement the interface, and write the client code all agree on what the interface methods mean. The interface implies a contract. If an add method is supposed to add two numbers, I can change my implementation to add more efficiently, but the method must continue to perform addition. For example, if I change the implementation such that one number is subtracted from the other, I will most likely break client code at runtime even though the client code will still link at compile time.

In a weakly typed system like Python, variables don't have types. Say I write a method and it takes two parameters, x and y. If I invoke a method on x, the method invocation will succeed at runtime if a method of that signature is declared in the object referenced by x. But I don't necessarily know what that method will do.

In a strongly typed language, each variable has a type that's known at compile time. If I attempt to invoke a method on a variable whose type doesn't declare that method, the compiler of a strongly typed language tells me of the error. That's one difference between a strongly typed and weakly typed language. In a strongly typed language, I discover such errors at compile time. In a weakly typed language, I will hopefully find out at runtime. But there's another difference. In a strongly typed language, a variable's type implies the object it references will have a particular interface. That interface doesn't just tell me at compile time what method signatures exist, it also tells me what those methods mean. It tells me what the methods promise to do.

For example, both an Artiste object and a GunSlinger object might have a method with the signature void draw(). When you invoke draw on the Artiste, it sketches a picture. When you invoke draw on the GunSlinger, it draws its pistol and shoots. In a strongly typed language, I can figure out if the object will sketch or shoot by looking at the variable's type. I know the type, and therefore draw's meaning, at compile time. In a runtime typed language like Python, I can find out draw's meaning by looking at the type at runtime, but that's rarely done in practice. I never check types at runtime when I write in Python. I just invoke draw assuming it means something, but with no guarantee.
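As a rough illustration of the draw example in Python: only the names Artiste, GunSlinger, and draw come from the text above. The method bodies and the perform function are assumptions made for this sketch.

```python
class Artiste:
    def draw(self):
        # For an artiste, draw means sketching a picture.
        return "sketches a picture"


class GunSlinger:
    def draw(self):
        # For a gunslinger, draw means pulling a pistol and shooting.
        return "draws a pistol and shoots"


def perform(x):
    # Hypothetical client code: no type check, just call draw and
    # trust that it means what the caller intended.
    return x.draw()


print(perform(Artiste()))
print(perform(GunSlinger()))
```

Both calls succeed because each object has a draw method. Nothing in perform says which meaning of draw the caller had in mind.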

Why does that work in practice? Why do Python programs work, if no one is certain what the method will do when it's invoked?

Guido van Rossum: That sounds like an irrational fear to me. In my experience, designing the interface is often the hardest part. The flexibility to change the implementation without changing the interface works well only for certain things. If you must sort by zip code, for example, you can start with a simple sort implementation. If later you find your initial sort implementation is not fast enough, you can do more work on it. That's the classic way you use interfaces.
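The zip code case might look like the following minimal sketch. The function names, the (name, zip_code) tuple layout, and the sample data are all assumptions for illustration; the interview names no concrete code.

```python
def sort_by_zip(addresses):
    """Return a new list of (name, zip_code) pairs ordered by zip code."""
    # First version: a simple insertion sort, easy to write and check.
    result = []
    for entry in addresses:
        i = len(result)
        while i > 0 and result[i - 1][1] > entry[1]:
            i -= 1
        result.insert(i, entry)
    return result


def sort_by_zip_fast(addresses):
    """Same contract as sort_by_zip, using the built-in sort."""
    return sorted(addresses, key=lambda entry: entry[1])


addresses = [("Ann", "94040"), ("Bob", "10001"), ("Cy", "60601")]
# Callers cannot tell the two implementations apart.
assert sort_by_zip(addresses) == sort_by_zip_fast(addresses)
```

Because both functions take the same argument and return the same result, the faster one can replace the simple one without any change to client code.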

But in many situations you find that when people design interfaces, in the next program revision or after user feedback on a library, the interfaces turn out to have been designed improperly. Perhaps certain information was kept private. Perhaps certain data that would be useful to the consumer never leaves a method. Maybe the data is redundant, in that you can calculate it from the data the method returns. But because it's calculated as a side effect of an algorithm, it's a shame to throw that away, because it forces the data's consumer to recalculate that information based on the data she receives.

Python Has Implicit Contracts

Bill Venners: I have a theory that 99 percent of the time when I invoke a method in Python it will do what I expect, because in practice it usually does. When I invoke a method in a strongly typed language, 1 percent of the time a bug or misunderstanding may cause the method to do something unexpected. Because even though the contract is clear from a variable's type in a strong typing system, it doesn't mean people don't break the contract. People do break contracts, because of bugs and misunderstandings. I suspect that in both strongly and weakly typed languages, situations in which invoked methods mean something unexpected are equally rare, and for the most part discovered and dealt with during testing.

Guido van Rossum: In Python, you have an argument passed to a method. You don't know what your argument is. You're assuming that it supports the readline method, so you call readline. Now, it could be that the object doesn't support the readline method.

Bill Venners: And then I'll get an exception.

Guido van Rossum: You'll get an exception, which is probably OK. If this is a mainline piece of code and something could possibly be passed to you that doesn't have a readline method, you'll discover that early on during testing. The same is true in a typed language when you have an interface and you know you're getting something that has the right interface, but it doesn't implement the right thing, or it throws an unexpected exception. You'll hopefully find that during testing.

In addition, in Python, because there aren't fixed protocols, something else can be passed that also supports readline and doesn't happen to be a file, but does exactly what you need. All you need at that point is something that returns lines.
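A small sketch of the readline case, under stated assumptions: CannedLines and first_line are hypothetical names invented here, while io.StringIO is the standard library's in-memory text stream.

```python
import io


class CannedLines:
    """Hypothetical non-file object that still supports readline."""

    def __init__(self, lines):
        self._lines = list(lines)

    def readline(self):
        # Like a file, return an empty string once the input is exhausted.
        return self._lines.pop(0) if self._lines else ""


def first_line(source):
    # Assumes only that source has a readline method taking no arguments.
    return source.readline().rstrip("\n")


print(first_line(io.StringIO("spam\neggs\n")))  # a file-like object
print(first_line(CannedLines(["ham\n"])))       # not a file at all

try:
    first_line(42)  # an int has no readline method
except AttributeError as exc:
    print("caught:", exc)
```

The first two calls work because each argument returns lines. The third fails with an exception the first time that code path runs, which is the kind of failure testing tends to surface early.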

Bill Venners: But it's also possible to pass something that supports readline that does something I don't expect.

Guido van Rossum: In general in Python, there is a contract, but the contract is implicit. The contract isn't specified by an interface. There's nothing, at least in what the parser sees, that says x has to be an object that supports readline, that you can call it with no arguments, and that it returns a string that means a certain thing. But that contract is certainly in the documentation or specification.

In Java, if you say this is something that has a readline method that returns a string, what does it mean? Do you expect it to always return the same string? Does it ever return an empty string? There are all sorts of things that aren't expressed by that interface that you still have to specify in documentation. That's where the interesting competition between the different languages exists.
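The gap between a declared signature and its documented meaning can be sketched in Python as well. This example uses typing.Protocol as a stand-in for a Java interface; that feature arrived in Python 3.8, well after this interview, and the LineSource and Stuck names are assumptions made for illustration.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class LineSource(Protocol):
    def readline(self) -> str:
        """Return the next line, or '' at end of input.

        This meaning is documented here but not enforced anywhere.
        """
        ...


class Stuck:
    def readline(self) -> str:
        # Matches the signature, yet always returns the same string
        # and never signals end of input.
        return "again\n"


source = Stuck()
print(isinstance(source, LineSource))          # the structural check passes
print(source.readline() == source.readline())  # same string every time
```

The explicit interface confirms only that a readline method exists. Whether it ever returns an empty string, or keeps returning the same line, still has to be settled by documentation.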

Is the Code the Contract?

Bill Venners: In Learning Python (O'Reilly, 1999), Mark Lutz and David Ascher introduce methods with an example, times(x, y), that returns x * y:

>>> def times(x, y):
...    return x * y
...
The authors then give two examples of invoking the times method, one that takes two integers and one that takes a string and an integer:
>>> times(2, 4)
8
>>> times('Ni', 4)
'NiNiNiNi'
When 2 and 4 are passed, the times method returns 8. When 'Ni' and 4 are passed, the times method returns 'NiNiNiNi', because the * operator on sequences (such as strings or lists) means to repeat the sequence. The authors then explain this by saying, "Recall that * works on both numbers and sequences; because there are no type declarations in methods, you can use times to multiply numbers or repeat sequences."

Doesn't the lack of type declarations on method parameters make it hard to change code, especially in public libraries? People can define a class and overload * to mean anything. They can then pass in an instance of their class to times, and use the times method in a way the method's designer never imagined, because * means something completely different. If I want to change how I multiply in the times method, I am likely breaking that code. Is the contract of a Python method not the code of the method?
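The concern can be made concrete with a short sketch. Only times comes from the book example; Path and times_by_addition are hypothetical, and the rewrite assumes its second argument is a non-negative integer.

```python
def times(x, y):
    return x * y


def times_by_addition(x, y):
    # Hypothetical "equivalent" rewrite: multiply by repeated addition.
    # Assumes y is a non-negative integer.
    total = 0
    for _ in range(y):
        total = total + x
    return total


class Path:
    """Hypothetical class that overloads * to mean 'repeat these steps'."""

    def __init__(self, steps):
        self.steps = list(steps)

    def __mul__(self, count):
        return Path(self.steps * count)


print(times(2, 4), times_by_addition(2, 4))
print(times(Path(["N"]), 3).steps)

try:
    times_by_addition("Ni", 4)  # 0 + 'Ni' is not defined
except TypeError as exc:
    print("rewrite broke an existing caller:", exc)
```

For integers the two versions agree, but callers that relied on * for strings or for Path stop working after the rewrite. In that sense the operations applied to the arguments have become part of the method's implicit contract.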

Guido van Rossum: To some extent, when you're writing a public library, it is hard to change which methods or operations you apply to your arguments. I don't think that's specific to runtime typed languages though.

What the client passes in, the contract, is more restricted in a strongly typed language, but that probably gets in the way just as often as it's helpful. It is helpful because it makes clear that there's this minimal interface that you both have to agree on. It makes things difficult because maybe in the next library version I would like to use more properties, but those properties aren't part of the interface I said I was committed to, so then I can't do that.

Next Week

Come back Monday, February 10 for Part V of this conversation with Python creator Guido van Rossum. If you'd like to receive a brief weekly email announcing new articles at Artima.com, please subscribe to the Artima Newsletter.

Talk Back!

If programs written in runtime typed languages like Python work fine, despite the absence of explicit contracts known at compile time, what do you think is the real benefit of explicit contracts? Have an opinion? Discuss this article in the News & Ideas Forum topic, Contracts in Python.

Resources

Python.org, the Python Language Website:
http://www.python.org/

Microsoft press release about their acquisition of EShops, Inc.:
http://www.microsoft.com/presspass/press/1996/jun96/eshoppr.asp

Introductory Material on Python:
http://www.python.org/doc/Intros.html

Python Tutorial:
http://www.python.org/doc/current/tut/tut.html

Python FAQ Wizard:
http://www.python.org/cgi-bin/faqw.py

Guido van Rossum's home page:
http://www.python.org/~guido/

Other Guido van Rossum Interviews:
http://www.python.org/~guido/interviews.html


Copyright © 1996-2014 Artima, Inc. All Rights Reserved.