Guido van Rossum is the author of Python, an interpreted, interactive object-oriented programming language. In the late 1980s, Van Rossum began work on Python at the National Research Institute for Mathematics and Computer Science in the Netherlands, or Centrum voor Wiskunde en Informatica (CWI) as it is known in Dutch. Since then, Python has become very popular among developers, who are attracted to its clean syntax and reputation for productivity.
In this interview, which is being published in six weekly installments, Van Rossum gives insights into Python's design goals, the source of Python programmer productivity, the implications of weak typing, and more:
Bill Venners: I've met many people who like Python because they feel more productive using it. When I went from C++ to Java, I found myself way more productive in Java. People who know both Java and Python usually tell me they are way more productive in Python. Where does the productivity come from when programming in Python?
Guido van Rossum: There are many different sources. One is that Python requires a lot less typing.
Bill Venners: Finger typing?
Guido van Rossum: Finger typing. It wouldn't surprise me if the amount of typing Python requires is five times less than Java for a typical piece of code. That would be the ratio. When you have that much less code, it's so much easier to maintain, and also to change.
This is all very informal, but I heard someone say a good programmer can reasonably maintain about 20,000 lines of code. Whether that is 20,000 lines of assembler, C, or some high-level language doesn't matter. It's still 20,000 lines. If your language requires fewer lines to express the same ideas, you can spend more time on stuff that otherwise would go beyond those 20,000 lines.
A 20,000-line Python program would probably be a 100,000-line Java or C++ program. It might be a 200,000-line C program, because C offers you even less structure. Looking for a bug or making a systematic change is much more work in a 100,000-line program than in a 20,000-line program. For smaller scales, it works in the same way. A 500-line program feels much different than a 10,000-line program.
Bill Venners: Do you consider less finger typing the main source of productivity when programming in Python? Why do people feel more productive?
Guido van Rossum: Another source of productivity is Python's powerful built-in data types. Take arrays, for example. Python does not require you to declare an array size. You don't need to worry that what you want to hold doesn't fit in the number of elements you've declared. Our array type is flexible. It has maybe a dozen methods. You can insert elements in the middle and everything else just moves up automatically. You can delete. You can also slice: work on a slice of an array at a time. If you want a function to act on a piece of an array, for example, you can just pass a slice as a single argument. You don't need to pass in the original array with a start-and-stop index.
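As a rough illustration of the array behavior described above, here is a minimal sketch using Python's built-in list type. The helper name `total` and the sample values are invented for this example and are not part of the interview.

```python
def total(values):
    """Sum whatever sequence is passed in; no start/stop indices are needed."""
    result = 0
    for v in values:
        result += v
    return result


numbers = [10, 20, 40, 50]    # no size is declared up front
numbers.insert(2, 30)         # insert in the middle; later items shift up
numbers.append(60)            # grow at the end
del numbers[0]                # delete the first element

middle = numbers[1:4]         # slice: a new list holding the items at index 1, 2, 3
middle_total = total(middle)  # the slice is passed as a single argument
```

The function never sees the original list or any index pair; it only receives the slice.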
Python has a very efficient and powerful dictionary type, which is like an associative array or hash. The dictionary type is used a lot internally by the Python implementation. That almost guarantees that it is efficient, because we spent a lot of time making the language efficient. But you can use the same dictionary data type for your application. Dictionary is probably the main data type that every application uses.
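To make the dictionary point concrete, the following sketch counts words with a plain `dict`. The function name `count_words` and the sample sentence are assumptions made up for illustration.

```python
def count_words(text):
    """Return a dictionary mapping each lowercase word to its number of occurrences."""
    counts = {}
    for word in text.lower().split():
        # get() returns 0 for a word that has not been seen yet
        counts[word] = counts.get(word, 0) + 1
    return counts


counts = count_words("the quick fox jumps over the lazy dog the end")
has_fox = "fox" in counts     # membership test by key
the_count = counts["the"]     # lookup by key
```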
Over the years Python's data types have collected a lot of useful functionality. The data types are powerful. That makes Python very powerful. And yet because Python has a very small set of data types, it doesn't feel like you have to spend a year learning the language before you know everything that's there. The minimal finger typing and powerful data types work together to make your program small, and make you feel more productive.
Bill Venners: To what extent do Python's weak typing and libraries help programmers feel productive?
Guido van Rossum: Weak typing probably helps you feel productive, because it lets you bend the rules a bit when deciding what goes into your array. You can have an array of numbers that occasionally has something else inside, if that doesn't bother you. A single sort method sorts arrays of completely heterogeneous elements, as long as you can compare them individually. You don't have to commit to integers and floating point numbers, for example. They all mix properly. You also have long integers, which are arbitrary precision integers. In many cases, this is useful, either because you need just a little more than 32 bits, or because you need a lot more than 32 bits. Long integers are available in Java, but they're in a separate package and are much harder to use because you don't have operator overloading. In Python, you can just write x + y, and it doesn't matter whether x and y are small integers or integer numbers with several thousand bits.
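A small sketch of the numeric behavior described here, written for present-day Python 3, where the separate long integer type has been merged into `int`. The function name `add` and the sample values are hypothetical.

```python
def add(x, y):
    """The same expression serves small integers, floats, and very large integers."""
    return x + y


mixed = [3, 1.5, 2, 10**30]   # integers and a float in one list
mixed.sort()                  # one sort call; the elements compare with each other

small = add(2, 3)             # ordinary small integers
big = add(2**100, 1)          # well beyond 32 bits, same operator
huge = 3**5000                # an integer with several thousand bits
doubled = add(huge, huge)
```

No special class or method call is needed for the large values; `x + y` is the whole interface.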
People also feel more productive due to the "batteries included" feeling: the standard library includes many things. The standard library has implementations of the various Internet protocols—HTTP, FTP, NNTP, IMAP, POP, SMTP. There's a higher-level API to which you just give a URL and it returns something that acts like a file. You can just read bytes from it and it gives the data corresponding to that URL on the remote site. A lot of necessary functionality is already in the libraries, available with minimal typing.
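The higher-level URL API mentioned above can be sketched as follows. This uses `urllib.request.urlopen` from current Python 3, which may differ from the module layout at the time of the interview. The helper name `read_url` is hypothetical, and a `data:` URL is used only so the example runs without network access.

```python
from urllib.request import urlopen


def read_url(url):
    """Open a URL and return its content as bytes, reading it like a file."""
    with urlopen(url) as response:
        return response.read()


# A data: URL carries its payload inline, so no remote site is contacted here.
payload = read_url("data:text/plain,hello%20world")
text = payload.decode("ascii")
```

Passing an `http://` or `https://` URL to the same helper would return the bytes served by the remote site instead.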
In addition to the standard library, the Python community has produced many third-party libraries. For years the standard library didn't have cookie support, but there was an excellent third-party cookie module. About one or two years ago that cookie module was added to the standard library; however, it was available to people who needed it long before then. All those aspects let people feel more productive.
Bill Venners: I've heard that one of the problems with Python is that it invites less design. Because people can so easily write their programs quickly in Python, they tend to just charge ahead without any planning. Do you think verbosity, such as the verbosity of Java compared to Python, tends to make people want to plan more?
Guido van Rossum: Planning is good and bad. If you know where you're going, planning is good. If you don't know what you'll encounter on the way, you should be more open-minded and improvisational. I certainly see a place for planning, but if the language forces you to plan everything, there may be trips you'll never undertake because it would require too much thinking ahead. You're then inhibited by fears that you don't know how to do something. In Python, you can start doing that something and discover how to do it on the way. You can build something quickly, get it on the road, obtain feedback, and then design the next one based on greater understanding of the problem domain.
Bill Venners: I often feel that creating software is like mixing cement. It starts out soft and hardens over time as you stir it. You can't see everything in advance, so you can't plan everything up front. You can make a plan, but once you try it you realize it would be much better to do it another way. So you iterate. Each time you iterate you move towards a better solution, and the body of code that depends on your choices grows. Eventually it all just kind of hardens. You end up with released interfaces that you're stuck with.
Guido van Rossum: And then you really have to be backwards compatible.
Bill Venners: So perhaps being able to make changes gets you farther ahead before things harden. Maybe things don't harden as early in Python because you go through the iterations more quickly.
Guido van Rossum: Yeah. I always feel that the interesting programming jobs are the ones in which you don't know exactly where you'll end up. Implementing another spreadsheet is boring. There have been many spreadsheets. We know what the user interface should be like. We know what the right implementation techniques are. We know a whole bunch of things. Yes, you can write another spreadsheet. You can probably plan it very well, because you can do a feature-by-feature checklist of all the other known spreadsheets. I want this. I want that. I want to fix this problem with this particular one. I want to avoid that bug. That's easy, but it's not very interesting.
I like programming problems where you think, "There has to be something really interesting over there, but I can't see it clearly." All you can do is move one step over there, with a small bit of code, and start exploring to see it more clearly. And maybe it actually wasn't there, it was over here. Or it had a different shape than you thought initially. Maybe it wasn't interesting at all, and you didn't waste a lot of time.
The danger of planning is that you plan for the contingencies you know about, but by definition you don't plan for things you don't know you'll encounter. So when you do encounter an unexpected event in your programming endeavor, you have to fix many interfaces and change multiple method signatures. If you've already committed to your original plan and that's no longer where you're going, then you have a problem.
I'm not particularly worried by the fact that people say you can prototype more easily in Python, but eventually the Java version makes it easier to build a robust large system. You can prototype in Python. Once you've explored the space more, you can do the planning and design that the Java version requires. If you start writing in Java knowing as little as you did when you started writing the Python version, you'll waste way more time exploring than actually building the system you'll eventually build.
The Python world has some nice examples of this. Early on, when the Web was just becoming interesting for things like shopping online, a small company called eShop was developing various commerce servers. eShop had proprietary protocols and proprietary applications, and it realized that it should just be able to use a Web server and Web browser. The developers decided to do a prototype in Python. Because they used Python, the eShop developers were the first to release a beta version compared to many other startups doing exactly the same thing. They never actually released working code after their beta release, because the company was acquired by Microsoft. Microsoft spent two or three product revisions to eventually replace all the Python code with C++. But if eShop hadn't started in Python, it would never have released something interesting enough for Microsoft to acquire in the first place.
That also happened with Yahoo. Yahoo Mail started out as a successful Python application. Again, because the developers used Python, they could respond quickly to the user feedback. And that's an application that almost everybody can use. They saw many things wrong with their application, and they responded to that quickly and added new features. Because they were doing something new, they didn't know exactly what people would need from an email Web application. It is different from a program that has your email on your computer. Access times are different. All sorts of things are different. So they were learning about what those differences were. And again, I think Yahoo may now have replaced all the Python code with C++ or some other language, but the Python prototype was essential in order to get there.
Come back Monday, February 3 for Part IV of this conversation with Python creator Guido van Rossum.
Bill Venners is president of Artima Software, Inc. and editor-in-chief of Artima.com. He is author of the book, Inside the Java Virtual Machine, a programmer-oriented survey of the Java platform's architecture and internals. His popular columns in JavaWorld magazine covered Java internals, object-oriented design, and Jini. Bill has been active in the Jini Community since its inception. He led the Jini Community's ServiceUI project that produced the ServiceUI API. The ServiceUI became the de facto standard way to associate user interfaces to Jini services, and was the first Jini community standard approved via the Jini Decision Process. Bill also serves as an elected member of the Jini Community's initial Technical Oversight Committee (TOC), and in this role helped to define the governance process for the community. He currently devotes most of his energy to building Artima.com into an ever more useful resource for developers.