For languages and published APIs, breaking compatibility has a cost, but so does creeping entropy.
Guido van Rossum recently made the slides from his talk on Python 3000 at ACCU 2006 available for download. I was not at ACCU 2006, but I did see Guido give a practice run of the same talk a few weeks ago at a local ACCU meeting in San Jose. In this talk, Guido discussed the philosophy and features of Python 3000, a major new version of Python that will be developed alongside the current stream of 2.x versions. One of the main bullet points on Python 3000 philosophy is:
Allow incompatible changes; within reason
When developing new versions of languages and published APIs, maintaining backwards compatibility is usually one of the designer's main concerns. By maximizing backwards compatibility, the designer minimizes the cost for existing users of the language or API to upgrade to the new version. However, with each new release there are invariably changes that would improve the design, but which would break backwards compatibility. Over time, always choosing backwards compatibility over design improvement means that entropy creeps into the language or API. This entropy too has a cost: the language or API becomes more complex and confusing for developers to use.
One of the most basic kinds of breaking change a designer can make in a new version is to remove features that had been deprecated in previous versions. For example, Python 2.2 introduced "new-style" classes, to provide Python with a unified object model with a full meta-model. Old-style classes, which starting with version 2.2 were referred to as "classic" classes, were still supported for backwards compatibility. The intention, however, was that new code should use new-style classes. One of the first things Guido did on the Python 3000 codebase was remove classic classes. "It felt good to rip them out," he said.
New versions of Java have made breaking changes at the source level too. For example, the introduction of the assert keyword in version 1.4 broke code that used assert as a variable or method name (such as just about any JUnit TestCase, whose calls to assert() needed to be changed to assertTrue()). Nevertheless, my observation is that the Python community has been more willing to accept breaking changes than the Java community. As Guido put it, "Java has a very long backwards compatibility cycle." Many methods in the Java API have been deprecated, but I know of none that have actually been removed. The result is that Java has over time become more and more bloated with historical design artifacts, while Python remains lean, mean, and clean.
What do you think is the correct tradeoff between backwards compatibility and entropy reduction for Python and Java? Does Python go too far to clean things up from release to release? Does Java not go far enough?
An idea that I like is to drop a deprecated method X versions or Y years, whichever is longer, after it is first deprecated.
So for java.util.Date, the methods were deprecated in 1.1 and could have been dropped for 1.3. From my recollection of the 1.3 days, I was more interested in what was coming in 1.4 than in making sure I wasn't using features of 1.3 not in 1.1.
Back in the day when applets were thought to be the key, I can see how 100% backwards compat. was important. Nowadays, though, who doesn't specify what JRE must be installed before running a Java app?
As long as Python 3000 doesn't become Perl 6 or Duke Nukem Forever, it's good to break compatibility. I prefer using a language that implements current state of the art ideas, if at all possible.
I don't care about backward compatibility if the new version also comes with all the batteries. If new Python comes without the batteries, that will be a problem.
So, as long as it comes out relatively quickly and with batteries, then great. I think there are many languages out there that are better than Python, but they have fewer or worse batteries.
It would be nice to produce a compatibility module or tool to minimize the migration pain.
It would also be nice if Python 3000 could be embedded right inside Python 2.4.x and, vice versa, Python 2.4.x could be embedded into Python 3000. This kind of dual-Python environment could greatly ease the migration by allowing people to slowly phase into Python 3000 and phase out of Python 2.4.x. I've seen a demo somewhere where a hacker has embedded Lua into Python and Python into Lua (sorry I forgot his name!). I think that's a great idea for Python 3000.
Whatever the Python community decides to do, I do pray they break compatibility as much as possible in order to make/keep Python a state of the art, highly dynamic, highly interactive, yet relatively simple and elegant language.
> Back in the day when applets were thought to be the key, I can see how 100% backwards compat. was important. Nowadays, though, who doesn't specify what JRE must be installed before running a Java app?
Even if you specify the JRE, API compatibility is not just an issue for the J2SE, but also for third-party libraries. That's potentially an even bigger issue, especially if a library is widely used.
There are now many widely used Java libraries outside the JDK (e.g., Apache Commons), and since many third-party tools depend, in turn, on those libraries, API entropy can become a real problem. I've seen some APIs try to get around this by versioning the API itself: before using the API, you'd query for the current version. But surely that leads to code explosion, since you'd possibly have to cover all API versions. Most people would just be conservative and fall back to a lowest common denominator.
The other problem is that a user wanting to use a tool that requires the new version of a library, if his own code uses the old version, would now be forced to invest the time to upgrade to that new library version. If such changes are frequent or significant, then the cost of using a platform or language or API really can increase over time, all in the name of making the future better especially for new users.
Yes, in Java this is really disgusting. You try to keep your code clean and refactor away all deprecated methods, and for what?
After years the old crap stays in, and you have to justify a) why you made the effort of refactoring and b) the glitches you introduced by refactoring (or better, "adapting") the codebase.
On the other hand: I personally use Java 5 and 6 only at home. At work we are stuck with 1.4 (WebLogic still has issues with its Java 5 implementation), or rather 1.3, since we have to maintain compatibility with J# projects.
The point is: sometimes, especially at enterprise scale, you simply cannot follow all trends; you need a veeery long breath to support deprecated features.
Or perhaps not: perhaps it would be better to introduce incompatibility to force a decision. As long as it is possible to carry on as before, that is what will be done (in some organizations).
Some years ago I read an editorial on Windows compatibility. The author made the comparison with owners of '60s Beetles (Käfer) asking Volkswagen to provide compatibility for their roof racks with their '90s Rabbits (Golf). It is obvious that such a request is ridiculous, but in software we think of it as an essential feature?
> It would be nice to produce a compatibility module or tool to minimize the migration pain.
Guido discussed migration in his talk. He said that for the kind of changes they are considering, not everything could be mechanically translated, because some of the changes are semantic, not simply syntactic. Moreover, Python's dynamic typing makes it a bit harder to know whether an invocation of a method named draw() is an invocation of draw on a class whose draw() method has been renamed paint().
But even if you have static typing, such as in Java or C++, a mechanical translator doesn't always solve every problem. Many years ago, I wrote several DSLs that generated C++ code that called into an early version of the Object Windows Library from Borland. Borland decided on a subsequent release to refactor the API in ways that would break clients, and provided a migration tool that would run through and convert old-style client code to new-style. Unfortunately, since much of my code was generated, I still had to do much of this work by hand.
Guido suggested that a pychecker-like tool would likely be able to do 80% of the job for you, hopefully more, but that you'd probably still have to do some of it by hand.
> Guido discussed about migration in his talk. He said that for the kind of changes they are considering, not everything could be mechanically translated, because some of the changes are semantic, not simply syntactic.
But I wonder if Guido's arguments are correct? On the Py3K mailing list he argues that division i/j will change its behaviour when i and j are integers. While Python 2.X assumes floor division for integers, Py3K will return a float. But you never know a priori whether i and j are indeed integers because they are untyped! So that's a hard issue for a translator, isn't it?
The obvious solution to this problem is to replace i/j with a function call on (i,j), namely olddiv(i,j).
Here olddiv is a Py3K function and the replacement takes place in the translation phase of Python 2.X code. So each division expression in Python 2.X code is uniformly replaced by a call to olddiv. This requires that Python 2.X semantics can be expressed using Py3K statements.
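A minimal sketch of that idea in present-day Python 3, offered as an illustration rather than as anything proposed on the mailing list. The body of olddiv, the DivToOlddiv class, and the translate helper are assumptions made for this example. It also assumes the old source is otherwise parseable by the new interpreter and uses only built-in numeric types, and it leaves the augmented form `/=` untouched.

```python
import ast


def olddiv(i, j):
    """Assumed implementation: Python 2.x-style `/`.

    Floor division when both operands are integers, true division otherwise.
    """
    if isinstance(i, int) and isinstance(j, int):
        return i // j
    return i / j


class DivToOlddiv(ast.NodeTransformer):
    """Hypothetical translator pass: rewrite every `a / b` into `olddiv(a, b)`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested divisions first
        if isinstance(node.op, ast.Div):
            call = ast.Call(
                func=ast.Name(id="olddiv", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def translate(source):
    """Compile `source` with every `/` expression replaced by an olddiv call."""
    tree = DivToOlddiv().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return compile(tree, "<translated>", "exec")


# The translated code needs olddiv in its namespace at run time.
namespace = {"olddiv": olddiv}
exec(translate("a = 7 / 2\nb = 7.0 / 2\nc = (9 / 2) / 2"), namespace)
```

The translator never needs to know the operand types, because the type check is deferred to olddiv at run time. The price is an extra function call and type test on every division.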
> ... you never know a priori whether i and j are indeed integers because they are untyped! So that's a hard issue for a translator, isn't it?
This goes back to ye olde static versus dynamic typing debate. In a statically typed language like Java, a migration tool has more information to work with, and will usually be able to do a more thorough, consistent job of automatically modifying code broken by a breaking API or language change. Yet the Java community seems less willing to accept breaking changes in the name of cleanup than the Python community. I'm not sure why that is.
I really do care about backwards compatibility. Although I'd rather not look at the extra cruft that builds up over time, in general I'd rather not pay for the time it takes to get my old code to work with the new release.
One other Java issue is incomplete implementation of new features / technology. Swing is especially bad. For example, in JDK 5, JComboBox, JList etc. still work only with arrays and Vectors, not Collections. So you convert all your code to use cool new parametrized Collections, work which is probably worth it anyway, then you do icky hacks like myList.toArray() to work with Swing.
That's *FOUR* major releases (1.2, 1.3, 1.4, 5) and still Swing hasn't upgraded.
Two examples I find striking in the Java API are the Integer method that reads system properties and the lack of set operations in the Set interface.
Integer as a class has nothing to do with system properties. So why does it convert system property strings into integers? This shows a clear lack of object-oriented design, as there should be a system properties object with a method getInt() or similar.
The name Set suggests some mathematical connotation and in some respects it fulfills the expectation. However, you might want to find the intersection of two sets.
While my experience with Python is much less, I do feel that Java requires a big spring clean in its core API. This will cost running projects, but many of those haven't even bothered to migrate from 1.3. But it is costing Java in new projects where there is less legacy or more resources for a redesign that can use new features.
New language features are one thing, but an API needs to migrate with time as well. And Java has too much 1.0 garbage left.
Give it a complete cleaning for Dolphin? They should just accelerate all deprecations to end of life. Considering how conservative the people are who keep using deprecated features, and how far off Dolphin's release is, why wouldn't this be a good idea?
> Kay Schluehr wrote:
> > ... you never know a priori whether i and j are indeed integers because they are untyped! So that's a hard issue for a translator, isn't it?
> This goes back to ye olde static versus dynamic typing debate. In a statically typed language like Java, a migration tool has more information to work with, and will usually be able to do a more thorough, consistent job of automatically modifying code broken by a breaking API or language change.
But a sufficiently reflective system should provide the same amount of information at runtime. Proposing wrapper functions like olddiv should help, modulo the additional runtime penalty for type checks. That said, I don't claim that Python has those reflective capabilities yet. For instance, one cannot determine in which module a certain function is defined. Using relative imports is always good advice, but what do you do if they are not used?
> Yet the Java community seems less willing to accept breaking changes in the name of cleanup than the Python community. I'm not sure why that is.
Maybe because there hasn't been much pressure on Python yet, while Java people also care a lot about binary compatibility, which is a non-issue to Pythonistas, mostly due to the open source character of the Python project.
> I really do care about backwards compatibility. Although I'd rather not look at the extra cruft that builds up over time, in general I'd rather not pay for the time it takes to get my old code to work with the new release.
If there were always gateways for code migration via translators that could be plugged into import hooks, more aggressive language refactorings would indeed be possible. But this would alter the character of the language, because one would simply couple a piece of code with a language version and would do a language-oriented refactoring only if one needed new features that were introduced in a later, incompatible release. It would be a somewhat strange practice, rather like revising 18th-century English texts while preserving their ambiance and avoiding anachronisms.
I've held the opinion for years that there should be a fixed roadmap for removing deprecated functionality.
Keep it in for one release after it's deprecated, then remove it. Maybe keep the method call one release further but have it throw a RuntimeException when accessed. A special DeprecationException should be created for that.
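The roadmap is language-neutral, so here is a rough sketch of it in Python rather than Java. The deprecated decorator, its since and current parameters, and the integer release numbers are all hypothetical names invented for this illustration. DeprecationException mirrors the exception proposed above and is not an existing library class.

```python
import functools
import warnings


class DeprecationException(RuntimeError):
    """Raised when a call reaches a function that is past its grace period."""


def deprecated(since, current):
    """Hypothetical decorator implementing the roadmap.

    since:   release number in which the function was deprecated
    current: release number of the running library
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if current - since > 1:
                # More than one release after deprecation: the stub
                # remains, but every call fails loudly.
                raise DeprecationException(
                    f"{func.__name__} was deprecated in release {since}"
                )
            # Grace period: still works, but warns the caller.
            warnings.warn(
                f"{func.__name__} is deprecated since release {since}",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorate


@deprecated(since=3, current=4)
def get_year():
    return 2006


@deprecated(since=3, current=5)
def get_month():
    return 4
```

Under these assumptions, calling get_year() still returns its value and emits a DeprecationWarning, while calling get_month() raises DeprecationException. The final step of the roadmap would be to delete the function altogether.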
That gives application and library developers on average (given the JDK release cycle) 18 months to release an updated version of their application or library.
And end users can, if they so desire, always keep several JREs installed side by side. Every production build from 1.1 upwards is still available from Sun, and if a piece of software is released on CD it usually comes complete with a compatible JRE anyway.