The Concurrency chapter is finished (hurray!), and now I re-wade into the mysteries of Java Generics. The chapter isn't looking as bad as I was remembering, but there are still some issues that I'm struggling to understand and explain.
For example, someone pointed out that Bertrand Meyer had gone into depth on covariance and contravariance in his book "Object-Oriented Software Construction" (I have the 2nd edition). I feel reasonably good about my grasp of covariance, but I was hoping that he might enlighten me about contravariance (that's the <? super Foo> notation). Alas, his conclusion is this, on page 626:
Contravariance, as a result, leads to simpler mathematical models of the inheritance-redefinition-polymorphism mechanism. For that reason a number of theoretical articles have advocated contravariance. But the argument is not very convincing, since, as we have seen and as the literature readily admits, contravariance is of essentially no practical use. ... So rather than trying to force a covariant body into a contravariant suit, we should accept the reality for what it is, covariant, and study ways to remove the unpleasant effects.
No particular gold to mine there. But it is interesting to see Meyer do a wholesale dismissal of contravariance, a feature which is built into J2SE5. And confusing -- did Meyer not want to put contravariance into Eiffel and therefore argue that it is a misfeature, or is it genuinely a misfeature in J2SE5? It's certainly not something that you use terribly often, although when you need it -- and so far I've only plugged it in on the basis of compiler complaints -- there doesn't seem to be any other solution.
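For what it's worth, here is a minimal sketch of the kind of situation where the compiler pushes you toward <? super ...>. The names (SuperWildcardDemo, fill) are made up for illustration; they don't come from Meyer or from the JDK. The method only writes Integers into the list, so any list whose element type is a supertype of Integer is a safe target.

```java
import java.util.ArrayList;
import java.util.List;

class SuperWildcardDemo {
    // Hypothetical helper: writes Integers into any list able to hold them.
    // List<? super Integer> accepts List<Integer>, List<Number> and List<Object>.
    static void fill(List<? super Integer> sink, int count) {
        for (int i = 0; i < count; i++) {
            sink.add(i); // safe: every permitted element type is a supertype of Integer
        }
    }

    public static void main(String[] args) {
        List<Number> numbers = new ArrayList<Number>();
        List<Object> objects = new ArrayList<Object>();
        fill(numbers, 3);
        fill(objects, 2);
        // Had fill() been declared with a plain List<Integer> parameter,
        // neither call above would compile.
        System.out.println(numbers); // prints [0, 1, 2]
        System.out.println(objects); // prints [0, 1]
    }
}
```

The flip side is that inside fill() you can only read elements back as Object, which is probably why this form shows up mostly on "consumer" parameters.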
In any event, I'm still waiting for an insight to explain contravariance.
My second question for this weblog entry concerns the neverending erasure issue. In the interim since the last time I was struggling with this chapter, it occurred to me that there might be another possible option instead of using erasure.
Erasure solves the problem of so-called "migration compatibility," so that generified clients can be used with non-generified libraries, and vice versa. But what if the Java standard library had been forked at this point, into a generic version and a pre-generic version? It seems to me that this might have worked, but I may not have thought about all the implications yet, and one of those implications may have prevented forking.
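To make "migration compatibility" concrete, here is a small sketch under my own assumptions: legacyAppend is a hypothetical stand-in for a method in a library that was never generified, and ErasureDemo is just a name I picked. It shows a generified client handing its list to raw-typed code, and that the two parameterizations share one runtime class because the type arguments are erased.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Hypothetical pre-generics library method: it only knows the raw List type.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void legacyAppend(List list, Object item) {
        list.add(item);
    }

    // After erasure, List<String> and List<Integer> are the same class at runtime.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        return strings.getClass() == ints.getClass();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        legacyAppend(names, "generic client, raw library"); // compiles without a cast
        System.out.println(names.get(0));
        System.out.println(sameRuntimeClass()); // prints true
    }
}
```

With a forked library, the call to legacyAppend would presumably need an adapter or a copy, since the generic and pre-generic List would be unrelated types.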
An argument against forking is the typical one, which I admit is usually legitimate: you've suddenly multiplied the required effort to maintain the forked libraries. But if that was really the reason that we have erasure, I would argue that the erasure solution moves the burden of effort from the producers of Java onto the consumers, where the total effort required (and productivity lost) will be much, much greater. So I'm hoping that the answer to this isn't: "yes, we could have forked the library but we didn't want to do the work, so you got erasure instead."
As an aside, C# also does a kind of erasure, but it's not as severe as Java's, because generics are not a C#-specific feature but rather a feature that must work across .NET, which supports multiple languages. Not all of those languages support generics, so there must be a kind of partial erasure to allow for that lack of support. Too bad that they couldn't have gone the other way and said that all .NET languages needed some kind of generic support.
A nice paper on co-/contravariance is Castagna's "Covariance and Contravariance: Conflict without a Cause" (1994), which is on CiteSeer.
The essential need for co-/contravariance comes from the Liskov substitution principle, once we also want to change the signatures of methods in descendant classes. It's easiest to demonstrate with an example:
1. we have a (parameter) hierarchy P0 <- P1 <- P2 (i.e., P1 inherits from P0, and so on)
2. we have a (result) hierarchy R0 <- R1 <- R2
and we have a class
class A0: R0 some_function(P1 arg)
and we have, for example, a list of A0s. It can contain references to any descendant of A0, and we want to be able to statically verify the type correctness of an expression like:
A0[] as
for a in as:
    x = a.some_function(p)
where x is of type R0 and p is of type P1 (but again, we can pass an instance of any descendant of P1 in its place).
Is A1 then a descendant of A0 (assuming parameter types are allowed to change covariantly)?
class A1(A0): R1 some_function(P2)
No, because the Liskov substitution principle is violated: if some_function uses P2-specific code and such an instance is put in the as list, where I can pass an instance of P1 as its argument, I can get a "message not understood" error, since P1 is not required to understand the methods of P2. Hence a class A1 whose function arguments change covariantly cannot be considered a descendant of A0.
On the other hand, if contra-variant parameter type change is allowed, everything works:
class A1(A0): R1 some_function(P0)
Now some_function can only use the interface of P0, which means that any descendant of P1 can safely be passed to it and everything will work.
Of course, similar reasoning shows that the result type must change covariantly, for the same reasons.
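The argument above can be sketched in Java, with some assumptions on my part. Java does not let an override widen its parameter type (that would be an overload), so this sketch models some_function as a java.util.function.Function value (Java 8 or later) and uses wildcards to say "takes at least a P1, returns at most an R0". The class names mirror the hierarchies above; name(), onlyInP2() and call() are invented for the illustration.

```java
import java.util.function.Function;

class VarianceDemo {
    static class P0 { String name() { return "P0"; } }
    static class P1 extends P0 { }
    static class P2 extends P1 { String onlyInP2() { return "P2-specific"; } }

    static class R0 { }
    static class R1 extends R0 { }

    // The caller's view: it hands over a P1 and expects an R0 back.
    // ? super P1 is the contravariant parameter, ? extends R0 the covariant result.
    static R0 call(Function<? super P1, ? extends R0> f, P1 arg) {
        return f.apply(arg);
    }

    public static void main(String[] args) {
        // Parameter widened to P0, result narrowed to R1: accepted.
        Function<P0, R1> a1 = p -> { p.name(); return new R1(); };
        R0 x = call(a1, new P1());
        System.out.println(x instanceof R1); // prints true

        // Parameter narrowed to P2: rejected at compile time, because the body
        // could call onlyInP2() on an argument that is only a P1.
        // Function<P2, R1> bad = p -> { p.onlyInP2(); return new R1(); };
        // call(bad, new P1()); // does not compile
    }
}
```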
The problem with forking the standard libraries is 3rd party code. If you make extensive use of 3rd party libraries, it would be difficult to use them with generic code until the libraries were available in generic form. This would slow down the adoption of generics and, in a nasty feedback effect, also slow down the rate at which 3rd party libraries are converted to generics. Even without 3rd party libraries, it means you have to convert all of your own code that is touched by generics in one go. You can't readily break it up. Forking the libraries would make the adoption of generics a much more disruptive change. Microsoft could get away with a more disruptive change because there wasn't so much existing code.
Fine - they needed erasure to support 3rd party libraries. What is needed then is a new keyword or something that turns erasure off for the 99% of new code that doesn't need to maintain compatibility. It's so ridiculous that I can't instantiate an instance of the parameterized type.
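For anyone who hasn't run into it, this is roughly what the complaint looks like, along with the usual workaround of passing a Class<T> token. It's only a sketch: TokenFactory is a name I made up, and it assumes T has an accessible no-argument constructor.

```java
class TokenFactory<T> {
    private final Class<T> type; // runtime token standing in for the erased T

    TokenFactory(Class<T> type) {
        this.type = type;
    }

    T create() {
        // return new T(); // does not compile: T is unknown at runtime after erasure
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no accessible no-arg constructor for " + type, e);
        }
    }

    public static void main(String[] args) {
        TokenFactory<StringBuilder> factory = new TokenFactory<StringBuilder>(StringBuilder.class);
        StringBuilder sb = factory.create();
        sb.append("created via Class token");
        System.out.println(sb); // prints created via Class token
    }
}
```

The cost is that every caller has to pass the class along by hand, and a missing constructor only shows up at runtime.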
Take a look at the NextGen compiler; it does what you want. You can use it instead of the Sun compiler for generic code that doesn't need backward compatibility. It is based on the Sun compiler and introduces a new keyword.
Presumably, when everyone is using generics, Sun can change to the NextGen system. It was originally proposed by Guy Steele at Sun.