Summary
Ken Arnold, the original lead architect of JavaSpaces, talks with Bill Venners about whether to prohibit subclassing, whether to use Cloneable or copy constructors, and when to use marker interfaces.
Ken Arnold has done a lot of design in his day. While at Sun Microsystems, Arnold
was one of the original architects of Jini technology and the original lead architect of
JavaSpaces. Prior to joining Sun, Arnold participated in the original Hewlett-Packard
architectural team that designed CORBA. While at University of California, Berkeley, he
created the Curses library for terminal-independent, screen-oriented programs. In Part I of this interview, which has been published in six weekly
installments, Arnold explains why there's no such thing as a perfect design, suggests
questions you should ask yourself when you design, and proposes the radical notion that
programmers are people. In Part II, Arnold discusses the role
of taste and arrogance in design, the value of other people's problems, and the virtue of
simplicity. In Part III, Arnold addresses the concerns of
distributed systems design, including the need to expect failure, avoid state, and plan for
recovery. In Part IV, Arnold describes the basic idea of a
JavaSpace, explains why fields in entries are public, why entries are passive, and how
decoupling leads to reliability. In Part V, Arnold discusses
the data-driven nature of JavaSpaces, how JavaSpaces lets you "throw in a grain and
watch it grow," and why iteration isn't supported in the JavaSpace
interface. In this sixth and final installment, Arnold discusses whether to prohibit
subclassing, whether to use Cloneable or copy constructors, and when to
use marker interfaces.
Bill Venners: The book The Java Programming Language, which you coauthored with James Gosling, contains this paragraph:
Marking a method or class final is a serious restriction on the use of the class. If you make a method final, you should really intend that its behavior be completely fixed. You restrict the flexibility of your class for other programmers who might want to use it as a basis to add functionality to their code. Marking an entire class final prevents anyone else from extending your class, limiting its usefulness to others. If you make anything final, be sure you want to create these restrictions.
By contrast, in his book Effective Java, Josh Bloch suggests we
should design and document classes for inheritance or prohibit inheritance, either by
providing no accessible constructor or marking the class final. What is
your opinion?
Ken Arnold: My take is rather different. Making something
final is a radical thing to do. Marking a method final
asserts that changing the method in a subclass is improper. Marking the whole class
final asserts that even having a subclass is improper, because you made it
impossible.
For example, an instance that many people bump into in Java is the
String class. String is final. I know many
people who would have no problem if all of String's methods were
final, but they want to add some methods in a subclass that are relevant to
manipulating Strings in their environment. However, they can't extend the
notion of String in that way because String is final, and
that frustrates them. I think that is a fair frustration.
You should mark things final only if you have a good reason. If you
mark three classes final a year, you might be doing too much. For the
most part, you don't know enough about the future to know if proper and legitimate
subclasses exist.
Now you may more commonly mark methods final. In
String's case, you may not want certain methods overridden under any
circumstances because, let's say, security code depends on String
comparison to operate in a certain way. You want the equals method to
always mean exactly that; therefore, you make that method final so it
can't be overridden and changed. Marking methods final is still rare, but
more likely than marking entire classes final.
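As an illustration of the distinction Arnold draws here, the following minimal sketch fixes one method with final while leaving the class open to extension. The class and method names (Token, ExpiringToken, matches, describe) are hypothetical and chosen only for this example.

```java
// Hypothetical example: these names do not come from any real library.
class Token {
    private final String value;

    Token(String value) {
        this.value = value;
    }

    // final: other code relies on this comparison, so subclasses must not change it.
    public final boolean matches(String candidate) {
        return value.equals(candidate);
    }

    // Not final: subclasses may refine how the token is described.
    public String describe() {
        return "Token(" + value.length() + " chars)";
    }
}

// The class itself is not final, so it can still gain new methods in a subclass.
class ExpiringToken extends Token {
    private final long expiresAtMillis;

    ExpiringToken(String value, long expiresAtMillis) {
        super(value);
        this.expiresAtMillis = expiresAtMillis;
    }

    public boolean isExpired(long nowMillis) {
        return nowMillis >= expiresAtMillis;
    }

    @Override
    public String describe() {
        return super.describe() + ", expires at " + expiresAtMillis;
    }
}
```

Attempting to override matches in ExpiringToken would be rejected by the compiler, while adding isExpired and overriding describe remain legal.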
On the other hand, many classes in the Java language core, especially among the older classes, have all sorts of protected data members. But nobody sat down and asked: If I were to subclass this, what would I need to do? My personal feeling is that the only difference between protected and public is a question of target audience. Protected is a variant of public, because anybody can access a protected member if he or she wants to, without much difficulty.
Something protected is designed for a certain set of people, for whom it is public. So I never make protected data members, for the same reason I never make public data members. The same rule applies: You don't want to expose your internals. You ask yourself: What does a subclasser need special access to? You wouldn't give that special access to other people, but subclassers should have it. And you want to design protected members carefully, because just like any public exposure, protected exposure limits you in the future. It is a contract. You can't change it tomorrow, because people won't accept it; in principle, you are wrong to change it. So I am against making things protected just because subclasses might want them. You should design your protected interface as carefully as you design your public interface. You ask the same questions: If I am a subclasser, what kinds of things will I want to affect? What kinds of things shouldn't I touch?
You can assume that people creating subclasses will invest more in understanding the
superclass than those just using the public interface. So you can expose sharp edges in the
protected interface you wouldn't want to expose in a public interface -- but you have to
design it. You should not mark anything protected unless you design the system to be
subclassed. But I wouldn't say everything should be final, unless you
know it can be subclassed. That is like saying the only things you can do are the things
we tell you you can do. How do you know that much about the world?
Where did you get the ability to predict other people's needs?
Bill Venners: In Effective Java, Bloch offered a
good example in which the addAll method in class
HashSet calls add on itself. A subclass attempts to track
the total number of objects added to the set by counting the objects passed to both
addAll and add, but it doesn't work as expected because
of addAll's self-use. If someone passes three objects to
addAll, the total number of objects increments by six, not three, because
addAll calls add three times.
Bloch suggests that the addAll method should document its use of the
add method. Therefore, designing for inheritance presents the problem: if
I document a method's self-use, then that method must always use itself in that way.
Documenting self-use restricts future implementation changes.
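The double counting described in this question can be reproduced with a short sketch along the following lines. The class name CountingSet is made up for this illustration, and the result depends on how the JDK in use implements addAll; in current OpenJDK releases HashSet inherits an addAll that calls add once per element.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

// Hypothetical subclass that tries to count every element ever added.
class CountingSet<E> extends HashSet<E> {
    private static final long serialVersionUID = 1L;
    private int addCount = 0;

    @Override
    public boolean add(E e) {
        addCount++;
        return super.add(e);
    }

    @Override
    public boolean addAll(Collection<? extends E> c) {
        addCount += c.size();
        // If the inherited addAll calls add() for each element,
        // every element is counted a second time there.
        return super.addAll(c);
    }

    int getAddCount() {
        return addCount;
    }
}

class CountingSetDemo {
    public static void main(String[] args) {
        CountingSet<String> set = new CountingSet<>();
        set.addAll(Arrays.asList("a", "b", "c"));
        System.out.println(set.size());        // 3 elements are stored
        System.out.println(set.getAddCount()); // 6 on JDKs where addAll delegates to add
    }
}
```

The subclass is correct only if it knows whether addAll uses add, which is exactly the self-use that would have to be documented.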
Ken Arnold: You can say the method has the option to do something without saying it will definitely do it, in some cases. But in other cases, you want to say: When I need a certain piece of internal data, I will call this method to get it. You want to document that, because you want subclasses to override the method and return a subtype of what you expect. That method's purpose is enabling subclasses to return subtypes. But, again, that is a matter of protected methods.
You certainly don't want to do things that set up sharp edges for subclassers. Yes, you
can set things up so that if someone overrides something, an unexpected event occurs.
Say you are the designer and implementer of a class, and at some point in the code you
call the method foo. If you are unpleasantly surprised when a subclass
overrides foo and does something unexpected, it seems to me there are
two possibilities.
The first possibility is that the person who subclassed foo didn't obey
foo's contract. Instead, he overrode foo and did something
wrong; in which case, that is the subclasser's problem. He shouldn't override
equals to return true only if the square root of one is the same as the other.
That is just not what the equals method means. You can't prevent people
from making that kind of mistake, so it is not your fault.
The other possibility is that your use of foo isn't properly defined in
its contract. You might be using foo in a way that relies on something
about your internal details that you don't describe, so other people interpret it incorrectly.
In principle, I can see that kind of problem happening. But in either case, one party or the
other has done something wrong.
This will also happen when methods are called during the construction phase. Inside the constructor, you are in a state where various things haven't been initialized. So you probably have to note which methods the class calls on itself during construction, because those methods have to be prepared for, say, a list not yet having been created, rather than assuming it was created.
Bill Venners: Or as you said, saying that these methods may be called by the constructor, so subclassers need to make sure calling the methods during construction will work.
Ken Arnold: Or the constructor needs to make sure that the normal preconditions for calling are, in fact, set up. Although you can't do that with the subclass's own data, which is where things get really complex.
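A minimal sketch of the construction-time problem discussed above, using hypothetical classes Base and Logging: the superclass constructor calls an overridable method before the subclass's own fields have been initialized, so the override has to cope with a list that does not exist yet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical superclass that uses one of its own methods during construction.
class Base {
    Base() {
        // Self-use during construction: a subclass override runs here,
        // before the subclass's field initializers have executed.
        register("created");
    }

    protected void register(String event) {
        // Default implementation ignores events.
    }
}

class Logging extends Base {
    // Assigned only after Base() has returned.
    private List<String> events = new ArrayList<>();
    // No initializer, so a value set during Base() survives.
    private int dropped;

    @Override
    protected void register(String event) {
        if (events == null) {
            // Called from Base's constructor: the list has not been created yet.
            dropped++;
            return;
        }
        events.add(event);
    }

    int eventCount() {
        return events.size();
    }

    int droppedCount() {
        return dropped;
    }
}
```

Without the null check, constructing a Logging object would fail with a NullPointerException inside register, even though the field appears to be initialized where it is declared.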
clone Dilemma
Bill Venners: In The Java Programming
Language, you discuss clone and suggest how to use it. You have
said to me previously that you think clone is broken.
Ken Arnold: I think clone is a real mess.
Bill Venners: But given clone's current state,
how would you recommend people use it? What do you think of Bloch's suggestion in
Effective Java to create copy constructors rather than using the
clone approach?
Ken Arnold: The problem with a copy constructor is that you have
to know the type of thing you're cloning or copying. You have to make a
Foo given an old Foo that is passed into the copy
constructor. But if the object the client holds is not a Foo, but is a
Foo subtype, then you could end up doing type truncation, which you
don't want to do.
Bill Venners: What do you mean by type truncation?
Ken Arnold: Suppose you have a Foo, which has a
subclass Bar. If you ask Foo to clone itself and you have a
Bar object, Foo will do all the Bar stuff.
But if you use this copy constructor mechanism and you tell Foo to create
a new Foo and pass in the old Bar, you will get a new
Foo instead of a Bar. You lose the bottom of the type.
Bill Venners: You're fubar.
Ken Arnold: You're definitely fubar. So using a copy constructor implies a mechanism where you ask the passed object its type. You get its class object, and invoke its copy constructor by reflection. When you do that, you are way into ugliness.
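The type truncation Arnold describes can be seen in a small sketch. Foo and Bar follow the names used in the conversation; the fields and the demo class are assumptions added for illustration.

```java
class Foo implements Cloneable {
    protected int id;

    Foo(int id) {
        this.id = id;
    }

    // Copy constructor: always builds exactly a Foo, whatever is passed in.
    Foo(Foo other) {
        this.id = other.id;
    }

    @Override
    public Foo clone() {
        try {
            // Object.clone() creates an object of the same runtime class.
            return (Foo) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen here: Foo implements Cloneable.
            throw new AssertionError(e);
        }
    }
}

class Bar extends Foo {
    String label;

    Bar(int id, String label) {
        super(id);
        this.label = label;
    }
}

class TruncationDemo {
    public static void main(String[] args) {
        Foo original = new Bar(1, "extra");

        Foo viaConstructor = new Foo(original); // the Bar part is lost
        Foo viaClone = original.clone();        // the Bar part is kept

        System.out.println(viaConstructor.getClass().getSimpleName()); // Foo
        System.out.println(viaClone.getClass().getSimpleName());       // Bar
    }
}
```

The caller holding only a Foo reference has no way to pick the right copy constructor without asking the object for its class, which is the reflection step Arnold calls ugly.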
If I were to be God at this point, and many people are probably glad I am not, I would
say deprecate Cloneable and have a Copyable, because
Cloneable has problems. Besides the fact that it's misspelled,
Cloneable doesn't contain the clone method. That means
you can't test if something is an instance of Cloneable, cast it to
Cloneable, and invoke clone. You have to use reflection
again, which is awful. That is only one problem, but one I'd certainly solve.
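What Arnold's Copyable might look like is not specified in the conversation, so the following is only a guess at the idea: an interface that actually declares the copying method, so that a caller can test for it, cast to it, and call it without reflection. Copyable, copy, Point, and CopyUtil are all hypothetical names; none of them exists in the JDK.

```java
// Hypothetical interface: unlike Cloneable, it declares the method it promises.
interface Copyable<T> {
    T copy();
}

class Point implements Copyable<Point> {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public Point copy() {
        return new Point(x, y);
    }
}

class CopyUtil {
    // Works for any Copyable: test, cast, invoke. No reflection needed.
    static Object copyIfPossible(Object o) {
        if (o instanceof Copyable) {
            return ((Copyable<?>) o).copy();
        }
        return o; // not copyable: hand back the same reference
    }

    // The equivalent with Cloneable does not compile, because the
    // interface declares no public clone method:
    //     if (o instanceof Cloneable) return ((Cloneable) o).clone();
}
```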
Bill Venners: So I'm writing a class today and I want to make it easy to clone or copy. What do I do?
Ken Arnold: If you are writing a class today, you have only a few
reasonable interaction patterns. One is to go all the way. You implement the
clone method; you declare it public, without throwing a
CloneNotSupportedException if possible; and you implement
clone. I say "if possible" because you may have to be able to throw
CloneNotSupportedException. The clone method of the
container classes, for example, needs to be able to throw
CloneNotSupportedException, because although the container might be
Cloneable, the objects it references might not be.
There is also a question of deep versus shallow copies. Do you want to copy the
contents, or just the collection that refers to the same underlying objects? A shallow copy
would give you two collections, each of which refers to the same underlying objects,
rather than referring to new underlying objects. If your clone is deep, you might still have
to allow CloneNotSupportedException in your clone
method.
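A sketch of the "go all the way" option, together with the shallow versus deep distinction, might look as follows. The Playlist class and its methods are invented for this example; the element type is StringBuilder simply because it is mutable, which makes sharing visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container used only to illustrate shallow and deep copies.
class Playlist implements Cloneable {
    private List<StringBuilder> titles = new ArrayList<>();

    void add(String title) {
        titles.add(new StringBuilder(title));
    }

    StringBuilder get(int index) {
        return titles.get(index);
    }

    int size() {
        return titles.size();
    }

    // Public, and declared without CloneNotSupportedException.
    // Shallow with respect to the elements: a new list, the same title objects.
    @Override
    public Playlist clone() {
        try {
            Playlist copy = (Playlist) super.clone();
            copy.titles = new ArrayList<>(titles);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen here: Playlist implements Cloneable.
            throw new AssertionError(e);
        }
    }

    // Deep copy: a new list that holds new title objects.
    public Playlist deepCopy() {
        Playlist copy = clone();
        for (int i = 0; i < copy.titles.size(); i++) {
            copy.titles.set(i, new StringBuilder(copy.titles.get(i)));
        }
        return copy;
    }
}
```

Here the deep copy can avoid the exception because the element type is known. A general-purpose container that deep-copies arbitrary elements cannot make that promise, which is the case Arnold has in mind.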
Another approach, which I don't prefer, is to pretend clone and
Cloneable don't exist. However, you can't do that in industrial-strength
classes. In industrial-strength classes, you have three choices. You can make clone
public, without throwing the exception if you can avoid it. You can implement
clone so that it will work if invoked, but let each subclass decide if it
should be public. Or you can override clone with something that throws
CloneNotSupportedException, in effect stating that you cannot clone this
even in subclasses. The problem with ignoring clone is that nothing can
stop a subclass from making it public and indirectly invoking your clone
method. Then, if your clone method is inherited from
Object -- in other words, if you have never written one -- and
Object's clone implementation does the wrong thing, the
user will have a corrupt class. It is better to notify users of that with
CloneNotSupportedException. So in industrial-strength classes, I would
pick one of these approaches. There is also this weird, almost surreal, approach where
you can implement Cloneable and not make your clone
method public, but I think you should never do that.
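The two less common choices Arnold lists can be sketched roughly as below: a class whose clone works if invoked but is left protected for subclasses to publish, and a class whose clone always throws. All class names here (Node, PublicNode, Session, CloneableSession) are hypothetical.

```java
// Choice: clone works if invoked, but each subclass decides whether to expose it.
// Node itself does not implement Cloneable.
class Node {
    private int[] data = {1, 2, 3};

    int[] data() {
        return data;
    }

    @Override
    protected Node clone() throws CloneNotSupportedException {
        Node copy = (Node) super.clone(); // succeeds only if the runtime class is Cloneable
        copy.data = data.clone();         // copy Node's own state correctly
        return copy;
    }
}

// A subclass that opts in: implements Cloneable and makes clone public.
class PublicNode extends Node implements Cloneable {
    @Override
    public PublicNode clone() {
        try {
            return (PublicNode) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: PublicNode is Cloneable
        }
    }
}

// Choice: state that this class cannot be cloned, even through a subclass.
class Session {
    @Override
    protected Object clone() throws CloneNotSupportedException {
        throw new CloneNotSupportedException("Session cannot be cloned");
    }
}

// A subclass that tries anyway is told clearly that cloning is unsupported.
class CloneableSession extends Session implements Cloneable {
    Object duplicate() throws CloneNotSupportedException {
        return clone(); // reaches Session.clone(), which throws
    }
}
```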
Bill Venners: So what about copy constructors? Do they have a time and place?
Ken Arnold: I am not fond of copy constructors. In fact, I'm not very fond of constructors at all. The problem is that the code that creates the object with a constructor is defining the object's type. In all other operations, the code that uses an object effectively only defines that the object is at least a certain type. Constructors are an exception to that rule. I don't think that exception should exist.
You can also think of it this way: new Foo should turn into an
invocation of a static method that might create a subclass. The static method could look at
the parameters and say: I will create a Foo subclass that is efficient for
these kinds of parameters. At the point where you want a Foo, there are
myriad reasons why the implementer of Foo may know you want a
subclass, but you don't know it. And maybe you shouldn't know it, because next week a
different set of decisions might make sense. The actual class of object that gets created is
an implementation detail. You need something that is at least a Foo. You
should go to the Foo class and say: I need something that is at least what
you are, and here are the initialization parameters. Other languages will do that; of all
things, Perl objects do that. I think this is a better solution. By calling a copy constructor
on Foo, you are asking the Foo class to get a copy of this
object that is at least a Foo, which is not the Foo class's
business.
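Java has no language-level replacement for new of the kind Arnold wishes for, but the idea can be approximated with a static factory method. The sketch below uses invented names (IntBag, SmallIntBag, HashedIntBag) and an arbitrary size threshold; it is one possible shape, not a prescribed design.

```java
import java.util.HashSet;
import java.util.Set;

abstract class IntBag {
    // Static factory: the caller asks for something that is at least an IntBag.
    // Which subclass is returned is an implementation detail that may change.
    static IntBag of(int... values) {
        return values.length <= 4 ? new SmallIntBag(values) : new HashedIntBag(values);
    }

    abstract boolean contains(int value);
}

// Chosen for a handful of values: a linear scan over a small array.
class SmallIntBag extends IntBag {
    private final int[] values;

    SmallIntBag(int[] values) {
        this.values = values.clone();
    }

    @Override
    boolean contains(int value) {
        for (int v : values) {
            if (v == value) {
                return true;
            }
        }
        return false;
    }
}

// Chosen for larger inputs: hash-based lookup.
class HashedIntBag extends IntBag {
    private final Set<Integer> values = new HashSet<>();

    HashedIntBag(int[] values) {
        for (int v : values) {
            this.values.add(v);
        }
    }

    @Override
    boolean contains(int value) {
        return values.contains(value);
    }
}
```

A caller writes IntBag.of(1, 2, 3) and never names a concrete class, so the implementer is free to change the threshold or add further subclasses later.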
Bill Venners: When are marker interfaces -- interfaces that lack methods -- appropriate?
Ken Arnold: I have this weird deep object view of that whole thing. I mentioned the term contract a couple times in this discussion. To me, object design is a design of contracts. Contracts define what you are allowed to rely on. Contracts are not expressible in anything less than a human language. You need to have someone sit down and say, if this is true then that happens. You have to spell out all the details.
Java has types as well as methods. There are contracts for entire types and contracts
for methods. If you look at a method contract, you'll see that not everything can be
spelled out in the programming language. Some things expressed in the contract must be
expressed in human language. String's equalsIgnoreCase is
a method, but nothing in the programming language enforces that case is ignored; that
requirement is only expressed in the human language text. The contract for the whole
class or interface, then, says if you implement this interface, you will have, in general, the
following kinds of behaviors.
To me, a marker interface isn't wrong. It is a degenerate case in the sense that it has no methods, so the whole contract is associated with the class and says what things of that type do. As a degenerate case, a marker interface probably shouldn't occur often. But on the other hand, there are times when a marker interface is a legitimate way to express that if you implement this interface, then you have certain relationships.
Serializable might be a good example.
None of the methods related to serialization belong as methods in the Serializable interface itself.
For example, readObject should not be a public method in the
Serializable interface. It is an implementation detail. At the same time, Serializable
is a behavioral contract that says you expect your state to be shipped out in little bits and then
those bits reconstructed into another copy of you. Serializable says that you have written the associated
code to make sure serialization works right. So now I know I can use the object in a
certain way--I can serialize and deserialize it. That is a real contract about an object that
has a Serializable marker interface. I do think marker interfaces are rarely
correct. But I don't think they are incorrect in principle.
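To make the Serializable contract concrete, here is a minimal round-trip sketch. Account and MarkerDemo are hypothetical names; the stream classes are the standard java.io ones. The marker interface contributes no methods, yet ObjectOutputStream refuses objects that lack it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Implementing the marker adds no methods; it only makes a promise.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    final String owner;
    final int balance;

    Account(String owner, int balance) {
        this.owner = owner;
        this.balance = balance;
    }
}

class MarkerDemo {
    // Turns an object's state into bytes. Throws NotSerializableException
    // (a kind of IOException) if the object lacks the marker.
    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(o);
        }
        return buffer.toByteArray();
    }

    // Rebuilds a copy of the object from those bytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }
}
```

Passing an Account through toBytes and fromBytes yields an equivalent copy, while passing a plain new Object() to toBytes fails, because nothing has promised that its state can be shipped.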
I know people who are profligate with marker interfaces. They say: If the object you pass in has this marker interface, I will treat it in the following way. But I think that down that path lies darkness. Nobody knows the behavior of objects when they get their 17th marker interface. What if the object implements this marker interface and not this one, or this one and that one but not that other one? You'll send an object to me and I'll invoke a method, but because of some odd marker interface behavior, it won't do what I expect.
Nevertheless, contracts do exist where everything is outside the programming
language. In those situations, you still want to let people know that the object behaves in
a certain way. Marker interfaces let you do that.
Resources
Perfection and Simplicity, A Conversation with Ken Arnold, Part I:
http://www.artima.com/intv/perfect.html
Taste and Aesthetics, A Conversation with Ken Arnold, Part II:
http://www.artima.com/intv/taste.html
Designing Distributed Systems, A Conversation with Ken Arnold, Part III:
http://www.artima.com/intv/distrib.html
Sway with JavaSpaces, A Conversation with Ken Arnold, Part IV:
http://www.artima.com/intv/sway.html
JavaSpaces: Data, Decoupling, and Iteration, A Conversation with Ken Arnold, Part V:
http://www.artima.com/intv/decouple.html
You can obtain information about Linda from here:
http://www.cs.yale.edu/Linda/linda.html
Ken Arnold first mentioned idempotency in Part III of this interview:
http://www.artima.com/intv/distrib.html
JavaSpaces: Principles, Patterns, and Practice
by Eric Freeman, Susanne Hupfer, and Ken Arnold,
the book from which Bill Venners reads quotes in this article,
is at Amazon.com at:
http://www.amazon.com/exec/obidos/ASIN/0201309556/
The Jini Community, the central site for signers of the Jini Sun Community Source License to interact:
http://www.jini.org
Download JavaSpaces from:
http://java.sun.com/products/javaspaces/
Design objects for people, not for computers:
http://www.artima.com/apidesign/object.html
Make Room for JavaSpaces, Part I - An introduction to JavaSpaces, a simple and powerful distributed programming tool:
http://www.artima.com/jini/jiniology/js1.html
Make Room for JavaSpaces, Part II - Build a compute server with JavaSpaces, Jini's coordination service:
http://www.artima.com/jini/jiniology/js2.html
Make Room for JavaSpaces, Part III - Coordinate your Jini applications with JavaSpaces:
http://www.artima.com/jini/jiniology/js3.html
Make Room for JavaSpaces, Part IV - Explore Jini transactions with JavaSpaces:
http://www.artima.com/jini/jiniology/js4.html
Make Room for JavaSpaces, Part V - Make your compute server robust and scalable with Jini and JavaSpaces:
http://www.artima.com/jini/jiniology/js5.html
Make Room for JavaSpaces, Part VI - Build and use distributed data structures in your JavaSpaces programs:
http://www.artima.com/jini/jiniology/js6.html