Java Design Issues

A Conversation with Ken Arnold, Part VI

by Bill Venners

October 14, 2002

Summary

Ken Arnold, the original lead architect of JavaSpaces, talks with Bill Venners about whether to prohibit subclassing, whether to use Cloneable or copy constructors, and when to use marker interfaces.

Ken Arnold has done a lot of design in his day. While at Sun Microsystems, Arnold was one of the original architects of Jini technology and the original lead architect of JavaSpaces. Prior to joining Sun, Arnold participated in the original Hewlett-Packard architectural team that designed CORBA. While at University of California, Berkeley, he created the Curses library for terminal-independent, screen-oriented programs. In Part I of this interview, which has been published in six weekly installments, Arnold explains why there's no such thing as a perfect design, suggests questions you should ask yourself when you design, and proposes the radical notion that programmers are people. In Part II, Arnold discusses the role of taste and arrogance in design, the value of other people's problems, and the virtue of simplicity. In Part III, Arnold addresses the concerns of distributed systems design, including the need to expect failure, avoid state, and plan for recovery. In Part IV, Arnold describes the basic idea of a JavaSpace, explains why fields in entries are public, why entries are passive, and how decoupling leads to reliability. In Part V, Arnold discusses the data-driven nature of JavaSpaces, how JavaSpaces lets you "throw in a grain and watch it grow," and why iteration isn't supported in the JavaSpace interface. In this sixth and final installment, Arnold discusses whether to prohibit subclassing, whether to use Cloneable or copy constructors, and when to use marker interfaces.

Prohibiting or Designing for Inheritance

Bill Venners: The book The Java Programming Language, which you coauthored with James Gosling, contains this paragraph:

Marking a method or class final is a serious restriction on the use of the class. If you make a method final, you should really intend that its behavior be completely fixed. You restrict the flexibility of your class for other programmers who might want to use it as a basis to add functionality to their code. Marking an entire class final prevents anyone else from extending your class, limiting its usefulness to others. If you make anything final, be sure you want to create these restrictions.

By contrast, in his book Effective Java, Josh Bloch suggests we should design and document classes for inheritance or prohibit inheritance, either by providing no accessible constructor or marking the class final. What is your opinion?

Ken Arnold: My take is rather different. Making something final is a radical thing to do. Marking a method final asserts that changing the method in a subclass is improper. Marking the whole class final asserts that even having a subclass is improper, because you made it impossible.

One example that many people bump into in Java is the String class. String is final. I know many people who would have no problem if all of String's methods were final, but they want to add some methods in a subclass that are relevant to manipulating Strings in their environment. However, they can't extend the notion of String in that way because String is final, and that frustrates them. I think that is a fair frustration.

You should mark things final only if you have a good reason. If you mark three classes final a year, you might be doing too much. For the most part, you don't know enough about the future to know if proper and legitimate subclasses exist.

You may, however, more commonly mark methods final. In String's case, you may not want certain methods overridden under any circumstances because, let's say, security code depends on String comparison to operate in a certain way. You want the equals method to always mean exactly that; therefore, you make that method final so it can't be overridden and changed. Marking methods final is still rare, but more likely than marking entire classes final.
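
The distinction between a final method and a final class can be sketched as follows. This is only an illustration under stated assumptions: AccountId and MaskedAccountId are hypothetical names that do not appear in the interview. The class stays open for extension, while equals (and hashCode, to stay consistent with it) is fixed.

```java
// Hypothetical class: open for subclassing, but equality is pinned down.
class AccountId {
    private final String value;

    AccountId(String value) {
        this.value = value;
    }

    String value() {
        return value;
    }

    // final: code that relies on this comparison can trust it in every subclass.
    @Override
    public final boolean equals(Object other) {
        return other instanceof AccountId
                && value.equals(((AccountId) other).value);
    }

    // Also final, so it always agrees with equals.
    @Override
    public final int hashCode() {
        return value.hashCode();
    }
}

// A subclass can still add behavior that is relevant in its own environment.
class MaskedAccountId extends AccountId {
    MaskedAccountId(String value) {
        super(value);
    }

    // Hides everything except the last four characters.
    String masked() {
        String v = value();
        int keep = Math.min(4, v.length());
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < v.length() - keep; i++) {
            out.append('*');
        }
        return out.append(v.substring(v.length() - keep)).toString();
    }
}
```

A MaskedAccountId still compares equal to an AccountId with the same value, because the subclass has no way to redefine equals.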

On the other hand, many classes in the Java language core, especially among the older classes, have all sorts of protected data members. But nobody sat down and asked: If I were to subclass this, what would I need to do? My personal feeling is that the only difference between protected and public is a question of target audience. Protected is a variant of public, because anybody can access a protected member if he or she wants to, without much difficulty.

Something protected is designed for a certain set of people, for whom it is public. So I never make protected data members, for the same reason I never make public data members. The same rule applies: You don't want to expose your internals. You ask yourself: What does a subclasser need special access to? You wouldn't give that special access to other people, but subclassers should have it. And you want to design protected members carefully, because just like any public exposure, protected exposure limits you in the future. It is a contract. You can't change it tomorrow, because people won't accept it; in principle, you are wrong to change it. So I am against making things protected just because subclasses might want them. You should design your protected interface as carefully as you design your public interface. You ask the same questions: If I am a subclasser, what kinds of things will I want to affect? What kinds of things shouldn't I touch?

You can assume that people creating subclasses will invest more in understanding the superclass than those just using the public interface. So you can expose sharp edges in the protected interface you wouldn't want to expose in a public interface -- but you have to design it. You should not mark anything protected unless you design the system to be subclassed. But I wouldn't say everything should be final, unless you know it can be subclassed. That is like saying the only things you can do are the things we tell you you can do. How do you know that much about the world? Where did you get the ability to predict other people's needs?
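
A minimal sketch of a protected interface that is designed rather than exposed by default might look like this. EventLog, UppercaseEventLog, and the two hook methods are hypothetical names chosen for illustration. The internal list stays private; subclasses get two documented hooks and nothing else.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical superclass designed for subclassing.
class EventLog {
    // Internal representation: private, even from subclasses.
    private final List<String> entries = new ArrayList<>();

    // Public contract: stores format(message) if accept(message) returns true.
    public void record(String message) {
        if (accept(message)) {
            entries.add(format(message));
        }
    }

    // Read-only view, so callers cannot modify the internal list.
    public List<String> entries() {
        return Collections.unmodifiableList(entries);
    }

    // Protected hook: decides whether a message is stored.
    protected boolean accept(String message) {
        return message != null;
    }

    // Protected hook: decides how an accepted message is stored. Must not return null.
    protected String format(String message) {
        return message;
    }
}

// A subclasser changes only what the protected contract allows.
class UppercaseEventLog extends EventLog {
    @Override
    protected String format(String message) {
        return message.toUpperCase(Locale.ROOT);
    }
}
```

Because the list itself is never exposed, the superclass remains free to change its storage later without breaking subclasses.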

Bill Venners: In Effective Java, Bloch offered a good example in which the addAll method in class HashSet calls add on itself. A subclass attempts to track the total number of objects added to the set by counting the objects passed to both addAll and add, but it doesn't work as expected because of addAll's self-use. If someone passes three objects to addAll, the total number of objects increments by six, not three, because addAll calls add three times.

Bloch suggests that the addAll method should document its use of the add method. Therefore, designing for inheritance presents the problem: if I document a method's self-use, then that method must always use itself in that way. Documenting self-use restricts future implementation changes.
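
The double counting described above can be reproduced with a short sketch. CountingHashSet is a hypothetical name; the behavior depends on HashSet inheriting an addAll that adds each element by calling add, which is how the JDK documents AbstractCollection.addAll.

```java
import java.util.Collection;
import java.util.HashSet;

// Hypothetical subclass that tries to count every element added.
class CountingHashSet<E> extends HashSet<E> {
    private static final long serialVersionUID = 1L;

    private int addCount = 0;

    @Override
    public boolean add(E element) {
        addCount++;
        return super.add(element);
    }

    @Override
    public boolean addAll(Collection<? extends E> elements) {
        addCount += elements.size();
        // The inherited addAll calls add() once per element,
        // so each element is counted a second time there.
        return super.addAll(elements);
    }

    int getAddCount() {
        return addCount;
    }
}
```

Passing three distinct elements to addAll leaves the set with three elements but an add count of six.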

Ken Arnold: You can say the method has the option to do something without saying it will definitely do it, in some cases. But in other cases, you want to say: When I need a certain piece of internal data, I will call this method to get it. You want to document that, because you want subclasses to override the method and return a subtype of what you expect. That method's purpose is enabling subclasses to return subtypes. But, again, that is a matter of protected methods.
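
One way to picture a documented self-use whose purpose is to let subclasses return subtypes is the sketch below. WordCollector, SortedWordCollector, and newBucket are hypothetical names, not anything from the interview. The superclass promises that collect obtains its collection from newBucket, and a subclass overrides that hook with a more specific return type.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical superclass with a documented hook.
class WordCollector {
    // Documented self-use: collect() calls newBucket() exactly once
    // per invocation and fills the collection it returns.
    public Collection<String> collect(String text) {
        Collection<String> bucket = newBucket();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                bucket.add(word);
            }
        }
        return bucket;
    }

    // Hook for subclasses: supplies the collection that collect() fills.
    protected Collection<String> newBucket() {
        return new ArrayList<>();
    }
}

// The override returns a subtype, which changes ordering and uniqueness.
class SortedWordCollector extends WordCollector {
    @Override
    protected TreeSet<String> newBucket() {
        return new TreeSet<>();
    }
}
```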

You certainly don't want to do things that set up sharp edges for subclassers. Yes, you can set things up so that if someone overrides something, an unexpected event occurs. Say you are the designer and implementer of a class, and at some point in the code you call the method foo. If you are unpleasantly surprised when a subclass overrides foo and does something unexpected, it seems to me there are two possibilities.

The first possibility is that the person who overrode foo didn't obey foo's contract and did something wrong, in which case that is the subclasser's problem. He shouldn't override equals to return true only if the square root of one is the same as the other. That is just not what the equals method means. You can't prevent people from making that kind of mistake, so it is not your fault.

The other possibility is that your use of foo isn't properly defined in its contract. You might be using foo in a way that relies on something about your internal details that you don't describe, so other people interpret it incorrectly. In principle, I can see that kind of problem happening. But in either case, one party or the other has done something wrong.

This will also happen when methods are called during the construction phase. Inside the constructor, you are in a state where various things haven't been initialized. So you probably have to note which methods the class calls on itself during construction, because those methods have to be prepared for, say, a list never having been created, as opposed to assuming it was created.

Bill Venners: Or, as you said, you document that these methods may be called by the constructor, so subclassers need to make sure calling the methods during construction will work.

Ken Arnold: Or the constructor needs to make sure that the normal preconditions for calling are, in fact, set up. Although you can't do that with the subclass's own data, which is where things get really complex.
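
The construction-time hazard can be shown in a few lines. Widget, TaggedWidget, and init are hypothetical names. The superclass constructor calls an overridable method before the subclass's field initializers have run, so the override sees a list that has not been created yet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical superclass that calls an overridable method while constructing.
class Widget {
    Widget() {
        // Self-use during construction: subclass fields are still at their defaults.
        init();
    }

    protected void init() {
        // Nothing to do in the base class.
    }
}

class TaggedWidget extends Widget {
    // Assigned only after the Widget constructor has returned.
    private final List<String> tags = new ArrayList<>();

    // Deliberately has no initializer, so the value set in init() survives.
    boolean sawNullTags;

    @Override
    protected void init() {
        // Runs inside Widget's constructor, when tags is still null.
        sawNullTags = (tags == null);
    }

    boolean tagsReady() {
        return tags != null;
    }
}
```

After construction, sawNullTags is true even though tagsReady() also returns true: the list exists by the time the caller sees the object, but it did not exist when init ran.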

The clone Dilemma

Bill Venners: In The Java Programming Language, you discuss clone and suggest how to use it. You have said to me previously that you think clone is broken.

Ken Arnold: I think clone is a real mess.

Bill Venners: But given clone's current state, how would you recommend people use it? What do you think of Bloch's suggestion in Effective Java to create copy constructors rather than using the clone approach?

Ken Arnold: The problem with a copy constructor is that you have to know the type of thing you're cloning or copying. You have to make a Foo given an old Foo that is passed into the copy constructor. But if the object the client holds is not a Foo, but is a Foo subtype, then you could end up doing type truncation, which you don't want to do.

Bill Venners: What do you mean by type truncation?

Ken Arnold: Suppose you have a Foo, which has a subclass Bar. If you ask Foo to clone itself and you have a Bar object, Foo will do all the Bar stuff. But if you use this copy constructor mechanism and you tell Foo to create a new Foo and pass in the old Bar, you will get a new Foo instead of a Bar. You lose the bottom of the type.

Bill Venners: You're fubar.

Ken Arnold: You're definitely fubar. So using a copy constructor implies a mechanism where you ask the passed object its type. You get its class object, and invoke its copy constructor by reflection. When you do that, you are way into ugliness.
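
Type truncation can be seen side by side with clone in this sketch. Foo and Bar follow the names used in the conversation; the fields and accessor methods are assumptions added for illustration.

```java
class Foo implements Cloneable {
    private final String name;

    Foo(String name) {
        this.name = name;
    }

    // Copy constructor: always builds exactly a Foo, whatever it is given.
    Foo(Foo other) {
        this(other.name);
    }

    String name() {
        return name;
    }

    @Override
    public Foo clone() {
        try {
            // Object.clone creates an instance of the object's runtime class.
            return (Foo) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // not expected: Foo implements Cloneable
        }
    }
}

class Bar extends Foo {
    private final int extra;

    Bar(String name, int extra) {
        super(name);
        this.extra = extra;
    }

    int extra() {
        return extra;
    }
}
```

Given a variable of type Foo that actually refers to a Bar, new Foo(original) yields a plain Foo and loses extra, while original.clone() yields a Bar.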

If I were to be God at this point, and many people are probably glad I am not, I would say deprecate Cloneable and have a Copyable, because Cloneable has problems. Besides the fact that it's misspelled, Cloneable doesn't contain the clone method. That means you can't test if something is an instance of Cloneable, cast it to Cloneable, and invoke clone. You have to use reflection again, which is awful. That is only one problem, but one I'd certainly solve.
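
The difference can be sketched as follows. Copyable here is a hypothetical interface written for this example, as are Point and Duplicator; it is not part of the Java platform. With Cloneable alone, generic code has to look up clone by reflection; with an interface that declares the method, a cast is enough.

```java
import java.lang.reflect.Method;

// Hypothetical interface: unlike Cloneable, it declares the method.
interface Copyable<T> {
    T copy();
}

class Point implements Cloneable, Copyable<Point> {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public Point clone() {
        try {
            return (Point) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // not expected: Point implements Cloneable
        }
    }

    @Override
    public Point copy() {
        return clone();
    }
}

class Duplicator {
    // Cloneable declares no methods, so ((Cloneable) o).clone() does not compile.
    static Object viaCloneable(Object o) {
        if (!(o instanceof Cloneable)) {
            throw new IllegalArgumentException("not Cloneable");
        }
        try {
            Method clone = o.getClass().getMethod("clone"); // finds only a public clone()
            return clone.invoke(o);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // With a method-bearing interface, test, cast, and invoke is all it takes.
    static Object viaCopyable(Object o) {
        if (!(o instanceof Copyable)) {
            throw new IllegalArgumentException("not Copyable");
        }
        return ((Copyable<?>) o).copy();
    }
}
```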

Bill Venners: So I'm writing a class today and I want to make it easy to clone or copy. What do I do?

Ken Arnold: If you are writing a class today, you have only a few reasonable interaction patterns. One is to go all the way. You implement Cloneable; you declare clone public, without throwing a CloneNotSupportedException if possible; and you implement clone. I say "if possible" because you may have to be able to throw CloneNotSupportedException. The clone method of the container classes, for example, needs to be able to throw CloneNotSupportedException, because although the container might be Cloneable, the objects it references might not be.

There is also a question of deep versus shallow copies. Do you want to copy the contents, or just the collection that refers to the same underlying objects? A shallow copy would give you two collections, each of which refers to the same underlying objects, rather than referring to new underlying objects. If your clone is deep, you might still have to allow CloneNotSupportedException in your clone method.
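
A minimal sketch of the "go all the way" pattern with a deep copy might look like this. Grid is a hypothetical class. Because its only mutable state is an int array, which can always be cloned, the public clone method does not need to declare CloneNotSupportedException.

```java
// Hypothetical class that supports cloning fully.
class Grid implements Cloneable {
    private int[] cells;

    Grid(int size) {
        cells = new int[size];
    }

    void set(int index, int value) {
        cells[index] = value;
    }

    int get(int index) {
        return cells[index];
    }

    // Public and without a throws clause: callers need no try/catch.
    @Override
    public Grid clone() {
        try {
            Grid copy = (Grid) super.clone(); // shallow so far: both share one array
            copy.cells = cells.clone();       // deep: the copy gets its own array
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // not expected: Grid implements Cloneable
        }
    }
}
```

Without the line that clones cells, the copy would be shallow, and a change made through one Grid would show up in the other.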

Another approach, which I don't prefer, is to pretend clone and Cloneable don't exist. However, you can't do that in industrial-strength classes. In industrial-strength classes, you have three choices. You can make clone public, without throwing the exception if you can avoid it. Or you can implement clone so that it will work if invoked, but let each subclass decide if it should be public. Or you can override clone with something that throws CloneNotSupportedException; in effect, stating you cannot clone this even in subclasses. The problem with ignoring clone is that nothing can stop a subclass from making it public and indirectly invoking your clone method. Then, if your clone method is inherited from Object -- in other words, if you have never written one -- and Object's clone implementation does the wrong thing, the user will have a corrupt object. It is better to notify users of that with CloneNotSupportedException. So in industrial-strength classes, I would pick one of these approaches. There is also this weird, almost surreal, approach where you can implement Cloneable and not make your clone method public, but I think you should never do that.
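
The third option, refusing to be cloned even in subclasses, can be sketched like this. Connection and PooledConnection are hypothetical names. The subclass implements Cloneable and makes clone public, but its call to super.clone() reaches the throwing override and never gets to Object's implementation.

```java
// Hypothetical class whose state must not be duplicated.
class Connection {
    @Override
    protected Object clone() throws CloneNotSupportedException {
        throw new CloneNotSupportedException("Connection cannot be cloned");
    }
}

// A subclass that tries to enable cloning anyway.
class PooledConnection extends Connection implements Cloneable {
    @Override
    public PooledConnection clone() throws CloneNotSupportedException {
        // Delegates to Connection.clone(), which always throws.
        return (PooledConnection) super.clone();
    }
}
```

The caller is told about the problem through the exception instead of receiving a copy that may be corrupt.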

Bill Venners: So what about copy constructors? Do they have a time and place?

Ken Arnold: I am not fond of copy constructors. In fact, I'm not very fond of constructors at all. The problem is that the code that creates the object with a constructor is defining the object's type. In all other operations, the code that uses an object effectively only defines that the object is at least a certain type. Constructors are an exception to that rule. I don't think that exception should exist.

You can also think of it this way: new Foo should turn into an invocation of a static method that might create a subclass. The static method could look at the parameters and say: I will create a Foo subclass that is efficient for these kinds of parameters. At the point where you want a Foo, there are myriad reasons why the implementer of Foo may know you want a subclass, but you don't know it. And maybe you shouldn't know it, because next week a different set of decisions might make sense. The actual class of object that gets created is an implementation detail. You need something that is at least a Foo. You should go to the Foo class and say: I need something that is at least what you are, and here are the initialization parameters. Other languages will do that; of all things, Perl objects do that. I think this is a better solution. By calling a copy constructor on Foo, you are asking the Foo class to get a copy of this object that is at least a Foo, which is not the Foo class's business.
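
The idea of asking a class for "something that is at least a Foo" can be sketched with a static factory method. IntBag, SmallIntBag, HashedIntBag, and the size threshold are all hypothetical, chosen only to show the shape of the technique.

```java
import java.util.HashSet;
import java.util.Set;

// Callers ask for "at least an IntBag"; the concrete class is an implementation detail.
abstract class IntBag {
    static IntBag of(int... values) {
        // The implementer, not the caller, picks a subclass suited to the parameters.
        return values.length <= SmallIntBag.LIMIT
                ? new SmallIntBag(values)
                : new HashedIntBag(values);
    }

    abstract boolean contains(int value);
}

// Linear scan over a small array.
class SmallIntBag extends IntBag {
    static final int LIMIT = 8;

    private final int[] values;

    SmallIntBag(int[] values) {
        this.values = values.clone();
    }

    @Override
    boolean contains(int value) {
        for (int v : values) {
            if (v == value) {
                return true;
            }
        }
        return false;
    }
}

// Hash lookup for larger inputs.
class HashedIntBag extends IntBag {
    private final Set<Integer> values = new HashSet<>();

    HashedIntBag(int[] values) {
        for (int v : values) {
            this.values.add(v);
        }
    }

    @Override
    boolean contains(int value) {
        return values.contains(value);
    }
}
```

Code that calls IntBag.of(1, 2, 3) depends only on IntBag, so the threshold or the set of subclasses can change next week without touching callers.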

Appropriate Use of Marker Interfaces

Bill Venners: When are marker interfaces -- interfaces that lack methods -- appropriate?

Ken Arnold: I have this weird deep object view of that whole thing. I mentioned the term contract a couple times in this discussion. To me, object design is a design of contracts. Contracts define what you are allowed to rely on. Contracts are not expressible in anything less than a human language. You need to have someone sit down and say, if this is true then that happens. You have to spell out all the details.

Java has types as well as methods. There are contracts for entire types and contracts for methods. If you look at a method contract, you'll see that not everything can be spelled out in the programming language. Some things expressed in the contract must be expressed in human language. String's equalsIgnoreCase is a method, but nothing in the programming language enforces that case is ignored; that requirement is only expressed in the human language text. The contract for the whole class or interface, then, says if you implement this interface, you will have, in general, the following kinds of behaviors.

To me, a marker interface isn't wrong. It is a degenerate case in the sense that it has no methods, so the whole contract is associated with the class and says what things of that type do. As a degenerate case, a marker interface probably shouldn't occur often. But on the other hand, there are times when a marker interface is a legitimate way to express that if you implement this interface, then you have certain relationships.

Serializable might be a good example. None of the methods related to serialization belong as methods in the Serializable interface itself. For example, readObject should not be a public method in the Serializable interface. It is an implementation detail. At the same time, Serializable is a behavioral contract that says you do expect your state shipped up in little bits and then those bits reconstructed into another copy of you. Serializable says that you have written the associated code to make sure serialization works right. So now I know I can use the object in a certain way -- I can serialize and deserialize it. That is a real contract about an object that has a Serializable marker interface. I do think marker interfaces are rarely correct. But I don't think they are incorrect in principle.
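
A small sketch shows how this plays out in code. Temperature and SerializationDemo are hypothetical names. The class carries the marker interface, and its readObject is a private method that the serialization machinery finds on its own; it is not declared by any interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class that accepts the Serializable contract.
class Temperature implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double celsius;

    Temperature(double celsius) {
        this.celsius = celsius;
    }

    double celsius() {
        return celsius;
    }

    // Private implementation detail, called by ObjectInputStream during deserialization.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (celsius < -273.15) {
            throw new InvalidObjectException("below absolute zero");
        }
    }
}

class SerializationDemo {
    // Writes the value to a byte array and reads a fresh copy back.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The roundTrip helper relies only on the marker: its type parameter is bounded by Serializable, and nothing else about Temperature is known to it.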

I know people who are profligate with marker interfaces. They say: If the object you pass in has this marker interface, I will treat it in the following way. But I think that down that path lies darkness. Nobody knows the behavior of objects when they get their 17th marker interface. What if the object implements this marker interface and not this one, or this one and that one but not that other one? You'll send an object to me and I'll invoke a method, but because of some odd marker interface behavior, it won't do what I expect.

Nevertheless, contracts do exist where everything is outside the programming language. In those situations, you still want to let people know that the object behaves in a certain way. Marker interfaces let you do that.

Resources

Perfection and Simplicity, A Conversation with Ken Arnold, Part I:
http://www.artima.com/intv/perfect.html

Taste and Aesthetics, A Conversation with Ken Arnold, Part II:
http://www.artima.com/intv/taste.html

Designing Distributed Systems, A Conversation with Ken Arnold, Part III:
http://www.artima.com/intv/distrib.html

Sway with JavaSpaces, A Conversation with Ken Arnold, Part IV:
http://www.artima.com/intv/sway.html

JavaSpaces: Data, Decoupling, and Iteration, A Conversation with Ken Arnold, Part V:
http://www.artima.com/intv/decouple.html

You can obtain information about Linda from here:
http://www.cs.yale.edu/Linda/linda.html

Ken Arnold first mentioned idempotency in Part III of this interview:
http://www.artima.com/intv/distrib.html

JavaSpaces: Principles, Patterns, and Practice by Eric Freeman, Susanne Hupfer, and Ken Arnold, the book from which Bill Venners reads quotes in this article, is at Amazon.com at:
http://www.amazon.com/exec/obidos/ASIN/0201309556/

The Jini Community, the central site for signers of the Jini Sun Community Source License to interact:
http://www.jini.org

Download JavaSpaces from:
http://java.sun.com/products/javaspaces/

Design objects for people, not for computers:
http://www.artima.com/apidesign/object.html

Make Room for JavaSpaces, Part I - An introduction to JavaSpaces, a simple and powerful distributed programming tool:
http://www.artima.com/jini/jiniology/js1.html

Make Room for JavaSpaces, Part II - Build a compute server with JavaSpaces, Jini's coordination service:
http://www.artima.com/jini/jiniology/js2.html

Make Room for JavaSpaces, Part III - Coordinate your Jini applications with JavaSpaces:
http://www.artima.com/jini/jiniology/js3.html

Make Room for JavaSpaces, Part IV - Explore Jini transactions with JavaSpaces:
http://www.artima.com/jini/jiniology/js4.html

Make Room for JavaSpaces, Part V - Make your compute server robust and scalable with Jini and JavaSpaces:
http://www.artima.com/jini/jiniology/js5.html

Make Room for JavaSpaces, Part VI - Build and use distributed data structures in your JavaSpaces programs:
http://www.artima.com/jini/jiniology/js6.html

About the author

Bill Venners is president of Artima Software, Inc. and editor-in-chief of Artima.com. He is author of the book, Inside the Java Virtual Machine, a programmer-oriented survey of the Java platform's architecture and internals. His popular columns in JavaWorld magazine covered Java internals, object-oriented design, and Jini. Bill has been active in the Jini Community since its inception. He led the Jini Community's ServiceUI project that produced the ServiceUI API. The ServiceUI became the de facto standard way to associate user interfaces to Jini services, and was the first Jini community standard approved via the Jini Decision Process. Bill also serves as an elected member of the Jini Community's initial Technical Oversight Committee (TOC), and in this role helped to define the governance process for the community. He currently devotes most of his energy to building Artima.com into an ever more useful resource for developers.