Canonical Object Idiom

Defining a Baseline Set of Functionality for Objects

by Bill Venners
September 15, 1998

First published in JavaWorld, September 1998

Summary
In this installment of the Design Techniques column, I propose "the canonical object" as a Java idiom. The article discusses the fundamental services that all objects in general should offer, shows how objects can offer these services, and names such objects "canonical."

In last month's installment of Design Techniques, I proposed the "event generator" as a Java idiom. This month I will propose another idiom, which I am calling the "canonical object."

I call this object idiom "canonical" because it represents the simplest form an object should take. The idea behind this idiom is to suggest a baseline functionality that you give by default to any object you design. You may adopt this idiom in your own programming practice, or, if you are developing a set of style guidelines for a team or organization, you may want to make the idiom part of those guidelines. The intent of the idiom is not to force programmers to endow every object with this baseline functionality, but simply to define a default functionality for every object. In other words, programmers would be encouraged to make a new object canonical unless they have specific reasons to depart from the idiom in a given, specific case.

A fundamental tenet of object-oriented programming is that you can treat an instance of a subclass as if it were an instance of a superclass. Because all classes in Java are subclasses of java.lang.Object, it follows that another basic tenet of Java programming is that you can treat any object as, well, an Object.

As a Java programmer, you can treat an object as an Object in many ways. When you invoke any of the methods declared in class java.lang.Object on an object, for example, you are treating that object as an Object. These methods, some of which are clone(), equals(), toString(), wait(), notify(), finalize(), getClass(), and hashCode(), provide basic services commonly exercised on Java objects of all kinds. In addition, you may find yourself wanting to serialize just about any kind of object. All of these activities are ways you may treat objects of many diverse classes simply and uniformly as Objects.

Taken together, all the activities that are commonly performed on Java objects constitute a set of services that, in most cases, should be built into every class you design. The question raised by this article and answered by the idiom is: What do you need to do, at a minimum, to make each object you define support the services commonly expected of all objects?

This idiom defines a baseline set of requirements for object definitions and names objects that implement the baseline "canonical objects."

Recipe: How to concoct a canonical object
You can turn instances of any class into canonical objects by taking the following steps with the class:

  1. Implement Cloneable (unless a superclass already implements it or the object is immutable).
  2. If the class includes instance variables that may at some point in the lifetime of its instances hold references to mutable objects, override clone().
  3. Override equals() and hashCode().
  4. Implement Serializable (unless a superclass already implements it).

Example code
Here's a Java class that illustrates the canonical object idiom:

// In file canonical/ex1/Worker.java
import java.io.Serializable;
import java.util.Vector;

public class Worker
    implements Cloneable, Serializable {

    private String name;
    private Vector doList;

    public Worker(String name, Vector doList) {
        if (name == null || doList == null) {
            throw new IllegalArgumentException();
        }
        this.name = name;
        this.doList = doList;
    }

    public Worker(String name) {
        this(name, new Vector());
    }

    public void setName(String name) {
        if (name == null) {
            throw new IllegalArgumentException();
        }
        this.name = name;
    }

    public void addtoList(Object job) {
        doList.addElement(job);
    }

    public Object clone() {

        // Do the basic clone
        Worker theClone = null;
        try {
            theClone = (Worker) super.clone();
        }
        catch (CloneNotSupportedException e) {
            // Should never happen
            throw new InternalError(e.toString());
        }

        // Clone mutable members
        theClone.doList = (Vector) doList.clone();
        return theClone;
    }

    public boolean equals(Object o) {

        // Only another object of exactly the same class can be equal
        if (o == null || getClass() != o.getClass()) {
            return false;
        }

        // Safe cast: the class check above guarantees o is a Worker
        Worker w = (Worker) o;
        return name.equals(w.name) && doList.equals(w.doList);
    }

    //...
}

In the code listing above, instances of class Worker are canonical objects because they are ready for (1) cloning, (2) serialization, and (3) semantic comparison with equals(). To make Worker objects ready for cloning, class Worker implements Cloneable. Implementing Cloneable is necessary in this case because Worker objects are mutable and Cloneable isn't implemented by any superclass. Likewise, to make Worker objects ready for serialization, class Worker implements Serializable. Because no superclass of Worker implements Serializable, class Worker itself must implement it. Lastly, class Worker, like any other class whose instances are canonical objects, overrides equals() with a method that does an appropriate semantic comparison of the two objects.
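
To see the three services working together, here is a minimal sketch. MiniWorker is a hypothetical, trimmed-down stand-in for Worker (not the class from the listing), and CanonicalDemo.roundTrip() is an assumed helper that serializes through an in-memory byte array. The sketch uses generics and a covariant clone() return type, which postdate the original listing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Vector;

// Hypothetical stand-in for Worker, kept small so the sketch compiles alone.
class MiniWorker implements Cloneable, Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;
    private Vector<Object> doList = new Vector<>();

    MiniWorker(String name) {
        if (name == null) {
            throw new IllegalArgumentException();
        }
        this.name = name;
    }

    void addToList(Object job) { doList.addElement(job); }

    int jobCount() { return doList.size(); }

    @Override
    @SuppressWarnings("unchecked")
    public MiniWorker clone() {
        try {
            MiniWorker theClone = (MiniWorker) super.clone();
            // Give the clone its own list so the two can evolve independently
            theClone.doList = (Vector<Object>) doList.clone();
            return theClone;
        } catch (CloneNotSupportedException e) {
            throw new InternalError(e.toString());
        }
    }

    @Override
    public boolean equals(Object o) {
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        MiniWorker w = (MiniWorker) o;
        return name.equals(w.name) && doList.equals(w.doList);
    }

    @Override
    public int hashCode() {
        return 31 * name.hashCode() + doList.hashCode();
    }
}

class CanonicalDemo {
    // Writes the object to a byte array and reads a fresh copy back.
    static Object roundTrip(Serializable original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A clone and a deserialized copy should each be a distinct object that is still equal to the original, and adding a job to the clone should leave the original's list untouched.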

The value of the canonical object idiom
The canonical object idiom can be useful to you in several ways. First, this idiom can guide you when you are deciding whether to support cloning or serialization in a particular class, and which java.lang.Object methods, if any, you should override in that class. You can use this idiom as a starting point with each class you define, and depart from the idiom only if you feel special circumstances justify such a departure. In addition, knowledge of this idiom should make your fellow programmers feel a vague sense of guilt in their attempts to avoid thinking about these issues when they design a class. And, hopefully, this guilt will encourage your coworkers to design objects that support the baseline object services defined by the canonical object idiom, and that should make their objects a bit easier for you to use. Finally, one promising use of this idiom is that it can serve as a starting point for discussion when formulating Java coding standards for a project or organization.

Implementation guidelines
Here are some guidelines to help you make the most of the canonical object idiom:

Make objects canonical by default
In general, you should implement the canonical object idiom in every Java class you define, unless you have a specific reason not to. Although you may not be able to imagine why someone would want to use a particular class of objects in some of these ways, you have likely met coworkers who are capable of surprising you in how they use your classes. Besides, predicting the future is a difficult business. One of these days even you may reuse your classes in some ways you didn't imagine when you first designed the class.

The benefit of canonical objects is that they are more flexible (easy to understand, use, and change) than their non-canonical brethren. Canonical objects help make code flexible because they are ready to be manipulated in the ways objects of any type are commonly manipulated. By now you know that canonical objects can be cloned, serialized, and semantically compared with equals(), but they can also be used in other common ways. Invoking toString() on a canonical object will yield a reasonable result provided by the default implementation of toString() in superclass Object. Likewise, hashCode() works properly, because a canonical object overrides it to stay consistent with equals(). getClass() returns a reference to the appropriate Class instance, and even the wait() and notify() methods work. Everything works. Canonical objects are ready to do what you want them to do.

Catch CloneNotSupportedException
The customary first step in any implementation of clone() is to invoke the superclass's implementation of clone(). If you are writing a clone() method for a direct subclass of class Object, you will need to either catch CloneNotSupportedException or declare it in your throws clause. If you forget to do either of these two things, the compiler will dutifully inform you of your negligence.

So, given that the compiler will force you to deal with CloneNotSupportedException in one way or the other, which way should you deal with it? In general, you should catch CloneNotSupportedException and throw some kind of unchecked exception in the catch clause, the approach demonstrated by the Worker class. Why? Because if you declare CloneNotSupportedException in your throws clause, anyone who wants to clone your object will need to deal with the exception -- either by catching it or declaring it in their throws clause. And you don't want to bother clients of your class with all that hard decision-making just because they want to clone your object.

It turns out that, so long as you implement Cloneable, Object's implementation of clone() will never throw CloneNotSupportedException. Object's implementation of clone() checks to see if the object's class implements Cloneable; if it does, it clones the object by making a direct field-by-field copy of the original in the clone. Only if the object's class doesn't implement Cloneable will Object's implementation of clone() throw CloneNotSupportedException. So if you implement Cloneable, you may as well catch CloneNotSupportedException just to keep it out of your clone()'s throws clause.
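
The behavior described above can be observed directly. In this sketch the classes Marked, Unmarked, and ObjectCloneProbe are hypothetical names chosen for illustration; both clone() methods keep the throws clause only so that the probe can observe what Object's clone() does.

```java
// Implements Cloneable, so Object's clone() will copy it field by field.
class Marked implements Cloneable {
    int count = 7;
    int[] data = {1, 2, 3};

    @Override
    public Object clone() throws CloneNotSupportedException {
        return super.clone();
    }
}

// Does not implement Cloneable, so Object's clone() will refuse.
class Unmarked {
    @Override
    public Object clone() throws CloneNotSupportedException {
        return super.clone();
    }
}

class ObjectCloneProbe {
    static Marked copyOf(Marked m) {
        try {
            return (Marked) m.clone();
        } catch (CloneNotSupportedException e) {
            // Not expected: Marked implements Cloneable
            throw new AssertionError(e);
        }
    }

    static boolean throwsOnClone(Unmarked u) {
        try {
            u.clone();
            return false;
        } catch (CloneNotSupportedException e) {
            return true;
        }
    }
}
```

Because the copy is field by field, the clone of a Marked object shares the same data array as the original. That sharing is the reason step 2 of the recipe asks you to override clone() when a field may refer to a mutable object.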

The one risk to heeding this advice is that when you remove CloneNotSupportedException from your clone()'s throws clause, you tie the hands of anyone who ever wants to disallow cloning in a subclass of your class. The customary way to disallow cloning in a subclass of some class that allows and supports cloning is to override clone() and throw CloneNotSupportedException. Thus, you should consider whether you want to enable subclasses to disallow cloning when you implement clone().
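
Here is a sketch of the trade-off, using hypothetical classes Template and LockedTemplate. Template leaves CloneNotSupportedException in its throws clause, which is what allows the subclass to disallow cloning in the customary way.

```java
// Superclass that supports cloning but keeps the checked exception declared.
class Template implements Cloneable {
    @Override
    public Object clone() throws CloneNotSupportedException {
        return super.clone();
    }
}

// Subclass that disallows cloning. This override compiles only because
// Template.clone() still declares CloneNotSupportedException. If the
// superclass had dropped the throws clause, an overriding method could
// not declare or throw this checked exception.
class LockedTemplate extends Template {
    @Override
    public Object clone() throws CloneNotSupportedException {
        throw new CloneNotSupportedException("LockedTemplate may not be cloned");
    }
}

class CloneOptOut {
    // Clients pay for the flexibility: they must handle the exception.
    static boolean refusesToClone(Template t) {
        try {
            t.clone();
            return false;
        } catch (CloneNotSupportedException e) {
            return true;
        }
    }
}
```

The try/catch in refusesToClone() is the burden on clients that the previous paragraphs describe; removing it from client code is what catching the exception inside clone() buys you.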

My opinion is that if you are not sure, you should catch CloneNotSupportedException, which effectively sets the policy that all subclasses will be clonable. I believe situations in which someone will want to disallow cloning in a subclass will be rare. Therefore, the ease of use you gain by not forcing clients to deal with CloneNotSupportedException outweighs the slight risk that you will be frustrating someone who wants to disallow cloning in a subclass at some point in the future.

Don't support cloning in immutable objects
If the object is immutable, you don't need to (and shouldn't) make it clonable. The reason you clone an object is so that the two instances can evolve independently thereafter. For example, you may clone an object before passing it to a method that alters the object. Because immutable objects can't evolve (their state doesn't ever change), there is no need to clone them. Everyone can safely share the same immutable instance.
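
As an illustration, consider this sketch of an immutable class. Money and Invoice are hypothetical names; the point is only that Money has no setters and no clone(), and that operations that would "change" it return a new instance instead.

```java
// Immutable: final class, final fields, no mutator methods, no clone().
final class Money {
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        if (currency == null) {
            throw new IllegalArgumentException();
        }
        this.currency = currency;
        this.cents = cents;
    }

    // Returns a new object rather than altering this one
    Money plus(long moreCents) {
        return new Money(currency, cents + moreCents);
    }

    long cents() { return cents; }
}

class Invoice {
    private final Money total;

    // No defensive copy is needed: nobody can alter the shared Money
    Invoice(Money total) { this.total = total; }

    Money total() { return total; }
}
```

Two Invoice objects can hold the very same Money instance, and calling plus() on it leaves both invoices unaffected.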

Make equals() do a semantic compare
An important aspect of the canonical object idiom is implementing equals() such that it does a semantic comparison. Canonical objects override equals(), because the default implementation of equals() in class Object just returns true if one object '==' the other object. In other words, comparing two objects with equals() yields the same result, by default, as comparing two objects with Java's == operator. Why are there two ways to check objects for equality? Because they are supposed to be different.

Java's == operator simply checks to see if two references refer to the same object exactly. Invoking equals() on an object is supposed to do a semantic compare: if the two objects "mean the same thing," equals() should return true.
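
A short sketch of the difference, using java.util.Vector, whose equals() already does a semantic compare. The class name IdentityVsEquality and its helper methods are made up for this example.

```java
import java.util.Vector;

class IdentityVsEquality {
    // True only when both references point at the same object
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    // True when the class's equals() considers the objects equal
    static boolean sameMeaning(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Vector<String> first = new Vector<>();
        first.addElement("sweep");
        Vector<String> second = new Vector<>();
        second.addElement("sweep");

        System.out.println(sameObject(first, second));   // prints "false"
        System.out.println(sameMeaning(first, second));  // prints "true"
    }
}
```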

What does it mean for an object to "mean the same thing" as another object? Well, that's what you, as designer of a class, get to decide. In general, however, two objects are semantically equal when they have the same class and their states are equal. In other words, semantic equality means that:

  • Both objects have the same class.
  • Corresponding instance variables with primitive types are equivalent (as reported by ==).
  • Corresponding instance variables with reference types are either both null or semantically equivalent (as reported by equals()).
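
The three rules above can be sketched as follows. Shipment is a hypothetical class with one primitive field and one reference field that may be null; java.util.Objects.equals() is used here as one convenient way to handle the "both null or equal" case.

```java
import java.util.Objects;

class Shipment {
    private final int weightKg;   // primitive type
    private final String note;    // reference type, may be null

    Shipment(int weightKg, String note) {
        this.weightKg = weightKg;
        this.note = note;
    }

    @Override
    public boolean equals(Object o) {
        // Rule 1: both objects have the same class
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        Shipment other = (Shipment) o;
        // Rule 2: primitives are compared with ==
        // Rule 3: references are both null, or equal according to equals()
        return weightKg == other.weightKg
            && Objects.equals(note, other.note);
    }

    @Override
    public int hashCode() {
        // Kept consistent with equals(): same fields, same null handling
        return Objects.hash(weightKg, note);
    }
}
```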

For a bit more help on deciding how you should define equals(), consider that any implementation of equals() should have the following properties:

  • For any object reference a, a.equals(null) should return false.
  • For any object reference value a, a.equals(a) should return true. (equals() is reflexive.)
  • For any object references a and b, a.equals(b) should return true if and only if b.equals(a) returns true. (equals() is symmetric.)
  • For any object references a, b, and c, if a.equals(b) returns true and b.equals(c) returns true, then a.equals(c) should return true. (equals() is transitive.)
  • For any object references a and b, multiple invocations of a.equals(b) consistently return true or consistently return false. (equals() is consistent.)
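
One way to make these properties concrete is a small checker that evaluates them for three given references. This is only a spot check, not a proof, and the names EqualsContract and SloppyName are invented for the example. SloppyName deliberately breaks symmetry by claiming equality with plain String objects.

```java
import java.util.Locale;

class EqualsContract {
    // Checks the listed properties for these particular references only.
    static boolean holdsFor(Object a, Object b, Object c) {
        boolean nullSafe = !a.equals(null);
        boolean reflexive = a.equals(a);
        boolean symmetric = a.equals(b) == b.equals(a);
        boolean transitive = !(a.equals(b) && b.equals(c)) || a.equals(c);

        boolean first = a.equals(b);
        boolean consistent = true;
        for (int i = 0; i < 3; i++) {
            consistent &= (a.equals(b) == first);
        }
        return nullSafe && reflexive && symmetric && transitive && consistent;
    }
}

// A flawed class: it says it equals a String, but no String will agree.
class SloppyName {
    private final String text;

    SloppyName(String text) { this.text = text; }

    @Override
    public boolean equals(Object o) {
        if (o instanceof SloppyName) {
            return text.equalsIgnoreCase(((SloppyName) o).text);
        }
        if (o instanceof String) {
            return text.equalsIgnoreCase((String) o);   // breaks symmetry
        }
        return false;
    }

    @Override
    public int hashCode() {
        return text.toLowerCase(Locale.ROOT).hashCode();
    }
}
```

For three equal strings the checker returns true. For new SloppyName("Pat") and the string "pat" it returns false, because the SloppyName says it equals the string while the string's own equals() disagrees.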

Override hashCode()
Whenever you override equals(), you should override hashCode(). hashCode() should return the same hash value for any two objects that are semantically equal, as determined by equals().
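
Here is a minimal sketch of a hashCode() that agrees with equals(), using a hypothetical Badge class. The multiplier 31 is a common convention rather than a requirement. The helper BadgeDemo.foundInSet() shows why the rule matters: hash-based collections such as java.util.HashSet rely on it to locate equal objects.

```java
import java.util.HashSet;
import java.util.Set;

class Badge {
    private final String owner;
    private final int level;

    Badge(String owner, int level) {
        if (owner == null) {
            throw new IllegalArgumentException();
        }
        this.owner = owner;
        this.level = level;
    }

    @Override
    public boolean equals(Object o) {
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        Badge other = (Badge) o;
        return owner.equals(other.owner) && level == other.level;
    }

    @Override
    public int hashCode() {
        // Built from exactly the fields that equals() compares
        int result = owner.hashCode();
        result = 31 * result + level;
        return result;
    }
}

class BadgeDemo {
    // Looks up an equal, but distinct, Badge in a hash-based set
    static boolean foundInSet() {
        Set<Badge> issued = new HashSet<>();
        issued.add(new Badge("Pat", 2));
        return issued.contains(new Badge("Pat", 2));
    }
}
```

If Badge overrode equals() but inherited Object's hashCode(), the two equal badges would usually land in different hash buckets and the lookup would most likely fail.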

Alternative ways to implement the idiom
If you wish to disallow cloning of an object, you can simply choose not to implement Cloneable, unless a superclass already implements Cloneable. In that case, you'll need to override clone() and throw CloneNotSupportedException. If a superclass implementation of clone() has removed CloneNotSupportedException from its throws clause, you should either change that superclass or allow cloning in the subclass.

If you wish to disallow serialization, you can simply choose not to implement Serializable.

Note that defining finalize() is not part of this idiom. A finalizer is not appropriate in general cases, although under certain circumstances you may want to write a finalizer. For advice on writing finalizers, follow the link from the Resources section to a previous Design Techniques article on that subject.

Note also that defining toString() is missing from the list above. I left it out because I believe Object has a reasonable default implementation for this method. Object's toString() method returns a string composed of the name of the class, an "at" sign ("@"), and the unsigned hexadecimal representation of the hash code of the object. If you do override toString, you should return a string that "textually represents" the object. The returned result should be concise, informative, and easy to read.
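
The default format, and one possible override, can be sketched like this. Plain, Labeled, and ToStringDemo are hypothetical names; defaultForm() simply rebuilds the string that Object's toString() is documented to return.

```java
// Inherits Object's toString()
class Plain { }

// Overrides toString() with a concise textual representation
class Labeled {
    private final String name;

    Labeled(String name) { this.name = name; }

    @Override
    public String toString() {
        return "Labeled[name=" + name + "]";
    }
}

class ToStringDemo {
    // Class name, "@", then the hash code in unsigned hexadecimal
    static String defaultForm(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }
}
```

For a Plain object, toString() and defaultForm() produce the same string, while new Labeled("Pat").toString() yields "Labeled[name=Pat]".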

One other thing missing from the canonical object idiom is a no-arg constructor. Any class that has a no-arg constructor and implements Serializable is a JavaBean. See the Resources section for a link to a discussion of when it is appropriate to make a class into a bean.

Idiom issues
To help spark some discussion on the Flexible Java Forum, a discussion forum devoted to Java design topics, I will throw out some of the issues that may present themselves with this idiom (see the Resources section for a link to the forum):

  • Is "canonical object" the best name?
  • Am I justified in excluding a requirement to override toString from my canonical object recipe?
  • Should I throw something besides InternalError from the catch clause that catches CloneNotSupportedException? My general advice (given in my "Designing with Exceptions" article) is that programs should throw only exceptions, never errors. Usually the VM throws the errors. But in this case, I have implemented Cloneable. Thus, if Object's clone() implementation throws CloneNotSupportedException, I think that may qualify as an internal error to me. Since this "internal error" will likely never happen, what I throw probably is unimportant. But I'd still feel better about throwing an exception rather than an error. Perhaps what we need is a java.lang.ThisWasntSupposedToEverHappenException.
  • I'm a bit concerned about promoting the catching of CloneNotSupportedException at all, because, as I described above, it does tie the hands of anyone wishing to disallow cloning in a subclass.
  • Of course, there is that nagging question of the no-arg constructor, which, if it is part of the canonical object idiom, would make every canonical object into a JavaBean. This question has its own forum topic; you can find a link to it in the Resources section.

Next month
In next month's Design Techniques I'll talk about composition and inheritance.

A request for reader participation
I encourage your comments, criticisms, suggestions, flames -- all kinds of feedback -- about the material presented in this column. If you disagree with something, or have something to add, please let me know.

You can either participate in a discussion forum devoted to this material or e-mail me directly at bv@artima.com.

Resources

This article was first published under the name The Canonical Object Idiom in JavaWorld, a division of Web Publishing, Inc., September 1998.

About the author

Bill Venners has been writing software professionally for 12 years. Based in Silicon Valley, he provides software consulting and training services under the name Artima Software Company. Over the years he has developed software for the consumer electronics, education, semiconductor, and life insurance industries. He has programmed in many languages on many platforms: assembly language on various microprocessors, C on Unix, C++ on Windows, Java on the Web. He is author of the book: Inside the Java Virtual Machine, published by McGraw-Hill.