Object Finalization and Cleanup

How to Design Classes for Proper Object Cleanup

by Bill Venners
May 15, 1998

First published in JavaWorld, May 1998
Summary
This installment of the Design Techniques column discusses the design guidelines that pertain to the end of an object's life. I give an overview of the rules of garbage collection, discuss finalizers, and suggest ways to design objects such that finite resources aren't monopolized.

Three months ago, I began a mini-series of articles about designing objects with a discussion of design principles that focused on proper initialization at the beginning of an object's life. In this Design Techniques article, I'll be focusing on the design principles that help you ensure proper cleanup at the end of an object's life.

Why clean up?
Every object in a Java program uses computing resources that are finite. Most obviously, all objects use some memory to store their images on the heap. (This is true even for objects that declare no instance variables. Each object image must include some kind of pointer to class data, and can include other implementation-dependent information as well.) But objects may also use other finite resources besides memory. For example, some objects may use resources such as file handles, graphics contexts, sockets, and so on. When you design an object, you must make sure it eventually releases any finite resources it uses so the system won't run out of those resources.

Because Java is a garbage-collected language, releasing the memory associated with an object is easy. All you need to do is let go of all references to the object. Because you don't have to worry about explicitly freeing an object, as you must in languages such as C or C++, you needn't worry about corrupting memory by accidentally freeing the same object twice. You do, however, need to make sure you actually release all references to the object. If you don't, you can end up with a memory leak, just like the memory leaks you get in a C++ program when you forget to explicitly free objects. Nevertheless, so long as you release all references to an object, you needn't worry about explicitly "freeing" that memory.
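
To make the memory-leak risk concrete, here is a minimal sketch of one way a reference can linger. The SimpleStack class and its stillReferences() method are hypothetical names invented for this illustration. The point is the single line in pop() that clears the array slot: without it, the stack would go on referencing an object the client no longer needs.

```java
import java.util.Arrays;

// Hypothetical stack used only to illustrate a lingering reference.
final class SimpleStack {
    private Object[] elements = new Object[4];
    private int size = 0;

    void push(Object item) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = item;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object item = elements[--size];
        // Without this line the array would keep referencing the popped
        // object, so the garbage collector could never reclaim it.
        elements[size] = null;
        return item;
    }

    // For illustration only: true if the backing array still refers to item.
    boolean stillReferences(Object item) {
        for (Object e : elements) {
            if (e == item) {
                return true;
            }
        }
        return false;
    }
}
```

Once pop() returns, the only remaining references to the popped object are the ones the client holds, so letting go of those is enough to make the object eligible for collection.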

Similarly, you needn't worry about explicitly freeing any constituent objects referenced by the instance variables of an object you no longer need. Releasing all references to the unneeded object will in effect invalidate any constituent object references contained in that object's instance variables. If the now-invalidated references were the only remaining references to those constituent objects, the constituent objects will also be available for garbage collection. Piece of cake, right?

The rules of garbage collection
Although garbage collection does indeed make memory management in Java a lot easier than it is in C or C++, you aren't able to completely forget about memory when you program in Java. To know when you may need to think about memory management in Java, you need to know a bit about the way garbage collection is treated in the Java specifications.

Garbage collection is not mandated
The first thing to know is that no matter how diligently you search through the Java Virtual Machine Specification (JVM Spec), you won't be able to find any sentence that commands, "Every JVM must have a garbage collector." The JVM Spec gives VM designers a great deal of leeway in deciding how their implementations will manage memory, including deciding whether or not to even use garbage collection at all. Thus, it is possible that some JVMs (such as a bare-bones smart card JVM) may require that programs executed in each session "fit" in the available memory.

Of course, you can always run out of memory, even on a virtual memory system. The JVM Spec does not state how much memory will be available to a JVM. It just states that whenever a JVM does run out of memory, it should throw an OutOfMemoryError.

Nevertheless, to give Java applications the best chance of executing without running out of memory, most JVMs will use a garbage collector. The garbage collector reclaims the memory occupied by unreferenced objects on the heap, so that memory can be used again by new objects, and usually de-fragments the heap as the program runs.

Garbage collection algorithm is not defined
Another command you won't find in the JVM Spec is, "All JVMs that use garbage collection must use the XXX algorithm." The designers of each JVM get to decide how garbage collection will work in their implementations. The garbage collection algorithm is one area in which JVM vendors can strive to make their implementations better than the competition's. This is significant for you as a Java programmer for the following reason:

Because you don't generally know how garbage collection will be performed inside a JVM, you don't know when any particular object will be garbage collected.

"So what?" you might ask. The reason you might care when an object is garbage collected has to do with finalizers. (A finalizer is a regular Java instance method named finalize() that returns void and takes no arguments.) The Java specifications make the following promise about finalizers:

Before reclaiming the memory occupied by an object that has a finalizer, the garbage collector will invoke that object's finalizer.

Given that you don't know when objects will be garbage collected, but you do know that finalizable objects will be finalized as they are garbage collected, you can make the following grand deduction:

You don't know when objects will be finalized.

You should imprint this important fact on your brain and forever allow it to inform your Java object designs.

Finalizers to avoid
The central rule of thumb concerning finalizers is this:

Don't design your Java programs such that correctness depends upon "timely" finalization.

In other words, don't write programs that will break if certain objects aren't finalized by certain points in the life of the program's execution. If you write such a program, it may work on some implementations of the JVM but fail on others.

Don't rely on finalizers to release non-memory resources
An example of an object that breaks this rule is one that opens a file in its constructor and closes the file in its finalize() method. Although this design seems neat, tidy, and symmetrical, it potentially creates an insidious bug. A Java program generally will have only a finite number of file handles at its disposal. When all those handles are in use, the program won't be able to open any more files.

A Java program that makes use of such an object (one that opens a file in its constructor and closes it in its finalizer) may work fine on some JVM implementations. On such implementations, finalization would occur often enough to keep a sufficient number of file handles available at all times. But the same program may fail on a different JVM whose garbage collector doesn't finalize often enough to keep the program from running out of file handles. Or, what's even more insidious, the program may work on all JVM implementations now but fail in a mission-critical situation a few years (and release cycles) down the road.
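
The failure mode can be simulated without touching real files. In the sketch below, HandlePool is a hypothetical stand-in for the operating system's limited supply of file handles, and HandleDemo is an invented driver. The first method behaves like a program running on a JVM that never gets around to finalizing anything. The second releases each handle explicitly.

```java
// Hypothetical stand-in for a finite resource such as file handles.
final class HandlePool {
    private final int capacity;
    private int inUse = 0;

    HandlePool(int capacity) {
        this.capacity = capacity;
    }

    synchronized boolean tryAcquire() {
        if (inUse == capacity) {
            return false;
        }
        inUse++;
        return true;
    }

    synchronized void release() {
        if (inUse > 0) {
            inUse--;
        }
    }
}

final class HandleDemo {
    // Acquires handles and never releases them, as if no finalizer
    // ever ran. Returns how many acquisitions succeeded.
    static int openWithoutReleasing(HandlePool pool, int attempts) {
        int opened = 0;
        for (int i = 0; i < attempts; i++) {
            if (pool.tryAcquire()) {
                opened++;
            }
        }
        return opened;
    }

    // Acquires each handle, uses it, and releases it in finally.
    static int openAndRelease(HandlePool pool, int attempts) {
        int opened = 0;
        for (int i = 0; i < attempts; i++) {
            if (pool.tryAcquire()) {
                try {
                    opened++; // stand-in for real work with the handle
                } finally {
                    pool.release();
                }
            }
        }
        return opened;
    }
}
```

With a pool of three handles and ten attempts, the first method succeeds only three times, while the second succeeds all ten times because it never holds more than one handle at once.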

Other finalizer rules of thumb
Two other decisions left to JVM designers are selecting the thread (or threads) that will execute the finalizers and the order in which finalizers will be run. Finalizers may be run in any order -- sequentially by a single thread or concurrently by multiple threads. If your program somehow depends for correctness on finalizers being run in a particular order, or by a particular thread, it may work on some JVM implementations but fail on others.

You should also keep in mind that Java considers an object to be finalized whether the finalize() method returns normally or completes abruptly by throwing an exception. Garbage collectors ignore any exceptions thrown by finalizers and in no way notify the rest of the application that an exception was thrown. If you need to ensure that a particular finalizer fully accomplishes a certain mission, you must write that finalizer so that it handles any exceptions that may arise before the finalizer completes its mission.
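
Here is one possible shape for a finalizer that handles its own exceptions. TwoStepCleanup and its method names are hypothetical, and the first step fails on purpose. Because each step is wrapped separately, the failure of the first step doesn't prevent the second from running. (Current JDKs mark finalize() as deprecated, so expect a compiler warning. It appears here only to mirror the discussion above.)

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class with two independent cleanup steps.
class TwoStepCleanup {
    final List<String> completed = new ArrayList<>();

    private void releaseFirst() {
        throw new IllegalStateException("first step failed");
    }

    private void releaseSecond() {
        completed.add("second");
    }

    // Runs every step, catching failures so that one bad step
    // cannot cause the remaining steps to be skipped.
    void cleanUp() {
        try {
            releaseFirst();
            completed.add("first");
        } catch (RuntimeException e) {
            // An exception that escaped the finalizer would be ignored,
            // so record or report it here. This sketch just notes it.
            completed.add("first failed");
        }
        try {
            releaseSecond();
        } catch (RuntimeException e) {
            completed.add("second failed");
        }
    }

    @Override
    protected void finalize() throws Throwable {
        try {
            cleanUp();
        } finally {
            super.finalize();
        }
    }
}
```

Keeping the real work in an ordinary method such as cleanUp() also lets you exercise the cleanup logic directly, without waiting for a garbage collector to invoke the finalizer.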

One more rule of thumb about finalizers concerns objects left on the heap at the end of the application's lifetime. By default, the garbage collector will not execute the finalizers of any objects left on the heap when the application exits. To change this default, you must invoke the runFinalizersOnExit() method of class Runtime or System, passing true as the single parameter. If your program contains objects whose finalizers must absolutely be invoked before the program exits, be sure to invoke runFinalizersOnExit() somewhere in your program.

So what are finalizers good for?
By now you may be getting the feeling that you don't have much use for finalizers. While it is likely that most of the classes you design won't include a finalizer, there are some reasons to use finalizers.

One reasonable, though rare, application for a finalizer is to free memory allocated by native methods. If an object invokes a native method that allocates memory (perhaps a C function that calls malloc()), that object's finalizer could invoke a native method that frees that memory (calls free()). In this situation, you would be using the finalizer to free up memory allocated on behalf of an object -- memory that will not be automatically reclaimed by the garbage collector.

Another, more common, use of finalizers is to provide a fallback mechanism for releasing non-memory finite resources such as file handles or sockets. As mentioned previously, you shouldn't rely on finalizers for releasing finite non-memory resources. Instead, you should provide a method that will release the resource. But you may also wish to include a finalizer that checks to make sure the resource has already been released, and if it hasn't, that goes ahead and releases it. Such a finalizer guards against (and hopefully will not encourage) sloppy use of your class. If a client programmer forgets to invoke the method you provided to release the resource, the finalizer will release the resource if the object is ever garbage collected. The finalize() method of the LogFileManager class, shown later in this article, is an example of this kind of finalizer.

Avoid finalizer abuse
The existence of finalization produces some interesting complications for JVMs and some interesting possibilities for Java programmers. For a discussion of the impact of finalizers on JVMs, see the sidebar, a short excerpt from chapter 9, "Garbage Collection," of my book, Inside the Java Virtual Machine.

What finalization grants to programmers is power over the life and death of objects. In short, it is possible and completely legal in Java to resurrect objects in finalizers -- to bring them back to life by making them referenced again. (One way a finalizer could accomplish this is by adding a reference to the object being finalized to a static linked list that is still "live.") Although such power may be tempting to exercise because it makes you feel important, the rule of thumb is to resist the temptation to use this power. In general, resurrecting objects in finalizers constitutes finalizer abuse.

The main justification for this rule is that any program that uses resurrection can be redesigned into an easier-to-understand program that doesn't use resurrection. A formal proof of this theorem is left as an exercise to the reader (I've always wanted to say that), but in an informal spirit, consider that object resurrection will be as random and unpredictable as object finalization. As such, a design that uses resurrection will be difficult to figure out by the next maintenance programmer who happens along -- who may not fully understand the idiosyncrasies of garbage collection in Java.

If you feel you simply must bring an object back to life, consider cloning a new copy of the object instead of resurrecting the same old object. The reasoning behind this piece of advice is that garbage collectors in the JVM invoke the finalize() method of an object only once. If that object is resurrected and becomes available for garbage collection a second time, the object's finalize() method will not be invoked again.
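
As a rough sketch of the difference, the hypothetical Session class below registers a fresh copy of itself in its finalizer rather than adding this to the live list. SURVIVORS and preserveCopy() are names invented for the example. (Current JDKs mark finalize() as deprecated, so expect a compiler warning.)

```java
import java.util.ArrayList;
import java.util.List;

final class Session {
    // Hypothetical "live" list, reachable from a static root.
    static final List<Session> SURVIVORS = new ArrayList<>();

    private final String name;

    Session(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    // Copies this object's state into a new instance and registers the copy.
    Session preserveCopy() {
        Session copy = new Session(name);
        synchronized (SURVIVORS) {
            SURVIVORS.add(copy);
        }
        return copy;
    }

    @Override
    protected void finalize() throws Throwable {
        try {
            preserveCopy(); // register a copy, not this
        } finally {
            super.finalize();
        }
    }
}
```

Because the copy is a new object, its own finalizer has not yet been run and will be invoked if the copy later becomes unreferenced. In this sketch that means a copy removed from SURVIVORS would register yet another copy when it is finalized, which is one more reason to treat any such design with suspicion.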

Managing non-memory resources
Because heap memory is automatically reclaimed by the garbage collector, the main thing you need to worry about when you design an object's end-of-lifetime behavior is to ensure that finite non-memory resources, such as file handles or sockets, are released. You can take any of three basic approaches when you design an object that needs to use a finite non-memory resource:

  1. Obtain and release the resource within each method that needs the resource
  2. Provide a method that obtains the resource and another that releases it
  3. Obtain the resource at creation time and provide a method that releases it

Approach 1: Obtain and release within each relevant method
As a general rule, you should release non-memory finite resources as soon as possible after using them because the resources are, by definition, finite. If possible, you should try to obtain a resource, use it, then release it all within the method that needs the resource.

A log file class: An example of Approach 1
An example of a class where Approach 1 might make sense is a log file class. Such a class takes care of formatting and writing log messages to a file. The name of the log file is passed to the object as it is instantiated. To write a message to the log file, a client invokes a method in the log file class, passing the message as a String. Here's an example:

import java.io.FileOutputStream;
import java.io.PrintWriter;
import java.io.IOException;

class LogFile {

    private String fileName;

    LogFile(String fileName) {
        this.fileName = fileName;
    }

    // The writeToFile() method will catch any IOException
    // so that clients aren't forced to catch IOException
    // everywhere they write to the log file.  For now,
    // just fail silently. In the future, could put
    // up an informative non-modal dialog box that indicates
    // a logging error occurred. - bv 4/15/98
    void writeToFile(String message) {

        FileOutputStream fos = null;
        PrintWriter pw = null;

        try {
            fos = new FileOutputStream(fileName, true);
            try {
                pw = new PrintWriter(fos, false);

                pw.println("------------------");
                pw.println(message);
                pw.println();
            }
            finally {
                if (pw != null) {
                    pw.close();
                }
            }
        }
        catch (IOException e) {
            // Fail silently, as described in the comment above
        }
        finally {
            if (fos != null) {
                try {
                    fos.close();
                }
                catch (IOException e) {
                    // Nothing more can be done if close() fails
                }
            }
        }
    }
}

Class LogFile is a simple example of Approach 1. A more production-ready LogFile class might do things such as:

  • Insert the date and time each log message was written
  • Allow messages to be assigned a level of importance (such as ERROR, INFO, or DEBUG) and enable a level to be set that will prevent unwanted detail (such as DEBUG messages) from making it into the log file
  • Manage in some way the size of the log file, for example, by copying it to a different filename and starting fresh each time the log file reaches a certain size

The main feature of this simple version of class LogFile is that it precedes each log message with a row of dashes and follows it with a blank line.

Using finally to ensure resource release
Note that in the writeToFile() method, the releasing of the resource is done in finally clauses. This is to make sure the finite resource (file handle) is actually released no matter how the code is exited. Even if an IOException is thrown, the file will still be closed.
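
If you'd like to see the guarantee in action without involving a real file, the following sketch records the order of events on both the normal path and the exceptional path. FinallyDemo and its run() method are hypothetical, and the IOException is thrown by hand to simulate a failed write.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class FinallyDemo {
    // Records the order of events for one acquire/use/release cycle.
    static List<String> run(boolean failDuringUse) {
        List<String> trace = new ArrayList<>();
        try {
            trace.add("acquire");
            try {
                if (failDuringUse) {
                    throw new IOException("simulated write failure");
                }
                trace.add("use");
            } finally {
                // Runs on both the normal and the exceptional path.
                trace.add("release");
            }
        } catch (IOException e) {
            trace.add("caught");
        }
        return trace;
    }
}
```

Calling run(false) yields acquire, use, release. Calling run(true) yields acquire, release, caught: the release step still happens, and it happens before the exception reaches the catch clause.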

Pros and cons of Approach 1
The approach to resource management taken by class LogFile (Approach 1 from the above list) helps make your class easy to use, because client programmers don't have to worry about explicitly obtaining or releasing the resource. In both Approaches 2 and 3 from the list above, client programmers must remember to explicitly invoke a method to release the resource. In addition -- and what can be far more difficult -- client programmers must figure out when their programs no longer need a resource.

A problem with Approach 1 is that obtaining and releasing the resource each time you need it may be too inefficient. Another problem is that, in some situations, you may need to hold onto the resource between invocations of methods that use the resource (such as writeToFile()), so no other object can have access to it. In such cases, one of the other two approaches is preferable.

Approach 2: Offer methods for obtaining and releasing resources
In Approach 2 from the list above, you provide one method for obtaining the resource and another method for releasing it. This approach enables the same class instance to obtain and release a resource multiple times. Here's an example:

import java.io.FileOutputStream;
import java.io.PrintWriter;
import java.io.IOException;

class LogFileManager {

    private FileOutputStream fos;
    private PrintWriter pw;
    private boolean logFileOpen = false;

    LogFileManager() {
    }

    LogFileManager(String fileName) throws IOException {
        openLogFile(fileName);
    }

    void openLogFile(String fileName) throws IOException {
        if (!logFileOpen) {
            try {
                fos = new FileOutputStream(fileName, true);
                pw = new PrintWriter(fos, false);
                logFileOpen = true;
            }
            catch (IOException e) {
                if (pw != null) {
                    pw.close();
                    pw = null;
                }
                if (fos != null) {
                    fos.close();
                    fos = null;
                }
                throw e;
            }
        }
    }

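    void closeLogFile() throws IOException {
        if (logFileOpen) {
            // Mark the file closed first and clear both fields in
            // finally, so a failing close() can't leave stale state
            logFileOpen = false;
            try {
                pw.close();
                fos.close();
            }
            finally {
                pw = null;
                fos = null;
            }
        }
    }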

    boolean isOpen() {
        return logFileOpen;
    }

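    void writeToFile(String message) throws IOException {
        // Report a clear error rather than a NullPointerException
        if (!logFileOpen) {
            throw new IOException("Log file is not open");
        }
        pw.println("------------------");
        pw.println(message);
        pw.println();
    }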

    protected void finalize() throws Throwable {
        // Close the log file only if the client forgot to, but
        // always invoke the superclass finalizer
        try {
            if (logFileOpen) {
                closeLogFile();
            }
        }
        finally {
            super.finalize();
        }
    }
}

In this example, class LogFileManager declares methods openLogFile() and closeLogFile(). Given this design, you could write to multiple log files with one instance of this class. This design also allows a client to monopolize the resource for as long as it wants. A client can write several consecutive messages to the log file without fear that another thread or process will slip in any intervening messages. Once a client successfully opens a log file with openLogFile(), that log file belongs exclusively to that client until the client invokes closeLogFile().

Note that LogFileManager uses a finalizer as a fallback in case a client forgets to invoke closeLogFile(). As mentioned earlier in this article, this is one of the more common uses of finalizers.
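
From the client's side, the safest way to use a class designed along the lines of Approach 2 is to pair the open and close calls with try and finally. The sketch below is self-contained, so it uses a compact, hypothetical stand-in named MiniLogFileManager (no finalizer, no dashes) rather than the full LogFileManager. BatchLogger and logBatch() are likewise invented for the example.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintWriter;

// Compact, hypothetical stand-in for LogFileManager (finalizer omitted).
class MiniLogFileManager {
    private PrintWriter pw;

    void openLogFile(String fileName) throws IOException {
        if (pw == null) {
            pw = new PrintWriter(new FileOutputStream(fileName, true), false);
        }
    }

    void closeLogFile() {
        if (pw != null) {
            pw.close(); // also closes the underlying FileOutputStream
            pw = null;
        }
    }

    boolean isOpen() {
        return pw != null;
    }

    void writeToFile(String message) {
        pw.println(message);
    }
}

class BatchLogger {
    // Writes all the messages as one batch and always closes the file,
    // even if something goes wrong part of the way through.
    static void logBatch(MiniLogFileManager manager, String fileName,
            String[] messages) throws IOException {
        manager.openLogFile(fileName);
        try {
            for (String message : messages) {
                manager.writeToFile(message);
            }
        } finally {
            manager.closeLogFile();
        }
    }
}
```

Because logBatch() closes the file before it returns, the same manager instance can be passed to logBatch() again with a different file name.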

Note also that after invoking closeLogFile(), LogFileManager's finalizer invokes super.finalize(). Invoking superclass finalizers is good practice in any finalizer, even in cases (such as this) where no superclass exists other than Object. The JVM does not automatically invoke superclass finalizers, so you must do so explicitly. If someone ever inserts a class that declares a finalizer between LogFileManager and Object in the inheritance hierarchy, the new class's finalizer will already be invoked by LogFileManager's existing finalizer.

Making super.finalize() the last action of a finalizer ensures that subclasses will be finalized before superclasses. Although in most cases the placement of super.finalize() won't matter, in some rare cases, a subclass finalizer may require that its superclass be as yet unfinalized. So, as a general rule of thumb, place super.finalize() last.

Approach 3: Claim resource on creation, offer method for release
In the last approach, Approach 3 from the above list, the object obtains the resource upon creation and declares a method that releases the resource. Here's an example:

import java.io.FileOutputStream;
import java.io.PrintWriter;
import java.io.IOException;

class LogFileTransaction {

    private FileOutputStream fos;
    private PrintWriter pw;
    private boolean logFileOpen = false;

    LogFileTransaction(String fileName) throws IOException {
        try {
            fos = new FileOutputStream(fileName, true);
            pw = new PrintWriter(fos, false);
            logFileOpen = true;
        }
        catch (IOException e) {
            if (pw != null) {
                pw.close();
                pw = null;
            }
            if (fos != null) {
                fos.close();
                fos = null;
            }
            throw e;
        }
    }

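    void closeLogFile() throws IOException {
        if (logFileOpen) {
            // Mark the file closed first and clear both fields in
            // finally, so a failing close() can't leave stale state
            logFileOpen = false;
            try {
                pw.close();
                fos.close();
            }
            finally {
                pw = null;
                fos = null;
            }
        }
    }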

    boolean isOpen() {
        return logFileOpen;
    }

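    void writeToFile(String message) throws IOException {
        // Report a clear error rather than a NullPointerException
        if (!logFileOpen) {
            throw new IOException("Log file is not open");
        }
        pw.println("------------------");
        pw.println(message);
        pw.println();
    }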

    protected void finalize() throws Throwable {
        // Close the log file only if the client forgot to, but
        // always invoke the superclass finalizer
        try {
            if (logFileOpen) {
                closeLogFile();
            }
        }
        finally {
            super.finalize();
        }
    }
}

This class is called LogFileTransaction because every time a client wants to write a chunk of messages to the log file (and then let others use that log file), it must create a new LogFileTransaction. Thus, this class models one transaction between the client and the log file.

One interesting thing to note about Approach 3 is that this is the approach used by the FileOutputStream and PrintWriter classes used by all three example log file classes. In fact, if you look through the java.io package, you'll find that almost all of the java.io classes that deal with file handles use Approach 3. (The two exceptions are PipedReader and PipedWriter, which use Approach 2.)
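
In Java 7 and later, a class that follows Approach 3 can implement the standard java.lang.AutoCloseable interface, which lets clients release the resource with a try-with-resources statement instead of a hand-written finally clause. This is a different mechanism from the one shown in the listings above, and the sketch below is only one way it might look. ClosableLogTransaction and TransactionClient are hypothetical names.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintWriter;

// Hypothetical variant of LogFileTransaction for Java 7 and later.
class ClosableLogTransaction implements AutoCloseable {
    private PrintWriter pw;

    ClosableLogTransaction(String fileName) throws IOException {
        pw = new PrintWriter(new FileOutputStream(fileName, true), false);
    }

    void writeToFile(String message) {
        if (pw == null) {
            throw new IllegalStateException("transaction is closed");
        }
        pw.println("------------------");
        pw.println(message);
        pw.println();
    }

    boolean isOpen() {
        return pw != null;
    }

    @Override
    public void close() {
        if (pw != null) {
            pw.close(); // also closes the underlying FileOutputStream
            pw = null;
        }
    }
}

class TransactionClient {
    // Writes one message. The transaction is returned only so that
    // callers can observe that it has already been closed.
    static ClosableLogTransaction logOnce(String fileName, String message)
            throws IOException {
        try (ClosableLogTransaction tx = new ClosableLogTransaction(fileName)) {
            tx.writeToFile(message);
            return tx;
        }
    }
}
```

The compiler arranges for close() to be invoked when the try block exits, whether normally or because of an exception, so the client no longer has to remember the release call.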

Conclusion
The most important point to take away from this article is that if a Java object needs to take some action at the end of its life, no automatic way exists in Java that will guarantee that action is taken in a timely manner. You can't rely on finalizers to take the action, at least not in a timely way. You will need to provide a method that performs the action and encourage client programmers to invoke the method when the object is no longer needed.

This article contained several guidelines that pertain to finalizers:

  • Don't design your Java programs such that correctness depends on "timely" finalization
  • Don't assume that a finalizer will be run by any particular thread
  • Don't assume that finalizers will be run in any particular order
  • Avoid designs that require finalizers to resurrect objects; if you must use resurrection, prefer cloning over straight resurrection
  • Remember that exceptions thrown by finalizers are ignored
  • If your program includes objects with finalizers that absolutely must be run before the program exits, invoke runFinalizersOnExit(true) in class Runtime or System
  • Unless you are writing the finalizer for class Object, always invoke super.finalize() at the end of your finalizers

Next month
In next month's Design Techniques I'll continue the mini-series of articles that focus on designing classes and objects. Next month's article, the fifth of this mini-series, will discuss when to use -- and when not to use -- exceptions.

A request for reader participation
Software design is subjective. Your idea of a well-designed program may be your colleague's maintenance nightmare. In light of this fact, I am trying to make this column as interactive as possible.

I encourage your comments, criticisms, suggestions, flames -- all kinds of feedback -- about the material presented in this column. If you disagree with something, or have something to add, please let me know.

You can either participate in a discussion forum devoted to this material or e-mail me directly at bv@artima.com.

Resources

This article was first published under the name Object Finalization and Cleanup in JavaWorld, a division of Web Publishing, Inc., May 1998.

Sidebar: Finalization and Garbage Collection
The following text, which describes the impact of finalization on the garbage collection activities of the JVM, is an excerpt from chapter 9, "Garbage Collection," of Inside the Java Virtual Machine by Bill Venners. It is reprinted here with permission from McGraw-Hill:

Finalization
In Java, an object may have a finalizer: a method that the garbage collector must run on the object prior to freeing the object. The potential existence of finalizers complicates the job of any garbage collector in a Java virtual machine.

To add a finalizer to a class, you simply declare a method in that class as follows:

// On CD-ROM in file gc/ex2/Example2.java
class Example2 {

    protected void finalize() throws Throwable {
        //...
        super.finalize();
    }
    //...
}

A garbage collector must examine all objects it has discovered to be unreferenced to see if any include a finalize() method.

Because of finalizers, a garbage collector in the Java virtual machine must perform some extra steps each time it garbage collects. First, the garbage collector must in some way detect unreferenced objects (call this "Pass I"). Then, it must examine the unreferenced objects it has detected to see if any declare a finalizer. If it has enough time, it may at this point in the garbage collection process finalize all unreferenced objects that declare finalizers.

After executing all finalizers, the garbage collector must once again detect unreferenced objects starting with the root nodes (call this "Pass II"). This step is needed because finalizers can "resurrect" unreferenced objects and make them referenced again. Finally, the garbage collector can free all objects that were found to be unreferenced in both Passes I and II.

To reduce the time it takes to free up some memory, a garbage collector can optionally insert a step between the detection of unreferenced objects that have finalizers and the running of those finalizers. Once the garbage collector has performed Pass I and found the unreferenced objects that need to be finalized, it can run a miniature trace starting not with the root nodes but with the objects waiting to be finalized. Any objects that are (1) not reachable from the root nodes (those detected during Pass I) and (2) not reachable from the objects waiting to be finalized cannot be resurrected by any finalizer. These objects can be freed immediately.

If an object with a finalizer becomes unreferenced, and its finalizer is run, the garbage collector must in some way ensure that it never runs the finalizer on that object again. If that object is resurrected by its own finalizer or some other object's finalizer and later becomes unreferenced again, the garbage collector must treat it as an object that has no finalizer.
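
The steps described above can be sketched as a toy simulation. This is not how any real JVM is implemented. ToyHeap, Obj, and every other name here are hypothetical, "finalizers" are plain Runnable objects, and "freeing" just means removing an object from a list. The optional miniature trace is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ToyHeap {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Runnable finalizer; // null when the object declares no finalizer
        boolean finalized;  // set once the finalizer has been run

        Obj(String name) {
            this.name = name;
        }
    }

    final List<Obj> roots = new ArrayList<>();
    final List<Obj> objects = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        objects.add(o);
        return o;
    }

    // Returns every object reachable from the given starting points.
    static Set<Obj> trace(List<Obj> start) {
        Set<Obj> seen = new HashSet<>();
        Deque<Obj> work = new ArrayDeque<>(start);
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (seen.add(o)) {
                work.addAll(o.refs);
            }
        }
        return seen;
    }

    // One collection cycle. Returns the names of the objects freed.
    List<String> collect() {
        // Pass I: detect unreferenced objects.
        Set<Obj> live = trace(roots);
        List<Obj> unreferenced = new ArrayList<>();
        for (Obj o : objects) {
            if (!live.contains(o)) {
                unreferenced.add(o);
            }
        }
        // Run each finalizer at most once in the object's lifetime.
        for (Obj o : unreferenced) {
            if (o.finalizer != null && !o.finalized) {
                o.finalized = true;
                o.finalizer.run();
            }
        }
        // Pass II: trace again, because a finalizer may have
        // resurrected an object by making it reachable from a root.
        Set<Obj> liveAfter = trace(roots);
        List<String> freed = new ArrayList<>();
        for (Obj o : unreferenced) {
            if (!liveAfter.contains(o)) {
                objects.remove(o);
                freed.add(o.name);
            }
        }
        return freed;
    }
}
```

If an object's finalizer adds the object to roots, the first call to collect() leaves it alone. Once it is removed from roots again, the next call frees it without running the finalizer a second time.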

As you program in Java, you must keep in mind that it is the garbage collector that runs finalizers on objects. Because it is not generally possible to predict exactly when unreferenced objects will be garbage collected, it is not possible to predict when object finalizers will be run. As mentioned in Chapter 2, "Platform Independence," you should avoid writing programs for which correctness depends upon the timely finalization of objects. For example, if a finalizer of an unreferenced object releases a resource that is needed again later by the program, the resource will not be made available until after the garbage collector has run the object finalizer. If the program needs the resource before the garbage collector has gotten around to finalizing the unreferenced object, the program is out of luck.

About the author

Bill Venners has been writing software professionally for 12 years. Based in Silicon Valley, he provides software consulting and training services under the name Artima Software Company. Over the years he has developed software for the consumer electronics, education, semiconductor, and life insurance industries. He has programmed in many languages on many platforms: assembly language on various microprocessors, C on Unix, C++ on Windows, Java on the Web. He is author of the book: Inside the Java Virtual Machine, published by McGraw-Hill.