The Hotspot Virtual Machine

How Hotspot Can Improve Java Program Performance and Designs

by Bill Venners
May 15, 1998

First published in Developer.com, May 1998

Summary
This article describes the "popular lore" that aims to improve the performance of Java programs in ways that reduce the flexibility of the code, and describes Sun's Hotspot JVM in technical detail. The article shows how Hotspot attempts to eliminate the performance bottlenecks that gave rise to the "popular lore" in the first place.

According to Sun Microsystems, the Hotspot virtual machine, Sun's next-generation Java virtual machine (JVM), promises to make Java "as fast as C++." Specifically, Sun says that a platform-independent Java program delivered as bytecodes in class files will run on Hotspot at speeds on par with an equivalent C++ program compiled to a native executable. If Sun is able to make good on this claim, the current performance penalty of delivering or using a Java program will go away, at least as compared to C++. But Hotspot has one other implication important to developers: Hotspot promises to alleviate concerns that creating good object-oriented, thread-safe designs and good-quality, maintainable code will degrade performance.

To date, one of the major tradeoffs of selecting Java over other languages has been performance. The original Java virtual machines, which simply interpreted bytecodes one at a time, were clocked at 30 to 50 times slower than natively compiled C++. Just-in-time (JIT) compilers have narrowed the performance gap to 3 to 10 times slower than C++, but that performance gap is still big enough to eliminate Java as the language choice for certain applications.

Furthermore, the performance situation may continue to have repercussions even after Java is selected for a particular application. Due to Java's performance track record, programmers may be tempted to design their Java programs in ways they think will boost performance at the expense of maintainability. The performance characteristics of past and present incarnations of the JVM have given rise to an unofficial lore in the Java community concerning how to design and implement a Java program to give it the best performance. And most of this lore goes against the grain of creating good object-oriented, thread-safe Java code.

The right way to optimize

Of course, the basic principles of optimizing any kind of software apply to Java. Java programmers who live by these principles have little cause to choose performance tweaks over good designs, except in rare cases. Unfortunately, not every Java programmer subscribes to these principles.

The fundamental principle of optimization is: Don't optimize until you know you have a problem. As Donald Knuth once said, "Premature optimization is the root of all evil." In general, you should forget about optimization and just create good quality designs and clear code. After you get your well-designed program working, if you then find that its performance is lacking, that is the time to optimize.

Another basic principle is: Measure the program before and after your optimization efforts. If you find that a particular effort did not make a significant improvement in the program's performance, then revert to the original, clear code.
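The measuring step can be as simple as timing the same task before and after a change. The following is a minimal sketch, not a substitute for a real profiler. The Stopwatch class and its bestOfNanos method are hypothetical names made up for this illustration; System.nanoTime is the standard library call, and the code assumes a present-day Java compiler.

```java
// Hypothetical helper for illustration; not part of any library.
final class Stopwatch {
    /**
     * Runs the task the given number of times and returns the shortest
     * elapsed wall-clock time of a single run, in nanoseconds.
     */
    static long bestOfNanos(int runs, Runnable task) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            long elapsed = System.nanoTime() - start;
            if (elapsed < best) {
                best = elapsed; // remember the fastest run seen so far
            }
        }
        return best;
    }
}
```

Time the clear version and the tuned version with the same inputs, and keep the tuned version only if the difference is significant. Taking the best of several runs is just one simple way to damp noise.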

A third principle takes into account that most programs spend 80 to 90 percent of their time executing 10 to 20 percent of the code: You should profile the program to isolate the code that really matters to performance (that 10 to 20 percent), and just focus your optimization efforts there.

Once you isolate the time-critical areas of your program, you should first try to devise a better algorithm, use APIs in a smarter way, or use standard code optimization techniques such as strength reduction, common sub-expression elimination, code motion, and loop unrolling. Only as a last resort should you sacrifice good object-oriented, thread-safe design and maintainable code in the name of performance.
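Two of the standard techniques named above, code motion and strength reduction, can be sketched by hand as follows. The LoopTweaks class and the arithmetic it performs are invented for this illustration. Both methods return the same result, so the tuned version can be checked against the plain one.

```java
// Hypothetical example; the computation itself is arbitrary.
final class LoopTweaks {
    // Plain version: multiplies by i and recomputes scale * offset
    // on every pass through the loop.
    static long plain(int n, int scale, int offset) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * scale + (long) scale * offset;
        }
        return sum;
    }

    // Tuned version of the same computation.
    static long tuned(int n, int scale, int offset) {
        // Code motion: the loop-invariant product is computed once, outside the loop.
        long invariant = (long) scale * offset;
        // Strength reduction: i * scale is replaced by a running addition.
        long term = 0;
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += term + invariant;
            term += scale;
        }
        return sum;
    }
}
```

Note that the tuned version stays local to one method. It changes nothing about the design of the surrounding program, which is why such techniques come before any sacrifice of design.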

The optimization lore

If, despite the pleadings of your peers and the advice of software heavyweights such as Donald Knuth, you decide to design your Java program for performance rather than maintainability, here's the popular lore that explains how to do it:

  1. make methods final or static wherever possible
  2. prefer large classes over small classes
  3. prefer large methods over small methods
  4. avoid interfaces
  5. avoid creating lots of short-lived objects
  6. avoid synchronization

The first three guidelines arise from the expense of dynamic method dispatch. Because dynamic method dispatch is expensive in current JVMs, the lore says, designing programs that have fewer dynamic method invocations at run time should boost performance. Making methods static, which requires no dynamic dispatch, or final, which can enable the JVM to optimize out the dynamic dispatch, should help performance.

In addition, simply favoring large methods over small ones can help reduce the number of dynamic method dispatches, because small methods would need to invoke each other more often. Likewise, preferring large classes over small ones could reduce the number of dynamic method dispatches, because objects communicate and cooperate with each other by invoking each other's methods.
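To make the first three guidelines concrete, here is a small sketch that contrasts a factored design with the style the lore recommends. The Invoice and InvoiceLore classes are hypothetical, invented for this illustration. Both compute the same total; the sketch makes no claim about which one runs faster on any particular JVM.

```java
// Factored design: small, cohesive methods that a subclass could override.
class Invoice {
    private final long[] pricesInCents;
    private final int taxPercent;

    Invoice(long[] pricesInCents, int taxPercent) {
        this.pricesInCents = pricesInCents.clone();
        this.taxPercent = taxPercent;
    }

    long subtotal() {
        long sum = 0;
        for (long price : pricesInCents) {
            sum += price;
        }
        return sum;
    }

    long tax() {
        return subtotal() * taxPercent / 100;
    }

    long total() {
        return subtotal() + tax();
    }
}

// Lore style: one large static method, no objects cooperating,
// and no dynamically dispatched calls.
final class InvoiceLore {
    static long total(long[] pricesInCents, int taxPercent) {
        long sum = 0;
        for (long price : pricesInCents) {
            sum += price;
        }
        return sum + sum * taxPercent / 100;
    }
}
```

The lore version saves a few method invocations, but the tax rule is now buried inside total and cannot be reused or overridden on its own.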

The fourth guideline arises out of the relative expense of interface method invocations. Because interface method invocation is a lot slower than regular dynamic method dispatch in current JVMs, the lore says that avoiding the use of interfaces should help boost performance.

The fifth guideline attempts to address the cost of object allocation in current JVMs. Each object allocation exacts some time cost, and eventually objects may need to be garbage collected. In light of these costs, the lore advises you to avoid creating a lot of objects -- especially short-lived, temporary objects.
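The fifth guideline can be illustrated with a sketch like the one below. The Vec and VecSum classes are hypothetical names invented for this example. The clear version creates one short-lived temporary object per step; the lore version avoids the temporaries by accumulating in primitives.

```java
// Small immutable value class; every plus() call allocates a new object.
final class Vec {
    final double x;
    final double y;

    Vec(double x, double y) {
        this.x = x;
        this.y = y;
    }

    Vec plus(Vec other) {
        return new Vec(x + other.x, y + other.y);
    }
}

final class VecSum {
    // Clear version: one temporary Vec per element of the array.
    static Vec clear(Vec[] points) {
        Vec acc = new Vec(0, 0);
        for (Vec p : points) {
            acc = acc.plus(p);
        }
        return acc;
    }

    // Lore version: no temporaries, but the addition logic of Vec
    // is duplicated here instead of being reused.
    static Vec lore(Vec[] points) {
        double x = 0;
        double y = 0;
        for (Vec p : points) {
            x += p.x;
            y += p.y;
        }
        return new Vec(x, y);
    }
}
```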

The sixth guideline arises out of the cost of invoking a synchronized method as compared to a non-synchronized method. In current JVMs, synchronization is very expensive. Thus, the lore says you should avoid using synchronization altogether.
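What synchronization buys is easiest to see with a shared counter. In the following sketch the SafeCounter class and its countWithThreads helper are hypothetical, written only for this illustration. Because increment is synchronized, no updates are lost when several threads call it at once. Removing the keyword, as the lore suggests, would remove that guarantee.

```java
// Hypothetical thread-safe counter for illustration.
final class SafeCounter {
    private int count;

    // Only one thread at a time can execute a synchronized method
    // on the same SafeCounter object.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts the given number of threads, each incrementing one shared
    // counter perThread times, waits for all of them, and returns the count.
    static int countWithThreads(int threads, int perThread) {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```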

What you get

Although you may see some performance improvements by following the guidelines listed above, you will almost certainly make your code harder to read and maintain. Is the tradeoff worth it?

Well, one thing to keep in mind is that Sun's next-generation virtual machine is aimed directly at the problems the lore's guidelines are trying to address. Hotspot tackles each performance problem of current JVMs. In the process, it should eliminate the "performance excuse" for creating poor designs and writing bad code.

Hotspot and adaptive optimization

The Hotspot VM is a collection of techniques, the most significant of which is called "adaptive optimization." In fact, this technique gives Hotspot its name.

The original JVMs interpreted bytecodes one at a time. Second-generation JVMs added a JIT compiler, which compiles each method to native code upon first execution, then executes the native code. Thereafter, whenever the method is called, the native code is executed. The adaptive optimization technique used by Hotspot is a hybrid approach, one that combines bytecode interpretation and run-time compilation to native code.

The Hotspot VM begins by interpreting all code, but it monitors the execution of that code. As mentioned earlier, most programs spend 80 to 90 percent of their time executing 10 to 20 percent of the code. By monitoring the program execution, the Hotspot VM can figure out which methods represent the program's "hot spot" -- the 10 to 20 percent of the code that is executed 80 to 90 percent of the time.

When the Hotspot VM decides that a particular method is in the hot spot, it fires off a background thread that compiles that method's bytecodes to native code and heavily optimizes the result. Meanwhile, the program can still execute that method by interpreting its bytecodes. Because the program isn't held up and because the Hotspot VM is only compiling and optimizing the "hot spot" (perhaps 10 to 20 percent of the code), the Hotspot VM has more time than a traditional JIT to perform optimizations.
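The control flow just described can be modeled with a toy sketch. This is not how Hotspot is implemented; it only mimics the idea of counting invocations, handing a hot method to a background thread, and continuing on the slow path until the fast version is ready. The AdaptiveMethod class, the HOT_THRESHOLD value, and the use of two interchangeable functions in place of an interpreter and a compiler are all assumptions made for this illustration.

```java
// Toy model only: "compiling" here just means switching from one
// implementation to another that returns the same results.
final class AdaptiveMethod {
    static final int HOT_THRESHOLD = 3; // assumed value, tiny so the demo triggers quickly

    private final java.util.function.IntUnaryOperator interpreted; // slow path
    private final java.util.function.IntUnaryOperator optimized;   // stands in for native code
    private final java.util.concurrent.atomic.AtomicInteger calls =
            new java.util.concurrent.atomic.AtomicInteger();
    private volatile java.util.function.IntUnaryOperator compiled; // null until ready
    private volatile Thread compilerThread;

    AdaptiveMethod(java.util.function.IntUnaryOperator interpreted,
                   java.util.function.IntUnaryOperator optimized) {
        this.interpreted = interpreted;
        this.optimized = optimized;
    }

    int invoke(int arg) {
        java.util.function.IntUnaryOperator fast = compiled;
        if (fast != null) {
            return fast.applyAsInt(arg); // the fast version is installed: use it
        }
        if (calls.incrementAndGet() == HOT_THRESHOLD) {
            // The method just became "hot": do the expensive work in the background.
            Thread t = new Thread(() -> compiled = optimized);
            t.setDaemon(true);
            compilerThread = t;
            t.start();
        }
        return interpreted.applyAsInt(arg); // meanwhile, keep using the slow path
    }

    boolean isCompiled() {
        return compiled != null;
    }

    /** Blocks until the background "compile" has finished, if one was started. */
    void awaitCompilation() {
        Thread t = compilerThread;
        if (t == null) {
            return;
        }
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Callers never wait for the background work: every call to invoke returns a result immediately, from whichever version is available at that moment.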

The adaptive optimization approach yields a program in which the code that is executed 80 to 90 percent of the time is native code as heavily optimized as statically compiled C++, with a memory footprint not much bigger than a fully interpreted Java program. In other words, fast. The Hotspot VM keeps the old bytecodes around in case a method moves out of the hot spot. (The hot spot may move somewhat as the program executes.) If a method moves out of the hot spot, the VM can discard the compiled code and revert to interpreting that method's bytecodes.

As you may have noticed, Hotspot's approach to making Java programs run fast is similar to the approach programmers should take to improve a program's performance. Hotspot, unlike a regular JIT compiling VM, doesn't do "premature optimization." Hotspot begins by interpreting bytecodes. As the program runs, Hotspot "profiles" the program to find the program's "hot spot," that 10 to 20 percent of the code that gets executed 80 to 90 percent of the time. And like a good programmer, the Hotspot VM just focuses its optimization efforts on that time-critical code.

Adaptive inlining

But there is a bit more to the adaptive optimization story. The adaptive optimization approach taken by Hotspot is tuned for the run-time characteristics of Java programs -- in particular, of "well-designed" Java programs.

According to David Griswold, Hotspot manager at JavaSoft, "Java is a lot more object-oriented than C++. You can measure that; you can look at the rates of method invocations, dynamic dispatches, and such things. And the rates [for Java] are much higher than they are in C++." Now this high rate of method invocations and dynamic dispatches is especially true in a well-designed Java program, because one aspect of a well-designed Java program is highly factored, fine-grained design -- in other words, lots of compact, cohesive methods and compact, cohesive objects.

This run-time characteristic of Java programs, the high frequency of method invocations and dynamic dispatches, affects performance in two ways. First, there is an overhead associated with each dynamic dispatch. Second, and more significantly, method invocations reduce the effectiveness of compiler optimization.

Method invocations reduce the effectiveness of optimizers because optimizers don't perform well across method invocation boundaries. As a result, optimizers end up focusing on the code between method invocations. And the greater the method invocation frequency, the less code the optimizer has to work with between method invocations, and the less effective the optimization becomes.

The standard solution to this problem is inlining -- the copying of an invoked method's body directly into the body of the invoking method. Inlining eliminates method calls and gives the optimizer more code to work with. It makes possible more effective optimization at the cost of increasing the run-time memory footprint of the program.
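The effect of inlining can be shown by writing out by hand what an optimizer does automatically. In this sketch the Rect class is hypothetical; area is the method as a programmer would write it, and areaInlined shows roughly what remains once the bodies of width and height have been copied into it.

```java
// Hypothetical class for illustration.
final class Rect {
    private final int w;
    private final int h;

    Rect(int w, int h) {
        this.w = w;
        this.h = h;
    }

    int width() {
        return w;
    }

    int height() {
        return h;
    }

    // As written: two method invocations per call.
    int area() {
        return width() * height();
    }

    // After inlining: the bodies of width() and height() are copied in,
    // so no calls remain and the optimizer sees a single expression.
    int areaInlined() {
        return w * h;
    }
}
```

The programmer keeps writing area; producing something like areaInlined is the optimizer's job.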

The trouble is that inlining is harder with object-oriented languages, such as Java and C++, than with non-object-oriented languages, such as C, because object-oriented languages use dynamic dispatching. And the problem is worse in Java than in C++, because Java has a greater call frequency and a greater percentage of dynamic dispatches than C++.

A regular optimizing static compiler for a C program can inline straightforwardly because there is one function implementation for each function call. The trouble with doing inlining with object-oriented languages is that dynamic method dispatch means there may be multiple function (or method) implementations for any given function call. In other words, the JVM may have many different implementations of a method to choose from at run time, based on the class of the object on which the method is being invoked.

One solution to the problem of inlining a dynamically dispatched method call is to just inline all of the method implementations that may get selected at run-time. The trouble with this solution is that in cases where there are a lot of method implementations, you get an exploding size problem.

One advantage Hotspot's adaptive optimization approach has over static compilation is that, because it is happening at run time, it can use information not available to a static compiler. For example, even though there may be 30 possible implementations that may get called for a particular method invocation, at run time perhaps only two of them are ever called. The Hotspot approach enables only those two to be inlined, thereby reducing the exploding size problem.
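The idea of inlining only the implementations that actually show up at run time can be sketched by hand. The code below is an illustration, not the code Hotspot generates. The Shape, Circle, Square, Triangle, and AreaSum names are hypothetical, and the sketch assumes that profiling has shown Circle and Square to be the two classes seen at this call site. Their method bodies are inlined behind a class check; any other class falls back to an ordinary dynamically dispatched call.

```java
interface Shape {
    double area();
}

final class Circle implements Shape {
    final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class Triangle implements Shape {
    final double base;
    final double height;
    Triangle(double base, double height) { this.base = base; this.height = height; }
    public double area() { return 0.5 * base * height; }
}

final class AreaSum {
    // As written by the programmer: one dynamically dispatched call per element.
    static double dispatched(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }

    // Hand-written stand-in for a version specialized on an assumed profile.
    static double specialized(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            if (s.getClass() == Circle.class) {
                Circle c = (Circle) s;
                sum += Math.PI * c.r * c.r;   // body of Circle.area() inlined
            } else if (s.getClass() == Square.class) {
                Square q = (Square) s;
                sum += q.side * q.side;       // body of Square.area() inlined
            } else {
                sum += s.area();              // any other class: dynamic dispatch
            }
        }
        return sum;
    }
}
```

Only two method bodies are copied into the loop, no matter how many Shape implementations exist, which is what keeps the code size from exploding.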

Note:

As used here, static compilation means compilation on the developer's computer, resulting in a native executable. It contrasts with dynamic compilation, which is done on the user's computer at run-time. A traditional C++ compiler is an example of static compilation. JITs and Hotspot are examples of dynamic compilation.

Because more information is available at run-time, Hotspot's adaptive optimization approach can yield better optimization than is possible with a static compiler. This means that a Java program running on Hotspot could potentially be faster than the same Java program compiled to a native executable. Hotspot's adaptive optimization technique enables Java programs that have high method invocation frequency (lots of focused, cohesive objects and methods) to perform well, even though they are being delivered as bytecodes in class files.

Other improvements

In addition to the adaptive optimization technique, the Hotspot VM includes several other improvements that address performance bottlenecks in current VMs. For example, Sun claims that interface method invocations in Hotspot will not have a performance disadvantage compared to other method invocations. Improved thread synchronization will make invoking a synchronized method much faster, only a little more expensive than invoking a non-synchronized method. And lastly, Hotspot uses a generational garbage collection algorithm, which reduces the cost of collecting large numbers of short-lived objects.

Conclusion

The Hotspot Java virtual machine promises to bring Java program performance on par with that of natively compiled C++. In fact, because Hotspot's optimizer can use run-time information not available to static compilers, it could eventually push Java performance past that of statically compiled C++. In this world, the performance cost of delivering a Java program in bytecodes goes away, as does any performance penalty of doing good, object-oriented designs. This is indeed a rosy picture.

It is important to note, however, that the key word here is "promise." Sun says Hotspot will be released by the end of 1998. Although the combination of techniques that Sun is assembling under the name "Hotspot" sounds very promising, until the VM is released to the general public, we won't know how fast it really is.

For the time being, however, we can all brandish the promise of Hotspot as a weapon in the fight against the cold-blooded sacrificing of good object-oriented designs in the name of performance. Resist the temptation: Don't do premature optimization!

A request for reader participation
I encourage your comments, criticisms, suggestions, flames -- all kinds of feedback -- about the material presented in this column. If you disagree with something, or have something to add, please let me know.

You can either participate in a discussion forum devoted to this material or e-mail me directly at bv@artima.com.

Resources

This article was first published under the name The Hotspot Virtual Machine in Developer.com, May 1998.

About the author

Bill Venners has been writing software professionally for 12 years. Based in Silicon Valley, he provides software consulting and training services under the name Artima Software Company. Over the years he has developed software for the consumer electronics, education, semiconductor, and life insurance industries. He has programmed in many languages on many platforms: assembly language on various microprocessors, C on Unix, C++ on Windows, Java on the Web. He is the author of the book Inside the Java Virtual Machine, published by McGraw-Hill.