Large JVM Memory and Garbage Collection

An Interview with Azul's Gil Tene

by Frank Sommers
May 29, 2008

Summary
Although physical RAM is inexpensive, allocating large amounts of memory to a single JVM instance has generally not been practical, says Azul Systems' CTO Gil Tene in this interview with Artima. Tene shows how recent research in garbage collection and JVM implementation helps overcome the JVM's memory barrier, resulting in new types of applications.

The ability to process large amounts of data fast has been one of the Java Virtual Machine's strengths. Because physical RAM has become inexpensive, the potential to work with large data sets in memory is an attractive option for developers and Java architects.

However, as Azul Systems' Gil Tene explains in this interview with Artima, the JVM has not been all that efficient when working with heaps larger than a few gigabytes: Garbage collecting such large memory spaces can cause application pauses that may be unacceptable in interactive systems.

In this interview from JavaOne 2008, Tene shows how recent research in garbage collection and JVM implementation techniques overcome the JVM's memory barrier, resulting in new types of applications:

Many developers believe that there are certain scaling barriers that just won't move, and that you must avoid those barriers at all costs. An example is the commonly accepted wisdom that it's better to avoid scaling individual instances [of a Java application] beyond a couple of gigabytes of memory and a few cores on the CPU. As a result, much of the effort I see today centers around massive cloud-based scaling or multi-instance scaling.

I don't think there is anything inherently wrong with lateral scale. It's definitely something you want to be doing, but doing it prematurely results in the same problems other forms of premature optimization do. Breaking something into a thousand pieces is a good thing only if you could not build that same system out of, say, three pieces. It's a really bad thing, however, if you could.

Scaling up a single instance of a Java application has not been easy until relatively recently. The chief reason has to do with how the JVM manages memory. You might think that memory is cheap, that you can start out with a handful of gigabytes, and that you can keep adding RAM to a system as your scaling needs grow. The problem is that this approach just doesn't work with most JVMs.
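
In HotSpot terms, "keep adding RAM" means raising the heap limits at startup. A minimal illustration, using the standard -Xms/-Xmx flags (the 32-gigabyte size and the class name here are hypothetical):

    java -Xms32g -Xmx32g -verbose:gc com.example.MyApp

The JVM accepts a setting like this readily enough; the trouble Tene describes begins when the collector eventually has to traverse all of that memory.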

Once you exceed a few gigabytes of actual data, your JVM is going to have to garbage collect and manage that memory periodically. When it does, the JVM's worst-case behavior comes to bear on the system. The worst-case behavior in a multi-gigabyte environment is that the system goes away for tens of seconds, or even a few minutes. During those garbage collection pauses, the system appears to be doing nothing.
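
You can watch this behavior from inside the process with the standard java.lang.management API. The following is a minimal sketch (the class name and allocation pattern are illustrative, not from the interview); for stop-the-world collectors, the reported collection time is roughly the time the application stood still:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcPauseSketch {
        public static void main(String[] args) {
            report("before");
            // Churn through roughly 12 GB of short-lived 64 KB arrays,
            // retaining only a rolling 64 MB window, to force collections.
            byte[][] retained = new byte[1024][];
            for (int i = 0; i < 200000; i++) {
                retained[i % retained.length] = new byte[64 * 1024];
            }
            report("after");
        }

        private static void report(String label) {
            for (GarbageCollectorMXBean gc
                    : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%s: %s ran %d times, %d ms total%n",
                        label, gc.getName(),
                        gc.getCollectionCount(), gc.getCollectionTime());
            }
        }
    }

Run against a multi-gigabyte heap, the gap between the "before" and "after" numbers gives a rough sense of how much wall-clock time the collector consumed.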

While that's acceptable in batch operations, it's unacceptable in anything interactive, or anything that impacts the business in real time. Imagine, for instance, a data grid cache where you're holding a lot of information and a lot of applications are mining that information. Suppose the data grid went away for two minutes. All the applications that need to access that data then go away for that time period, too.
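
The shape of the scenario Tene describes is easy to sketch. The class below is purely illustrative, not any particular product's API: a single large map that many application threads read concurrently. If the JVM holding it pauses for two minutes to collect garbage, every reader stalls with it.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // Illustrative in-memory data grid cache: one shared map,
    // potentially holding many gigabytes of records.
    public class DataGridCache {
        private final ConcurrentMap<String, byte[]> records =
                new ConcurrentHashMap<String, byte[]>();

        public void put(String key, byte[] record) {
            records.put(key, record);
        }

        // Normally a fast in-memory lookup -- unless the whole JVM
        // is stopped for a full collection, in which case every
        // caller waits out the pause.
        public byte[] get(String key) {
            return records.get(key);
        }
    }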

Business-critical applications have to be there all the time, and have to be responsive. Think of a router, for instance, and whether it'd be OK for it to go away for two minutes. Building applications with infrastructure-level quality means that you can't let JVM garbage collection pauses cause minutes, or even tens of seconds, of downtime.

The scaling problem with garbage collection went unaddressed for about ten years. Recently, our company, Azul, has addressed it, and I anticipate that others will address it, too. It's an area I would call a neglected point of scale that we've now run up against. Because it's been neglected, people have assumed that it's a wall, and that they just can't grow beyond a certain amount of memory in a single JVM instance.

If you can solve that problem, however, as we have, you can solve some very interesting problems efficiently with a complete in-memory approach: The ability to pull the entire problem into memory, rather than working against a database, a file store, or even a distributed cache. Being able to place your entire dataset in memory and operate on it at in-memory speeds makes a big difference.

We've seen that done with entire product catalogues, or collections of viewable items: A system loads that data into memory, and access to the data then becomes so fast that you can do all sorts of interesting things with it.

Similar examples come from the gaming world, where you need to keep a lot of dynamic data in memory: The size of a virtual world is often limited by how much data you can fit in one memory space. Those types of worlds allow tens of thousands, or even millions, of people to interact.

If you don't have to pay an unacceptable garbage collection penalty, in-memory applications get very interesting once you reach tens, or even hundreds, of gigabytes of memory space. We've seen customers grow to that size, starting with a few gigabytes of RAM and scaling up on our JVM because garbage collection didn't kill their performance.

What has been your experience working with large amounts of JVM memory?

Post your opinion in the discussion forum.

Resources

Azul Systems:
http://www.azulsystems.com

About the author

Frank Sommers is Editor-in-Chief of Artima Developer. He also serves as chief editor of the IEEE Technical Committee on Scalable Computing's newsletter, and is an elected member of the Jini Community's Technical Advisory Committee. Prior to joining Artima, Frank wrote the Jiniology and Web services columns for JavaWorld.