Continuations in Java

A Conversation with Geert Bevin

by Bill Venners with Frank Sommers
March 1, 2007

Summary
Continuations refer to a functional programming technique that allows you to save the current execution state of a thread, with the possibility of suspending and later resuming execution. Continuations have been incorporated into several Web application frameworks, including RIFE and WebWork. In this interview with Artima, RIFE project founder Geert Bevin discusses how continuations can simplify complex workflows, and how they are implemented in RIFE.
Geert Bevin is the creator of the RIFE Web application framework, and is founder of Uwyn, a Belgian consultancy. One of RIFE's more innovative features is an implementation of continuations in pure Java, a technique especially suitable for workflow-like applications, such as Web-based forms. Bill Venners spoke with Bevin at the 2006 JavaPolis conference about how continuations can help developers simplify complex workflows.

Continuations explained

Bill Venners: What are continuations?

Geert Bevin: A good way to think about continuations is to consider what happens when you save the state of a computer or online game. Most games allow you to halt at any time, save away the state of that game, and then start the game up again. They let you save and restart a game as many times as you like. And some games go even further: if you took a wrong turn somewhere, you can go back to a previously saved game, and take another direction.

Continuations offer a similar benefit in the context of Java Web applications. Instead of creating an external control flow as individual steps that are tied together with navigation rules or state transitions, continuations provide a much simpler model—almost as simple as if you were doing console programming in the early DOS days.

With continuations, when you need to get a response from a user, you only need to say so: the execution of the server-side process will then halt and preserve everything that is going on for that particular process. When the next request from the user comes in, the server-side process resumes exactly where it left off before. It's just like when you save and resume a game.

Continuations free you from having to worry about where you are in the process, or what next steps you can take, or which elements of the session are entered into at what point in the execution. Continuations handle that for you automatically. Since you have all those previous continuations—saved games, as it were—when the user presses his browser's back-button, continuations automatically handle the fact that a previous state corresponds to that back button click. That makes Web applications more usable and behave more as the users expect.

The way that works is that continuations let you save the method call stack as well as the variable stack. In the Web application context, you don't capture everything, of course, because everything would mean capturing the startup of the servlet container and so on. What you do capture is the execution of a particular action on the server. There is a point from which you start capturing. We call that a partial continuation, because it only captures the method call stack and variable stack from a certain point onwards.

The implementation of RIFE continuations

Bill Venners: How does that work in the implementation?

Geert Bevin: Java doesn't support this natively. The JVM does something like what I just explained internally, to be able to handle exceptions, for instance. But the VM doesn't expose any of that data to developers.

The only way to access that information currently is to re-write the bytecode that gets loaded into the VM. What we currently do in RIFE is that at class-load time we detect pause() method calls in a class. These act as markers. The class loader will detect these markers and then modify the bytecode to actually aggregate that information in parallel to how the JVM tracks the method call stack and variable stack.

The continuations infrastructure maintains a parallel stack that references the actual data. The mechanism is quite light, because you don't store the actual data twice in memory: you create references to the actual stack data in building up that parallel stack. When a continuation is to be preserved, the state of that parallel stack is preserved.

The bit of overhead that occurs is the result of the fact that you typically make a deep clone of the actual data at the next continuation. This is needed because each continuation needs to be totally independent of another one. To return to the game analogy, it would be strange if potions you drank in the last saved game were also gone from earlier games—that data needs to be saved independently of earlier instances.

The only restriction developers face in using continuations is that the data structures must be cloneable or serializable so that the infrastructure can create carbon copies of them. However, even that's not mandatory, and is only required when you want that previous-state functionality to work, which is what you typically desire in a Web application to make the back button work.
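The independence requirement described above can be illustrated with a minimal sketch. It assumes the state is serializable and copies it by writing it out and reading it back. GameState, SnapshotDemo, and deepCopy are hypothetical names chosen for this illustration; they are not part of RIFE, and the interview does not say which copying mechanism RIFE uses internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical game state used only for this illustration (not a RIFE class).
class GameState implements Serializable {
    private static final long serialVersionUID = 1L;
    final List<String> potions = new ArrayList<>();
}

class SnapshotDemo {
    // Deep-copies a serializable object by serializing it to memory and reading it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("state could not be copied", e);
        }
    }

    public static void main(String[] args) {
        GameState live = new GameState();
        live.potions.add("healing");

        GameState saved = deepCopy(live); // the "saved game"
        live.potions.clear();             // drink the potion after saving

        System.out.println(saved.potions); // prints [healing]
        System.out.println(live.potions);  // prints []
    }
}
```

Because the snapshot shares no objects with the live state, later changes cannot leak back into an earlier saved game.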

People are using continuations in other application domains as well where they don't want those previous versions around. A good example is an event-driven system where you have agents polling for state changes. When some state changes, you continue the execution. Systems like that can be heavily optimized with continuations, and people are starting to adopt RIFE continuations for event-driven systems for that reason.

In event-driven systems, when you arrive at a certain point where you need to have a trigger in order to continue, you create a continuation, and stop the execution. You don't have any polling at all. The only thing you have is an event that wakes up all the related continuations. You can then start executing until the next point when you need to have more information available from an additional event. That creates a lot less CPU overhead, and performs much better than the alternative polling-based approach.
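The wake-on-event idea can be sketched in plain Java. This is only an approximation: a real continuation captures the rest of the computation automatically, whereas here the rest of the work is handed over explicitly as a callback. ParkedWork, park, and fire are hypothetical names, not RIFE APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical registry of suspended work, keyed by the event it is waiting for.
class ParkedWork {
    private final Map<String, List<Consumer<String>>> waiting = new HashMap<>();

    // Suspend: remember what should run once the named event arrives. Nothing polls.
    void park(String eventName, Consumer<String> restOfWork) {
        waiting.computeIfAbsent(eventName, key -> new ArrayList<>()).add(restOfWork);
    }

    // Wake up everything parked on this event and return how many were resumed.
    int fire(String eventName, String payload) {
        List<Consumer<String>> resumed = waiting.remove(eventName);
        if (resumed == null) {
            return 0; // nobody was waiting for this event
        }
        for (Consumer<String> restOfWork : resumed) {
            restOfWork.accept(payload);
        }
        return resumed.size();
    }

    public static void main(String[] args) {
        ParkedWork bus = new ParkedWork();
        bus.park("price-changed", price -> System.out.println("agent A sees " + price));
        bus.park("price-changed", price -> System.out.println("agent B sees " + price));
        bus.fire("price-changed", "42");
    }
}
```

No CPU time is spent between park and fire, which is the saving over a polling loop that the paragraph above describes.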

Bill Venners: Let me see if I can paint this picture: I am using RIFE, and a request comes into the server, and a thread gets taken out of the pool and starts handling that request. At some point, as the call goes down a call stack, that call is going to hit the starting point of your partial continuation. You're going to start making a companion stack object of some kind. You have instrumented those classes, those controllers...

Geert Bevin: We call them elements, but in WebWork they call them actions.

Bill Venners: So those elements have been instrumented such that when there is a method call to process the request, and that method starts changing local variables, you just figure out what the local variables are and reference those from the parallel stack.

Geert Bevin: At the byte-code level, when the method pops variables from the stack, exactly the same operation will be performed on the parallel stack. We just add a piece of byte code that mimics what goes on exactly. Then, when a continuation has to be resumed, there's a bit of added byte-code that interacts with the state of the stack, and puts all the variables in the correct location so that they are present when the executing thread passes that point. At this point, it's as if nothing happened—execution resumed as if nothing happened.

The second problem is how to arrive at the point to resume a continuation. You want to skip over code that you don't want to execute. That's easily done in byte code, because you can maintain a tableswitch in byte code that allows you to jump to a particular location. You can create a unique label for each continuation point, jump to that label in the switch table, and then you know exactly what the stack for that label of variables was. You can restore that stack, and continue executing from that point on.
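A hand-written, source-level analogue may make the jump-table idea more concrete. RIFE performs this transformation in bytecode at class-load time; the sketch below only mimics the effect in ordinary Java, and it returns the saved frame instead of unwinding the stack. SavedFrame and ResumableSum are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical saved frame: which pause point was reached, plus the local variable at that point.
class SavedFrame {
    final int label;
    final int total;

    SavedFrame(int label, int total) {
        this.label = label;
        this.total = total;
    }
}

class ResumableSum {
    final List<String> output = new ArrayList<>();

    // Conceptual original: ask for a number, pause, add it, ask again, pause, add it, print the sum.
    // Returns the frame to resume from, or null once the flow has finished.
    SavedFrame run(SavedFrame resumeFrom, int input) {
        // Restore the label and the locals, then jump straight to the matching case.
        int label = (resumeFrom == null) ? 0 : resumeFrom.label;
        int total = (resumeFrom == null) ? 0 : resumeFrom.total;
        switch (label) {
            case 0:
                output.add("enter first number");
                return new SavedFrame(1, total);   // first pause point
            case 1:
                total += input;
                output.add("enter second number");
                return new SavedFrame(2, total);   // second pause point
            case 2:
                total += input;
                output.add("sum = " + total);
                return null;                       // flow complete
            default:
                throw new IllegalStateException("unknown label " + label);
        }
    }

    public static void main(String[] args) {
        ResumableSum flow = new ResumableSum();
        SavedFrame first = flow.run(null, 0);
        SavedFrame second = flow.run(first, 3);
        flow.run(second, 4);
        System.out.println(flow.output.get(2)); // prints sum = 7
    }
}
```

Because each SavedFrame is immutable, resuming an older frame a second time (the back-button case) starts from that frame's own values rather than from whatever happened later.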

I should mention that there are several ways to use continuations. In Web development you are mostly interested in pause(), which means you arrive at a particular point, and want to continue from where you left off.

A second possibility is call-answer continuations. With call-answer, you have a method call that calls another element or action. An example is in creating modal dialogs. Suppose you have a delete screen where you ask the user, "Do you really want to delete this entity?" You want to make sure that's what they really want to do. You pause the execution of the original intent, which was to delete that entity, and start another action. If the user presses "Yes," that will result in a true variable as a response to the original execution. The delete action will then receive the true or false answer from the call and continue execution from where it left off.

A third possibility is step-back continuations. That allows you to easily step back to a previous continuation without having to use a browser's back button. You typically use this in wizards, when you want to provide a back button on the form itself and not rely on the browser's back button. For that, we rely on the previous snapshots. That's why we have three different method calls: pause(), call(), and stepback(); and not just one yield() method.

Bill Venners: When I call one of those methods, how do you go back? Do you throw an exception?

Geert Bevin: We throw an Error because almost nobody catches errors until the point where the Web framework starts to execute that action. Control goes back to the initial execution point that catches the error, and then we know exactly where we need to continue from. The continuations manager will then be consulted to make sure that the required continuation is stored away and is available for re-use afterwards, when the next request comes in.
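The reason for choosing an Error over an Exception can be shown in a few lines. In this sketch, PauseSignal, ErrorUnwindDemo, and dispatch are hypothetical names; the interview does not give the names of RIFE's actual classes, and disabling the stack trace is a choice made for this sketch only.

```java
// Hypothetical control-flow signal (not a RIFE class).
class PauseSignal extends Error {
    private static final long serialVersionUID = 1L;
    final String continuationId;

    PauseSignal(String continuationId) {
        // No cause, no suppression, no stack trace: this is control flow, not a failure.
        super("paused at " + continuationId, null, false, false);
        this.continuationId = continuationId;
    }
}

class ErrorUnwindDemo {
    // Stand-in for a pause() call inside application code.
    static void pause(String continuationId) {
        throw new PauseSignal(continuationId);
    }

    // Application code with a typical catch block; catch (Exception) does not intercept an Error.
    static void action() {
        try {
            pause("c-1");
        } catch (Exception e) {
            throw new IllegalStateException("application code swallowed the signal", e);
        }
    }

    // Stand-in for the framework entry point that started the action.
    static String dispatch(Runnable action) {
        try {
            action.run();
            return "completed";
        } catch (PauseSignal signal) {
            return "paused:" + signal.continuationId;
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(ErrorUnwindDemo::action)); // prints paused:c-1
    }
}
```

The signal passes through the application's catch (Exception e) block untouched and is only handled at the point where the framework began executing the action.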

Each continuation has its own unique ID that has been preserved either through a hidden parameter or a query string parameter. When that ID comes in, the Web framework finds out that continuation ID, fetches the corresponding continuation data from the continuations manager, knows that it has to restore that continuation, and jumps over the code to restore the specified continuation.

Continuation IDs

Bill Venners: I'd like to understand the mechanics of how that ID comes back from the client. Is it in the URL?

Geert Bevin: You have unique IDs for each individual continuation step. They can come back in any way, even in an email message event. Suppose you have a system where a confirmation email is sent to make sure the person really is who he says he is, and a reply is expected within a certain time-frame, say within thirty minutes.

You just send that ID as a query parameter in the confirmation URL the user has to click. When that ID comes in, it is looked up in the continuation manager, which then resumes the execution of the continuation referenced by that ID. If the user doesn't access that URL within thirty minutes, the continuation corresponding to that ID will be flushed away automatically by the continuation manager and that code can't be resumed anymore.
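The ID lookup and expiry behaviour can be sketched as a small store. This is an assumption-laden illustration, not RIFE's continuation manager: ExpiringStore, save, and resume are hypothetical names, the current time is passed in explicitly to keep the example deterministic, and expired entries are evicted lazily on lookup rather than by a background task.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical store of saved states keyed by a unique ID, with a fixed lifetime.
class ExpiringStore<T> {
    private static final class Entry<V> {
        final V state;
        final long expiresAtMillis;

        Entry(V state, long expiresAtMillis) {
            this.state = state;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry<T>> entries = new ConcurrentHashMap<>();
    private final long lifetimeMillis;

    ExpiringStore(long lifetimeMillis) {
        this.lifetimeMillis = lifetimeMillis;
    }

    // Stores the state and returns the ID to embed in a URL or hidden form field.
    String save(T state, long nowMillis) {
        String id = UUID.randomUUID().toString();
        entries.put(id, new Entry<>(state, nowMillis + lifetimeMillis));
        return id;
    }

    // Returns the saved state, or null if the ID is unknown or has expired.
    T resume(String id, long nowMillis) {
        Entry<T> entry = entries.get(id);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAtMillis) {
            entries.remove(id); // expired: this state can no longer be resumed
            return null;
        }
        return entry.state;
    }

    public static void main(String[] args) {
        ExpiringStore<String> store = new ExpiringStore<>(30L * 60 * 1000); // thirty minutes
        String id = store.save("awaiting email confirmation", 0L);
        System.out.println(store.resume(id, 10L * 60 * 1000)); // prints awaiting email confirmation
        System.out.println(store.resume(id, 31L * 60 * 1000)); // prints null
    }
}
```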

I want to stress that in the continuation programming style, everything is a continuation. You've got snapshots at each individual step of execution. By putting in those method calls, such as pause(), you indicate one specific location you're interested in.

To make that practical, the code that leads up to that method call has access to the unique ID of the upcoming snapshot. Because you know what that unique ID is going to be, you can already add that ID to a form as a hidden parameter. If that ID was not available in advance of the start of the continuation, you would not be able to add that ID to a form: when execution hits that pause() method call, it stops. Everything you need to show the user must already have been sent through the output stream of the servlet container.

Bill Venners: Is it the continuation manager that manages those IDs completely separate from the session? If so, how do I replicate the continuation manager's state, if I want to have stateful fail-over?

Geert Bevin: Currently, we're working on one solution, and that's using Terracotta DSO, which is open-source now. I'm collaborating with Jonas Bonér to make RIFE continuations work seamlessly with Terracotta DSO. We're still tackling a problem that's not related to continuations per se, but to RIFE. As a framework, RIFE is extremely dynamic. For example our template classes are usually created on the fly. Terracotta has to have access to all the class files to be able to replicate the data. So we're still figuring out a mechanism for how we can do that with run-time generated classes. We're expecting to finish that work very soon.

The usefulness of continuations

Bill Venners: The general continuations selling point is that continuations provide a simpler programming model to make a loop: If there are 28 steps to a process, and you want to be able to go back and forth between those steps, continuations promise to make that easier. Without them, you have to have a state transition diagram, and for each of those 28 steps possibly a controller, too. But how often are loops really that complicated? How useful are continuations in practice?

Geert Bevin: It really depends on the kind of Web sites you're building. For a public Web site with pages that people just read most of the time, you don't need continuations. There is no point. But if you've got what I call islands of functionality (a transactional process a user has to go through, one that can complete only when the user has finished all the steps), continuations make that extremely easy to program. Examples are a checkout process, a questionnaire, wizards, stuff like that.

With continuations, you just program the process steps in Java, use the native Java control statements, and then put the pause() method calls whenever you need interaction from the user.

The additional benefit, which sort of blows people away when I demonstrate it, is that you can debug your flows with a regular Java debugger. You can set your breakpoints within your implementation and step through to see exactly what happens. You can set watches, breakpoints, conditional breakpoints, logging breakpoints; everything just works. When you have very complex flows, you can benefit from all the tools you're used to using to debug the Web application flow.

Continuations also provide a natural way to solve the double-submit problem: In transactional processes, such as a check-out process, you don't want a submission—an order, for instance—to pass through a second time.

Continuations solve that easily. You typically have a continuation tree. Most of the time that will just be a linear succession. But if people press the back button and follow another trail, you have a tree. At the end of the process, you know which tree you're in. When that transactional process has finished and the order has been placed, you just remove the tree.

When a person presses the submit button again, or pushes the back button, or does anything with the previous state, that will just not work, because you can't resume a continuation that's no longer there. Then you can show a message such as, "Please start a new order." There is no longer a risk of re-doing that processing. It's supposed to be transactional and occur only once.

Bill Venners: Is there a method call in the API to say "Start a tree?" If I start a tree at some point, and they back out past the start of that tree, and then go forward again so they are not part of that same tree, how do I know what to delete?

Geert Bevin: As long as you're resuming a previous continuation, our implementation keeps track of all the relationships. If you don't resume from a previous continuation, then you just start at the beginning: that will be a new checkout process, because the user will not be in the middle of something.

For each execution thread, there is a thread-local variable for the active continuation. You can just walk through data structures to figure out all the related continuations and their siblings. In the case of the check-out process, you walk through and delete the whole tree at the same time. There is a static removeContextTree() method on what we call the ContinuationContext class.
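The tree walk can be sketched as follows. This is not the implementation of removeContextTree(), whose internals the interview does not describe; ContinuationTree, add, and removeTreeContaining are hypothetical names, and the sketch tracks only IDs and parent links.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bookkeeping of continuation IDs and their parent/child relationships.
class ContinuationTree {
    private final Map<String, String> parentOf = new HashMap<>();         // id -> parent id, null for a root
    private final Map<String, List<String>> childrenOf = new HashMap<>(); // id -> child ids

    // Registers a continuation; pass a null parentId when a new process starts.
    void add(String id, String parentId) {
        if (parentId != null && !parentOf.containsKey(parentId)) {
            throw new IllegalArgumentException("unknown parent " + parentId);
        }
        parentOf.put(id, parentId);
        childrenOf.put(id, new ArrayList<>());
        if (parentId != null) {
            childrenOf.get(parentId).add(id);
        }
    }

    boolean contains(String id) {
        return parentOf.containsKey(id);
    }

    // Walks up to the root of the tree holding this ID, then deletes every node below that root.
    void removeTreeContaining(String id) {
        if (!contains(id)) {
            return;
        }
        String root = id;
        while (parentOf.get(root) != null) {
            root = parentOf.get(root);
        }
        removeSubtree(root);
    }

    private void removeSubtree(String id) {
        for (String child : childrenOf.remove(id)) {
            removeSubtree(child);
        }
        parentOf.remove(id);
    }

    public static void main(String[] args) {
        ContinuationTree tree = new ContinuationTree();
        tree.add("cart", null);
        tree.add("address", "cart");
        tree.add("payment", "address");
        tree.add("payment-after-back", "address"); // branch created via the back button
        tree.removeTreeContaining("payment");      // order placed: drop the whole tree
        System.out.println(tree.contains("payment-after-back")); // prints false
    }
}
```

Once the tree is gone, a repeated submit or a back-button click carries an ID that no longer resolves, which is how the double-submit case described above is rejected.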

Continuations in Jetty

Bill Venners: Jetty provides some kind of continuations framework, too. How does that compare with RIFE's?

Geert Bevin: Jetty does what I'd call request-thread parking. Instead of tying up a socket on your servlet container for a long-lived operation—an asynchronous operation, for example—Jetty frees that socket up with some kind of a continuation, and re-activates that thread again when the result arrives.

Jetty's mechanism is similar to the event-based programming model I was talking about earlier. The benefit is that you free up resources while you're doing the calculation. What Jetty does can be implemented with continuations, but it is not real continuations. Jetty doesn't need to capture the state of a continuation; it doesn't need to be in the middle of a method and capture everything that surrounds that execution. Jetty just has to resume another servlet execution thread, and know where to pick up execution.

Bill Venners: You mean the thread is in a wait state, and doesn't get released?

Geert Bevin: I can't remember exactly what the Jetty code does, but I know that it does just what is needed for that use-case. I know that Joe Walker has been trying to use that to add continuations to DWR and has had problems because it really didn't provide everything that was needed.

Standardizing continuations

Bill Venners: In your talk, you mentioned a possible JSR proposal, a way to standardize continuations. Can you tell us more about that?

Geert Bevin: There are already several implementations and approaches to continuations, depending on the use-case. There is an Apache Commons project, JavaFlow. There is JauVM that layers another VM on top of the JVM to support tail-calls and continuations. Given that there are all these efforts, it's important to standardize the API from a library and framework developer's standpoint.

That means standardizing on an interface to capture that local variable stack and execution stack. Since the JVM doesn't implement that yet, any implementation underneath is very targeted to your problem domain. Longer-term, the JVM would ideally open that access up.

The approach is to create a JSR to provide continuations functionality with a common API that will initially be implemented through simulated layers, as we're doing it now with bytecode re-writing in RIFE. When developers show enough interest in it, we can create a push to add to the JVM what is needed. To propose additions to the JVM now would be premature.

At all the conference talks I've been giving, I have a lot of developers come up to me, saying that they want to use continuations, but can't, because the current implementations require an alternate library that does bytecode re-writing, something they would never be able to get into their projects. Having a common API that has been standardized through a JSR would make it much easier for them to use continuations as part of their working tool set.

I started to write a draft for the JSR proposal in November. That draft is available on a Wiki at http://rifers.org/continuations/jsr. There are already comments that resulted in changes to the Wiki text, and a lot of people in the Web development space have committed to be part of the initial expert group, if the JSR is accepted. Examples include Patrick Lightbody of WebWork, Joe Walker from DWR, and Alexandru Popescu, who created the software behind the InfoQ Web site.

Trying out RIFE continuations

Bill Venners: So if I want to try this out, what do I do and where do I go?

Geert Bevin: The RIFE project Web site has more information, especially in the RIFE examples section. Two of those examples are especially dedicated to continuations [Editor's note: examples 4 and 8]. There are also videos in the theater section of the site that provide more examples.

In the coming weeks, I will be updating the standalone RIFE Continuations library to have the same capabilities as the framework itself provides. Developers will then be able to try it out in situations that don't involve Web application development.

Resources

The RIFE project home page:
http://www.rifers.org/

About the authors

Bill Venners is president of Artima Software, Inc. and editor-in-chief of Artima Developer. He is author of the book, Inside the Java Virtual Machine, a programmer-oriented survey of the Java platform's architecture and internals. His popular columns in JavaWorld magazine covered Java internals, object-oriented design, and Jini. Bill has been active in the Jini Community since its inception. He led the Jini Community's ServiceUI project, whose ServiceUI API became the de facto standard way to associate user interfaces to Jini services. Bill also serves as an elected member of the Jini Community's initial Technical Oversight Committee (TOC), and in this role helped to define the governance process for the community.

Frank Sommers is president of Autospaces, Inc.