The Artima Developer Community

Leading-Edge Java
Distributed Web Continuations with RIFE and Terracotta
by Jonas Bonér and Geert Bevin
August 8, 2007

<<  Page 2 of 3  >>


Introducing Terracotta

Terracotta is appliance-like infrastructure software that stores and shares object-oriented data between application servers. Terracotta is open source and can be freely downloaded from the project website, http://www.terracotta.org.

Terracotta allows multiple JVMs to communicate with each other as if they were a single JVM by transparently extending the memory and locking semantics of Java - as specified in the Java Language Specification - to encompass the heaps and threads of multiple JVMs. Java's natural semantics - for example, object reference identity, thread coordination, and garbage collection - are maintained correctly in a distributed environment. Terracotta also functions as a virtual heap in which objects can be paged in and out on demand, meaning that the size of the clustered data is not constrained by the physical heap size (RAM).
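To make this concrete, here is a minimal single-JVM sketch of the kind of plain Java code Terracotta can cluster; the class and field names are our own, not from the article. The same code, with the map declared as a Terracotta root and its synchronized blocks configured as autolocks, would share its state and coordinate its threads across JVMs with no API calls and no Serializable:

```java
import java.util.HashMap;
import java.util.Map;

// A plain POJO counter. Under Terracotta, an instance of this class could be
// declared a clustered "root" in the XML configuration, at which point the
// ordinary JLS locking below would coordinate threads on every node.
public class SharedCounter {
    // In a clustered deployment this field would be configured as a
    // Terracotta root (the field name here is hypothetical).
    private final Map<String, Integer> counts = new HashMap<String, Integer>();

    public void increment(String key) {
        // Ordinary intrinsic lock; Terracotta would extend this monitor's
        // semantics across the cluster via an autolock declaration.
        synchronized (counts) {
            Integer current = counts.get(key);
            counts.put(key, current == null ? 1 : current + 1);
        }
    }

    public int get(String key) {
        synchronized (counts) {
            Integer current = counts.get(key);
            return current == null ? 0 : current;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final SharedCounter counter = new SharedCounter();
        // Two threads standing in for threads on two clustered JVMs.
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < 1000; i++) counter.increment("hits");
            }
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(counter.get("hits")); // prints 2000
    }
}
```

Note that the class implements no interface and makes no clustering calls; that is the point of "true POJO clustering."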

Terracotta provides true POJO clustering and distribution transparency. It does not require any classes to implement Serializable or any other interface. Rather, the heap sharing and thread coordination are "plugged directly into the JMM (Java Memory Model)." As a result, Terracotta knows exactly what data changed in the application, and replicates only the object field data that has changed (object deltas), only to the nodes that need it, and only when those nodes need it.

Terracotta effectively uses fine-grained, field-level replication instead of Java Serialization's coarse-grained, full object graph replication. This is all enabled by transparent instrumentation of the target application at class load time, based on a declarative XML configuration. True POJO clustering ensures that minimal to zero changes to existing code are required, and that even those changes are made at the source-code level.
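As a rough sketch of what that declarative XML configuration looks like, a Terracotta DSO config declares which fields are clustered roots, which methods auto-acquire clustered locks, and which classes get instrumented at load time. The class, field, and method names below are hypothetical, matching the counter-style example above:

```xml
<tc:tc-config xmlns:tc="http://www.terracotta.org/config">
  <application>
    <dso>
      <!-- Make this field a clustered root, shared by all JVMs. -->
      <roots>
        <root>
          <field-name>com.example.SharedCounter.counts</field-name>
        </root>
      </roots>
      <!-- Extend the intrinsic locks in these methods across the cluster. -->
      <locks>
        <autolock>
          <method-expression>* com.example.SharedCounter.*(..)</method-expression>
          <lock-level>write</lock-level>
        </autolock>
      </locks>
      <!-- Instrument these classes transparently at class load time. -->
      <instrumented-classes>
        <include>
          <class-expression>com.example..*</class-expression>
        </include>
      </instrumented-classes>
    </dso>
  </application>
</tc:tc-config>
```

The application code itself stays untouched; only this configuration tells Terracotta what to cluster.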

Terracotta's Architecture

Terracotta uses a client-server architecture: there is a central Terracotta server to which any number of instances of your application may connect. The client-server interactions are injected transparently at runtime into your application.

The Terracotta server performs two basic functions:

  1. It manages the storage of object data, and
  2. It serves as a traffic cop between threads on all of the client JVMs, coordinating object changes and thread communication.

In its capacity as the object store, the Terracotta server stores only object data and IDs. This capacity is itself clusterable, either through the use of a shared filesystem or a network-based failover mechanism.

In its capacity as a traffic cop, the Terracotta server keeps track of things like which thread in which client JVM is holding which lock, which nodes are referencing which part of the shared object state, and which objects have not been used for a specific time period and can, therefore, be paged out. Keeping all this knowledge in a single place is very valuable and allows for some very interesting optimizations. It is possible to scale out the Terracotta server while keeping the concerns of clustered heap management and thread coordination separate from the business concerns of your application. Terracotta also avoids the infamous "split-brain" problem of many other distributed computing architectures, in which there is no clear owner of object data and it is, therefore, easy to accidentally wipe out an object entirely, or for multiple nodes to believe they own an object's data.

Since the Terracotta server keeps track of things such as which node is referencing which objects, which thread is holding which lock, and which threads are contending for a specific lock, it can perform some very interesting runtime optimizations. For example, it can work lazily and send changes only to the nodes that are referencing the "dirty" objects and need those changes. This makes good use of locality of reference, and is even more effective if the front-end HTTP load-balancer is configured to use sticky sessions, since some data will never have to leave its node and, therefore, will never have to be replicated elsewhere. Other optimizations include lock optimizations such as "greedy locking," whereby ownership of a particular lock is transferred to a local JVM until that lock is requested by another node in the cluster; awarding locks locally in this way avoids a network round-trip and can give significant performance advantages.
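The thread coordination being extended here is ordinary JLS monitor coordination. As a single-JVM illustration (class and names are our own), here is a one-slot handoff using plain wait/notify; under Terracotta, the same monitor semantics would span JVMs, so the producer and consumer threads could run on different nodes, with the server arbitrating who holds the lock:

```java
// Plain JLS thread coordination: a one-slot handoff using wait/notifyAll.
// Terracotta extends exactly these monitor semantics across the cluster,
// with the server tracking lock ownership and contention between nodes.
public class Handoff {
    private String slot; // null means the slot is empty

    public synchronized void put(String value) throws InterruptedException {
        while (slot != null) wait();  // block until the slot is free
        slot = value;
        notifyAll();                  // wake any waiting consumer
    }

    public synchronized String take() throws InterruptedException {
        while (slot == null) wait();  // block until a value arrives
        String value = slot;
        slot = null;
        notifyAll();                  // wake any waiting producer
        return value;
    }

    public static void main(String[] args) throws Exception {
        final Handoff h = new Handoff();
        // The producer thread stands in for a thread on another node.
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try { h.put("hello"); } catch (InterruptedException e) { }
            }
        });
        producer.start();
        System.out.println(h.take()); // prints "hello"
        producer.join();
    }
}
```

With greedy locking, if only one node were calling put() and take(), that node would hold the lock locally and never pay the network cost at all.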



Copyright © 1996-2014 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use - Advertise with Us