Survival of the Fittest Jini Services, Part II

Use Transactions to Coordinate the Reliable Interaction of Jini Services

by Frank Sommers

July 15, 2001

First published in JavaWorld, July 2001

Summary

Developers should distinguish between the systems themselves and the computations they wish to perform on them. While a distributed system, or parts of it, might fail occasionally, computations performed on those systems can still be highly dependable. This article shows how multiple Jini services can dependably cooperate via transactions.

Failure, according to Merriam-Webster's Dictionary, is a state in which something is "unable to perform a normal function." Inasmuch as a network's normal function is to transmit information between two or more hosts, experience shows that most networks often cannot perform that function as expected. In other words, failure is as much a characteristic of the network as is its normal operation.


In many aspects of life, we have learned to live in the presence of failure. In a large city, many new shops spring up all the time, while others close their doors for good. Unless you are a shop owner, you are not likely to lose sleep over that fact. Instead, you are interested in being able to obtain the goods you are looking to buy, at a reasonable price and in close proximity.

Taking a similar approach to building Jini-based distributed systems might be helpful. We cannot make a large network, such as the Internet, more reliable. But we can make the computations we wish to perform over that network as reliable as possible. Your users -- whether people or other Jini services -- are primarily interested in the computations your service provides. Ensuring the reliability of those computations in the presence of network and component failures will likely lead to your service's longevity.

By reliability I mean a set of guarantees that hold, no matter what. In other words, as long as the computation produces a result, that result should keep with a set of guarantees. If the computation cannot ensure those guarantees, then it should abort and not return a result.

We are all familiar with this notion of reliability: When people wish to accomplish a goal together, they typically agree to a verbal or written contract, which thereafter binds each party to its terms. Thenceforth, the participants perform all their actions related to the common task in accord with those promises. And, should the parties fail to keep their promises, all actions under contract will produce "unreliable" and unpleasant results.

The equivalent of such "rules of the game" between components of software-based distributed systems is the transaction: components participating in a computation agree to a set of rules, and each component thereafter adheres to those rules during the computation.

In distributed systems, such as Jini networks, components typically reside on distant network hosts. This is significant, because it means that no component can, by itself, ascertain whether the other components adhere to the rules. A component can only implement the rules and then communicate to the others that it, indeed, keeps with those rules.

Distributed transactions, therefore, are made up of the rules (semantics) by which the services must abide, and a coordination mechanism between the services that ensures that the rules hold for the whole computation. If even one service indicates that it cannot guarantee its promise, that mechanism will abort the transaction.

The Problem of Four Generals

The story of the four generals, inspired by Leslie Lamport's "Byzantine Generals," illustrates the kind of guarantees distributed transactions must promise, and the way participants in it might communicate. In this example, the generals and their armies are metaphors for distributed services, and carrying out their battle plan is analogous to a distributed computation. This scenario is known as the coordinated attack problem.

The generals, each commanding an army, plot to capture a medieval fortress. Alone, no one army can force the besieged defenders to surrender. Together, however, they are more than a match for the city's defenses. Therefore, to win, the generals agree to fight only given the following battle conditions:

They must all attack at the same time. If any one army calls off the attack, the others must immediately retreat.
None of the armies may violate its own internal rules during the battle.
The attack must be a surprise. All preparations must be kept secret and made in isolation from all but the generals' most trusted confidants.
Victory must be permanent; for instance, the armies must be ready to occupy the city after the battle.

The four conditions for the generals' battle plan are the ACID guarantees:

A stands for atomicity: Either all the armies attack, or none of them do. One or two armies attacking would cause the battle to be lost, and is not permissible.
C means consistency: The armies must maintain their internal rules for order (consistency).
I stands for isolation: All the preparations for the attack must be hidden from those not involved in its planning. If the attack is called off, no one outside the generals' close circle should sense that any activity has taken place.
Finally, D implies durability: The results of the battle must survive the fight itself.

Next, the generals need a way to coordinate their activities. They settle on the following communication protocol:

Prior to the battle, each army makes the necessary preparations. When each is ready, its commanding general lets the others know that he is prepared to move forward.
Once each general is sure that all the others are prepared, he sends another message to the effect that his troops are now committed to battle (in effect, are marching against the fortress).

This protocol consists of two phases: The first indicates preparedness; the second, a fully committed state. It is often called the two-phase commit protocol -- or 2PC protocol, for short. Jini services participating in a distributed computation must use a similar mechanism to coordinate a transaction's completion, or commitment.

The only remaining concern for the generals is how to exchange messages. To indicate preparedness, each general must send a message to the others. Between the 4 generals, 12 messages are exchanged for the protocol's first phase. For N generals, N * (N-1) messages must be sent for each stage. This is bad news: If additional armies were involved in the attack, many more messages would be needed. For 10 generals, this arrangement would require an unmanageable 90 messages. Should any message get lost, the battle could not begin, since the generals could not be sure that the conditions were right for attack.

Instead of sending messengers directly to each other, the generals could decide to set up a central command post. Each general would send a messenger only to this command post to obtain the status of the other armies. With this arrangement, only two messages from each army are passed for each protocol stage: one delivering a general's message, and the other coming from the command post with an order to either proceed with or abort the plan. With this in place, only 8 messages are needed to indicate battle preparedness by our 4 generals. If 10 generals must coordinate their movements, then introducing the central post reduces the required messages from 90 to only 20. This command post is called the transaction manager, or coordinator, for the 2PC protocol.
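The arithmetic behind these figures is easy to check. The following is only a small sketch; the class and method names are my own and are not part of any Jini API. It computes the messages needed per protocol stage, first when every general contacts every other general, then when all traffic goes through the command post.

```java
/** Hypothetical helper: counts messages per protocol stage for n generals. */
final class MessageCount {
    private MessageCount() {}

    /** Every general notifies every other general directly: n * (n - 1). */
    static int direct(int n) {
        return n * (n - 1);
    }

    /** Each general sends one message to the command post and gets one reply: 2 * n. */
    static int viaCommandPost(int n) {
        return 2 * n;
    }

    public static void main(String[] args) {
        for (int n : new int[] {4, 10}) {
            System.out.println(n + " generals: direct=" + direct(n)
                    + ", via command post=" + viaCommandPost(n));
        }
    }
}
```

The direct scheme grows quadratically with the number of participants, while the command post keeps the growth linear, which is why a coordinator pays off as more services join a transaction.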

Figure 1. Communication messages for the two-phase commit protocol

A Jini Bookstore

The Jini Distributed Transactions Specification defines a transaction manager, which is a Jini service, and also describes transaction participants and transaction clients. Together, these entities make up a Jini distributed transaction. In addition, the spec defines default transaction semantics for the ACID properties. The net.jini.core.transaction and net.jini.core.transaction.server packages provide the API for services to interact with the transaction manager, and also offer classes for the default transaction semantics.

By separating transaction semantics from a coordination mechanism, the transaction specification allows for other, user-defined transaction semantics. These semantics might promise guarantees other than, or in addition to, the ACID ones, but transactions using those semantics could still employ the 2PC protocol.

To illustrate the benefit of transactions in service-to-service interaction, we will construct a Jini bookstore service. Like any bookstore, the service lets you search for and order books. Unlike most bookstores, however, its implementation relies on other Jini services for payment processing and order shipping. In this and the next article in this series, we will dissect the BookStore service to see how it provides high reliability even in the presence of intermittent network failures.

The bookstore makes available on the Internet (possibly in public lookup services) a Jini proxy object, which exposes something similar to the following service interface:

    public interface BookStore {

        public Collection findBooks(Book template)
            throws java.rmi.RemoteException;

        public OrderConfirmation buyBook(
                Book book,
                Account creditCard,
                Customer customer,
                Address shipTo,
                int daysToDelivery)
            throws NoSuchBookException, CreditCardException,
                   DeliveryException, BookStoreException,
                   java.rmi.RemoteException;
    }

The findBooks() method consumes a template and returns a collection of Book objects satisfying the template's specified fields (for instance, the author's name). The buyBook() method is more involved. It requires us to specify the desired book, as well as objects representing a credit card, customer information, a shipping address, and the number of days in which we want the book to be delivered. A successful purchase returns a confirmation, which includes the information a customer would need to make a delivery complaint. The buyBook() method declares a number of exceptions to indicate failure in processing the purchase request.

Credit card companies provide Jini services to facilitate account debits and credits. The interface of the CreditCard service might be as follows:

    public interface CreditCard {

        public ChargeConfirmation debit(Account account, Charge charge)
            throws NoSuchAccountException, CannotChargeException,
                   CreditCardException, RemoteException;

        public PaymentConfirmation pay(Account account, Payment payment)
            throws NoSuchAccountException, CreditCardException,
                   RemoteException;

        public CurrentBalance getBalance(Account account)
            throws NoSuchAccountException, CreditCardException,
                   RemoteException;
    }

The methods of this interface let the user charge her account, make payments, and inquire about the current available balance. Each method returns an object representing the result of the action or, if the action did not succeed, throws one of the declared exceptions.

The final piece in the bookstore puzzle is the shipping company. Its Jini service proxies offer the following functionality:

    public interface ShippingCompany {

        //Note: the parameter cannot be named "package", which is a
        //reserved word in Java.
        public PickupGuarantee checkPickup(Address origin, Address destination,
                PackageDesc pkg, int daysToShip)
            throws ShippingException, RemoteException;

        public PickupConfirmation schedulePickup(PickupGuarantee guar)
            throws NoSuchGuaranteeException, ShippingException,
                   RemoteException;
    }

The checkPickup() method takes the origin and destination addresses, a package description (including the package's approximate weight), and the requested number of delivery days. If the shipping company can deliver the package within the specified timeframe, it returns a PickupGuarantee object. This object contains the delivery price and an expiration time that indicates how long the guarantee remains valid. On the other hand, if a shipping company cannot guarantee the requested delivery, the method returns a null value.

Ideally, we'd want the best delivery price. Therefore, we'd inquire with many companies by calling checkPickup() on their service objects. This way, we can trade time for money: the more companies we inquire with, the better price we might obtain -- although it takes longer to make all those method calls. (A shipping company might offer a good price, but set a short expiration time for the PickupGuarantee -- in other words, if you act now, you can ship your package cheaply. This would be the equivalent of a sale on the Jini-enabled, service-oriented Web.)
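To make the selection step concrete, here is a minimal sketch of how the offers might be compared once the checkPickup() calls have returned. It does not use the real PickupGuarantee class: Guarantee and CheapestGuarantee are hypothetical stand-ins with only the two properties described above (a price and an expiration time), and a null entry models a company that could not promise delivery.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for the article's PickupGuarantee. */
final class Guarantee {
    final String company;
    final double price;
    final long expiresAtMillis;

    Guarantee(String company, double price, long expiresAtMillis) {
        this.company = company;
        this.price = price;
        this.expiresAtMillis = expiresAtMillis;
    }
}

final class CheapestGuarantee {
    private CheapestGuarantee() {}

    /**
     * Returns the lowest-priced guarantee that is still valid at time 'now',
     * or null if no company made a usable offer.
     */
    static Guarantee select(List<Guarantee> offers, long now) {
        Guarantee best = null;
        for (Guarantee g : offers) {
            if (g == null || g.expiresAtMillis <= now) {
                continue; // no offer, or the offer has already expired
            }
            if (best == null || g.price < best.price) {
                best = g;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Guarantee> offers = new ArrayList<Guarantee>();
        offers.add(new Guarantee("A", 7.50, 10000L));
        offers.add(null);                              // company B cannot deliver
        offers.add(new Guarantee("C", 4.00, 1000L));   // cheap, but expires early
        offers.add(new Guarantee("D", 5.25, 10000L));
        Guarantee best = select(offers, 5000L);
        System.out.println(best.company + " at " + best.price);
    }
}
```

Checking the expiration time at selection matters because the more companies we query, the longer the inquiry takes, and an attractive short-lived offer may lapse before we can act on it.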

Once we choose a shipping company, we must pass the appropriate PickupGuarantee object to its schedulePickup() method. That company then returns a PickupConfirmation object, representing a receipt for the scheduled package pickup. This method also declares a number of exceptions, should a problem occur when accepting the pickup request.

Figure 2. Interaction of services in support of a Jini bookstore

The BookStore service must provide the ACID guarantees when buying a book:

Buying a book must be an atomic operation. Either the credit card is charged, the delivery is scheduled, and the book is taken out of inventory, or none of those operations should take place. In addition, we must also receive the OrderConfirmation object; otherwise, we won't know whether our purchase succeeded or failed.
Placing an order must leave all the services in a consistent state. This is specific to each service -- for instance, our credit card should not be overcharged, or a delivery should not be scheduled on a route a shipping company doesn't serve.
Each step in the purchase must be performed in isolation from other operations. Its results must be hidden from other operations until the transaction fully completes.
Consider what would happen if the CreditCard service didn't offer the isolation property. Imagine that your credit card account has an available credit of $200, and the book you want costs $50. While you're placing the order, your spouse charges $180 to your account for a purchase at ABC Department Store. When the book order transaction begins, a charge for the $50 is made on the account, causing your available credit to shrink to $150. Right after this charge registers, ABC's request for the $180 charge is denied because of insufficient credit. However, during the book order transaction, it turns out that no shipping company delivers books to your desired destination. The purchase transaction is aborted, causing the credit company to cancel the $50 charge, which now restores the available balance to the original $200. When you inquire, you are told that your account has $200 available credit, and no one knows why the charge for the $180 was denied. (Someone could consult log files; however, that certainly wouldn't reveal why the $50 charge was reverted.)

With transaction isolation, the credit card's balance would be inaccessible (locked) during the book order transaction -- the other company's charge request would have to wait. Locking the account balance trades transaction throughput for accountability: one service waits in line in order for our system to be more predictable.
The results of the purchase must be durable. The confirmation receipts and all other state changes in services must survive the transaction itself.
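The isolation scenario above can be played through in a few lines of code. This is only an illustrative, single-threaded sketch: IsolatedAccount is a hypothetical class of my own, not part of the CreditCard service, and a real service would need proper concurrency control and persistent storage. It shows the one idea that matters here: while a transaction holds the balance, outside charges wait, and they are decided only after the transaction commits or aborts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical account that locks its balance while a transaction is active. */
final class IsolatedAccount {
    private int available;          // committed available credit, in dollars
    private int pendingCharge;      // tentative charge made under the transaction
    private boolean locked;         // true while a transaction holds the balance
    private final List<Integer> waiting = new ArrayList<Integer>(); // queued charges
    private final List<String> log = new ArrayList<String>();

    IsolatedAccount(int available) { this.available = available; }

    /** Starts a transactional charge; the balance stays locked until commit or abort. */
    boolean beginCharge(int amount) {
        if (locked || amount > available) return false;
        locked = true;
        pendingCharge = amount;
        return true;
    }

    /** A charge from outside the transaction waits while the balance is locked. */
    void charge(int amount) {
        if (locked) { waiting.add(amount); return; }
        apply(amount);
    }

    /** Roll forward: the tentative charge becomes permanent. */
    void commit() { available -= pendingCharge; release(); }

    /** Roll back: the tentative charge is simply discarded. */
    void abort() { release(); }

    private void release() {
        pendingCharge = 0;
        locked = false;
        List<Integer> queued = new ArrayList<Integer>(waiting);
        waiting.clear();
        for (int amount : queued) apply(amount);
    }

    private void apply(int amount) {
        if (amount <= available) { available -= amount; log.add("approved " + amount); }
        else log.add("denied " + amount);
    }

    int available() { return available; }
    List<String> log() { return log; }
}
```

With a $200 limit, calling beginCharge(50), then charge(180), then abort() leaves the department store's $180 charge approved, because it is evaluated against the restored $200 balance. Had the book order committed instead, the same charge would be denied against $150, and the reason would be plain.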

To guarantee these properties for the book purchase, each service must perform its operations under a transaction. In Jini terminology, the services must become transaction participants.

Towards Transactional Services

The first step for a service to become a transaction participant is to define transactional methods in its public service interface. Since Jini has to accommodate both transactional and nontransactional services, there is no equivalent of the transactional remote procedure call (TRPC) mechanism popular in traditional transaction processing systems. In those systems, the runtime infrastructure annotates each method call with a unique transaction identifier (TRID). By having an identical TRID, a set of operations is easily identified as belonging to (performed under) the same transaction.

Jini transactions group operations together via an object representing a transaction instance. For the default semantics, the net.jini.core.transaction.Transaction interface defines this object, which is implemented by net.jini.core.transaction.server.ServerTransaction. You need to pass in a Transaction instance along with the other parameters to make a method call transactional:

    public interface BookStore {

        public Collection findBooks(Book template)
            throws RemoteException;

        public OrderConfirmation buyBook(Book book, Account creditCard,
                Customer customer, Address shipTo, int daysToDelivery,
                Transaction txn)
            throws NoSuchBookException, CreditCardException,
                   DeliveryException, BookStoreException,
                   RemoteException, TransactionException;
    }

For the credit card service:

    public interface CreditCard {

        public ChargeConfirmation debit(Account account, Charge charge,
                Transaction txn)
            throws NoSuchAccountException, CannotChargeException,
                   CreditCardException, RemoteException,
                   TransactionException;

        public PaymentConfirmation pay(Account account, Payment payment,
                Transaction txn)
            throws NoSuchAccountException, CreditCardException,
                   RemoteException, TransactionException;

        public CurrentBalance getBalance(Account account)
            throws NoSuchAccountException, CreditCardException,
                   RemoteException;
    }

And for the shipping service:

    public interface ShippingCompany {

        //Note: the parameter cannot be named "package", which is a
        //reserved word in Java.
        public PickupGuarantee checkPickup(Address origin, Address destination,
                PackageDesc pkg, int daysToShip)
            throws ShippingException, RemoteException;

        public PickupConfirmation schedulePickup(PickupGuarantee guar,
                Transaction txn)
            throws NoSuchGuaranteeException, ShippingException,
                   RemoteException, TransactionException;
    }

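You will notice that each transactional method now declares TransactionException to indicate possible failure in the transaction's processing. The methods that take no Transaction parameter (findBooks(), getBalance(), and checkPickup()) are unchanged. We still need the other application-specific exceptions for failures that occur independently of the transaction.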

In addition to extending the method signatures, each service's implementation must also declare itself to be a transaction participant by implementing the net.jini.core.transaction.server.TransactionParticipant interface. By implementing this interface, the object guarantees that it can join the transaction and participate in the 2PC protocol.

Note that the service's implementation becomes a transaction participant; the service's proxy does not. The proxy runs inside the address space of whatever client retrieves it from lookup services. Therefore, the proxy's state becomes intrinsically a part of that client's state -- all computations performed on the proxy itself are local to the client.

A TransactionParticipant service must join a transaction when it receives a transactional method call. The ServerTransaction class provides a join() method that consumes a TransactionParticipant and a crash count. Your service can join the same transaction with the same crash count any number of times. All the method calls that join the transaction perform their actions under that transaction.

During a transaction, your service might crash for some reason. If such a crash causes your service to lose the changes made during the transaction, you must increase the crash count number when the service reactivates and reinitializes. Since the default semantics must guarantee ACID properties, joining a transaction with a crash count number different from the service's original crash count results in a CrashCountException. At that point, you can decide what to do: you might choose to abort the whole transaction.
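The crash count rule is simple enough to model in isolation. The sketch below is not the Jini implementation; JoinRegistry and CrashCountMismatch are hypothetical names that stand in for the bookkeeping a transaction manager might do and for the CrashCountException it raises. It assumes only what the text states: repeated joins with the same count are harmless, and a join with a different count is rejected.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the exception raised on a crash count mismatch. */
class CrashCountMismatch extends Exception {
    CrashCountMismatch(String msg) { super(msg); }
}

/** Hypothetical, in-memory model of how a manager might track joins. */
final class JoinRegistry {
    // participant name -> crash count supplied on its first join
    private final Map<String, Long> joined = new HashMap<String, Long>();

    void join(String participant, long crashCount) throws CrashCountMismatch {
        Long previous = joined.get(participant);
        if (previous == null) {
            joined.put(participant, crashCount);   // first join under this transaction
        } else if (previous.longValue() != crashCount) {
            // The participant restarted and lost state since it first joined.
            throw new CrashCountMismatch(participant + " joined with " + previous
                    + ", now reports " + crashCount);
        }
        // Same crash count: a repeated join changes nothing.
    }
}
```

A service that persists its crash count and increments it on every lossy restart gives the manager a cheap way to detect that work done earlier under the transaction can no longer be trusted.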

Transaction Lifecycles

Once all the bookstore services are transactional, you can at last order your favorite book. The transaction client is the Jini service that initiates the transaction. The client might or might not also be a participant in the transaction. Since printing or displaying the OrderConfirmation is part of the book purchase transaction, we will make the BookStore service both a client and a participant.

The client follows these steps to initiate a new transaction:

It discovers the transaction manager service. Since it's just a regular Jini service, you can follow the normal Jini service discovery mechanism.
Since different objects could represent various transaction semantics, you create a Transaction via a factory class. Like many Jini entities, a transaction is a leased resource. Calling TransactionFactory's create() method produces a Transaction.Created object, which bundles a new transaction (a ServerTransaction under the default semantics) with its lease.
If the client is also a transaction participant, it can at this point join the transaction.
It then passes the transaction object as a parameter in method calls to other services.

A new transaction starts out in the active stage. In this stage, the services perform their work under the transaction. For instance, the credit card service charges your account, the bookstore service locates and queries the shipping services, and a package delivery is scheduled. Finally, the bookstore service must produce a purchase confirmation. During these activities, all three services must be ready to roll back any changes they make, since the whole transaction's success is not yet guaranteed.
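What "ready to roll back" means for a participant can be shown with a small sketch. TentativeInventory is a hypothetical class, not the bookstore's actual inventory code, and it keeps everything in memory for brevity. The point is that work done in the active stage is recorded per transaction and touches the permanent stock count only on commit.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical inventory that keeps changes tentative until told to commit. */
final class TentativeInventory {
    private int inStock;
    // transaction id -> number of copies reserved under that transaction
    private final Map<Long, Integer> reserved = new HashMap<Long, Integer>();

    TentativeInventory(int inStock) { this.inStock = inStock; }

    /** Reserves one copy under the transaction; returns false if none is left. */
    boolean reserve(long txnId) {
        if (inStock - totalReserved() <= 0) return false;
        Integer current = reserved.get(txnId);
        reserved.put(txnId, current == null ? 1 : current + 1);
        return true;
    }

    /** Roll forward: the reservation becomes a permanent decrement. */
    void commit(long txnId) {
        Integer n = reserved.remove(txnId);
        if (n != null) inStock -= n;
    }

    /** Roll back: the reservation is dropped and the stock count is unchanged. */
    void abort(long txnId) { reserved.remove(txnId); }

    int inStock() { return inStock; }

    private int totalReserved() {
        int total = 0;
        for (int n : reserved.values()) total += n;
        return total;
    }
}
```

Because a reserved copy counts against availability, a second transaction cannot sell the same book while the first is still undecided, yet an abort costs nothing more than forgetting the reservation.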

At some point, the client (or any other participant, for that matter) indicates that the transaction must complete. With our bookstore, this might occur right after we've displayed or printed the order confirmation, or after we've waited a set amount of time for the services to finish their work. Then the 2PC protocol drives the transaction to completion.

The transaction manager coordinates the transaction's commitment. The client (or any other participant) calls either the commit() or the abort() method on the Transaction object. A call to commit() in turn causes the manager to call the prepare() method on each participant.

At this point, the transaction enters the voting stage. Each participant must vote: Is it prepared to roll forward the transaction's changes, does it need to abort the transaction, or does it not care either way (because the transaction caused no changes in its state)? The participants' possible votes are PREPARED, ABORTED, or NOTCHANGED. Most significant, if any participant cannot ensure its transactional guarantees, it must indicate that fact. For example, the credit card service might not be able to save the new credit card balance.

When a participant votes PREPARED, it says, in effect, "I am now committed to the changes made under the transaction." This implies that, given the order to roll forward, the participant guarantees to commit the changes -- it cannot fail. Among other things, this means that the changes have already been saved in a persistent store (to guarantee the transaction's durability property).

When, and only when, all participants vote either PREPARED or NOTCHANGED, the coordinator calls the commit() method on each participant. When all participants commit their changes, the transaction is in the COMMITTED state and can thereafter be forgotten. (Transactions typically don't persist after they've completed, although the spheres of control notion I mentioned in the first part of this series assumes that they do, which opens up many interesting possibilities.)

The commit() call instructs a participant to finish the transaction, which means that the participant no longer needs to enforce the transaction guarantees. The results of the changes made during the transaction now become visible to objects outside the transaction, locks held by the transaction are released, and so forth. The commit() method doesn't have a return value, since a PREPARED vote previously implied a guaranteed successful commit.

If any participant votes ABORTED, the transaction manager calls the abort() method on all participants, instructing them to roll back all changes made during the transaction and release any resources they've reserved.

In this sense, the transaction provides a set of computation guarantees: if any participant decides that it cannot, for some reason, abide by the transaction's semantics, the entire computation will be cancelled rather than produce an unreliable result and unpredictable side effects.
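The voting and completion logic described above fits in a short simulation. This is a sketch under stated assumptions, not the Jini transaction manager: Vote, Participant, MiniCoordinator, and ScriptedParticipant are hypothetical names, the calls are local rather than remote, and failures of the coordinator itself are ignored. It follows the rules as given in the text: commit only if no participant votes to abort, otherwise tell everyone to abort.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical votes, mirroring the three outcomes described above. */
enum Vote { PREPARED, NOTCHANGED, ABORTED }

/** Hypothetical, simplified participant; not the Jini TransactionParticipant. */
interface Participant {
    Vote prepare();
    void commit();
    void abort();
}

final class MiniCoordinator {
    private MiniCoordinator() {}

    /** Runs both phases; returns true if the transaction committed. */
    static boolean complete(List<Participant> participants) {
        // Phase 1: ask each participant to vote; one abort vote decides the outcome.
        for (Participant p : participants) {
            if (p.prepare() == Vote.ABORTED) {
                // Phase 2 (abort): everyone rolls back.
                for (Participant q : participants) q.abort();
                return false;
            }
        }
        // Phase 2 (commit): every vote was PREPARED or NOTCHANGED.
        for (Participant p : participants) p.commit();
        return true;
    }
}

/** Participant with a fixed vote that records the calls it receives. */
final class ScriptedParticipant implements Participant {
    private final Vote vote;
    final List<String> calls = new ArrayList<String>();

    ScriptedParticipant(Vote vote) { this.vote = vote; }

    public Vote prepare() { calls.add("prepare"); return vote; }
    public void commit() { calls.add("commit"); }
    public void abort() { calls.add("abort"); }
}
```

Passing the coordinator one participant that votes PREPARED and one that votes ABORTED leaves the first with the call history prepare, abort: it was ready to roll forward, but the other participant's vote cancelled the whole computation.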

Figure 3 shows the transaction's different states during the 2PC protocol from the transaction client's point of view:

Figure 3. The client's view of a transaction

Figure 4 illustrates the interaction between a participant and a transaction:

Figure 4. The participant's view of a transaction

Finally, Figure 5 illustrates how a manager drives the transaction to completion:

Figure 5. The manager's view of a transaction

Putting It All Together

The following code illustrates the purchase of a book under a transaction:

    public class BookStoreImpl implements BookStore, TransactionParticipant {

        ....

        /**
         * Calling this method initiates the purchase transaction.
         * Since the purchase itself might be part of a larger transaction,
         * we allow a Transaction object to be passed in the
         * method call. An example would be a transactional email service
         * that guarantees the delivery of the confirmation: If the delivery
         * fails, the transaction is aborted. In that case, the email service
         * will be the transaction client. If null is passed
         * in as the Transaction object, a new transaction will be created,
         * and the BookStoreImpl becomes the transaction's client.
         * (ShippingException must be a subclass of an exception that the
         * BookStore interface declares, such as DeliveryException.)
         */
        public OrderConfirmation buyBook(Book book, Account creditCard,
                Customer customer, Address shipTo, int daysToDelivery,
                Transaction txn)
            throws NoSuchBookException, CreditCardException,
                   ShippingException, BookStoreException,
                   RemoteException, TransactionException {

            boolean client = false;
            TransactionManager txMan = null;
            ServerTransaction sTxn = null;
            Lease txnLease = null;

            //If the transaction is null, we'll be the client, and we also
            //need to create a new Transaction.
            if (txn == null) {
                client = true;
                try {
                    txMan = discManager();
                    //This object is a bundle of the ServerTransaction + its lease.
                    //We specify a lease of 3 minutes for the transaction.
                    Transaction.Created ct =
                        TransactionFactory.create(txMan, 180 * 1000);
                    sTxn = (ServerTransaction)ct.transaction;
                    txnLease = ct.lease;
                } catch (RemoteException e) {
                    throw e;
                } catch (Exception e) {
                    //discManager() and create() declare exceptions that
                    //buyBook() does not; report them as a bookstore failure.
                    throw new BookStoreException(e.getMessage());
                }
                //Manage the transaction's lease. The implementation is not shown.
                manageTrLease(txnLease);
            } else {
                //We can only handle ServerTransaction.
                if (txn instanceof ServerTransaction)
                    sTxn = (ServerTransaction)txn;
                else
                    throw new TransactionException("Unknown transaction semantics.");
            }

            //Everything from here will be performed under the transaction.
            //////////////////////////////////////////////////////////////
            try {
                //Since we need to ensure that we save or print the confirmation,
                //we also will have to join the transaction.
                sTxn.join(this, 0);

                //Call each method of the other services, passing the transaction
                //object as parameter. Each service must join the transaction as well.

                //Decrement the inventory count for this book.
                //If no more books are available, a NoSuchBookException will be
                //thrown. We catch the exception and, if we are the transaction
                //client, cause the transaction to abort before returning the
                //exception to the caller.
                int remaining = bookDatabase.decrementInventory(book, sTxn);

                //Obtain the best delivery price for the book. The implementation
                //is not shown, but is available in the full example.
                //This might throw a ShippingException, which we catch
                //and handle similarly to the NoSuchBookException.
                PackageDesc packDesc = PackageDesc.createDescription(book);
                PickupConfirmation pickupConf =
                    ShippingSelector.schedulePickup(wareHouseAddress, shipTo,
                        packDesc, daysToDelivery, sTxn);

                //Finally, we attempt to charge the credit card. TotalPrice
                //includes the book's price, local tax, and shipping charges.
                //TotalPrice is an implementation of Price.
                //The system determines tax, based on location.
                //If the charge attempt fails, the exception will be handled
                //similar to the other service-specific exceptions.
                //(card is the discovered CreditCard proxy; discovery not shown.)
                TotalPrice price = CashRegister.computePrice(book, pickupConf);
                Charge chg = new Charge(price);
                ChargeConfirmation chgConf = card.debit(creditCard, chg, sTxn);

                //Now that we have succeeded in all the operations with other
                //services, produce and save the OrderConfirmation. This must
                //succeed before the transaction can be committed.
                //saveConfirmation may throw a CannotSaveException.
                OrderConfirmation orderConf =
                    new OrderConfirmation(pickupConf, chgConf, book);
                saveConfirmation(orderConf);

                //If we are the client, commit the transaction.
                ///////////////////////////////////////////
                //Transaction ends here, if we're the client.
                ///////////////////////////////////////////
                if (client)
                    sTxn.commit();

                //Return orderConf.
                return orderConf;

            } catch (Exception e) {
                ///////////////////////////////////////////
                //Transaction ends here if we abort.
                ///////////////////////////////////////////
                if (client)
                    abortTransaction(sTxn);

                //Rethrow the exceptions declared by buyBook() unchanged;
                //wrap everything else in a BookStoreException.
                if (e instanceof NoSuchBookException)
                    throw (NoSuchBookException)e;
                if (e instanceof ShippingException)
                    throw (ShippingException)e;
                if (e instanceof CreditCardException)
                    throw (CreditCardException)e;
                if (e instanceof TransactionException)
                    throw (TransactionException)e;
                throw new BookStoreException(e.getMessage());
            }
        }

        /**
         * Discover TransactionManager. This method should really
         * declare more specific exceptions.
         */
        private TransactionManager discManager() throws Exception {
            ServiceDiscoveryManager serviceDiscoveryManager;
            ...
            Class[] trTypes = {TransactionManager.class};
            ServiceTemplate tmpl = new ServiceTemplate(null, trTypes, null);
            ServiceItem item = serviceDiscoveryManager.lookup(tmpl, null);
            //lookup() returns null when no matching service is found.
            if (item == null)
                throw new Exception("No TransactionManager found.");
            TransactionManager tm = (TransactionManager)item.service;
            return tm;
        }

        /**
         * We are not using a transaction for this method.
         */
        public Collection findBooks(Book template) throws RemoteException {
            //Find books matching non-null fields of the specified template.
            ...
        }

        ...
    }

The following list explains the code in more detail:

The bookstore service's buyBook() method is invoked. It consumes the selected Book, an object representing the customer's credit card account (which might be obtained from a smart card or some other portable storage device), some information about the customer (the Customer object; again, this could come from a smart card's onboard storage), the shipping address, and an integer denoting the desired number of delivery days (these last two are probably input via a GUI, such as a service UI). Most important, it also takes a Transaction instance as a parameter.
If null is passed in as the transaction parameter, a new transaction is created. The bookstore service discovers a TransactionManager service, then obtains a new ServerTransaction object from the TransactionFactory, passing the TransactionManager and a lease time of three minutes as arguments. Essentially, it becomes the transaction's client. If an existing transaction was passed in, the book purchase becomes part of that transaction. In that case, the bookstore service does not act as the transaction's client.
Since the bookstore service must ensure that it prints or saves a purchase confirmation, it joins the transaction as a participant (note that it implements TransactionParticipant).
Next, the bookstore implementation removes the desired book from inventory, and discovers a credit card service, as well as several shipping services. For the latter, it tries to find all the shipping services that can deliver the package to the specified address within the desired timeframe. It then selects the service that delivers for the least amount of money. This selection is delegated to a helper object inside the bookstore service implementation.
The bookstore service now performs method calls on the selected CreditCard and ShippingCompany proxies, passing in the Transaction object as a parameter. These services are then obligated to join the transaction (calling ServerTransaction's join() method).
If all goes well, the bookstore service receives confirmations from both the credit card charge and the scheduled package pickup. It then creates the OrderConfirmation object. Finally, it saves the OrderConfirmation persistently, and possibly even displays it to the user.
If the bookstore is also the transaction client, it calls commit() on the transaction. At that point, the 2PC protocol starts: The transaction manager calls prepare() on all the participants, expecting their vote of either PREPARED, NOTCHANGED, or ABORTED. If all voted either PREPARED or NOTCHANGED, the transaction manager calls commit() on all the participants. At that point, the transaction is officially completed, and all the services can release the resources and locks held during the transaction. If any participant votes ABORTED, the transaction manager instead invokes each service's abort() method, instructing it to undo all changes made under the transaction. If the bookstore is not the transaction client, however, it should not attempt to finish the transaction; that is the client's job.
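The voting round in the last step can be modeled in a few lines of plain Java. This is only a sketch of the protocol's logic, not the Jini API: Vote, SketchParticipant, SketchManager, and RecordingParticipant are hypothetical names invented for illustration, the abort vote is called ABORTED here, and the model ignores networking, leases, and crash recovery.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plain-Java model of a two-phase commit round.
// These types only mimic the roles in the text; they are not Jini classes.
enum Vote { PREPARED, NOTCHANGED, ABORTED }

interface SketchParticipant {
    Vote prepare();  // phase 1: cast a vote
    void commit();   // phase 2: make the changes permanent
    void abort();    // phase 2: undo the changes
}

class SketchManager {
    private final List<SketchParticipant> participants = new ArrayList<>();

    void join(SketchParticipant p) {
        participants.add(p);
    }

    /** Runs both phases and returns true if the transaction committed. */
    boolean commit() {
        boolean allAgree = true;
        // Phase 1: collect votes; a single abort vote decides the outcome.
        for (SketchParticipant p : participants) {
            if (p.prepare() == Vote.ABORTED) {
                allAgree = false;
                break;
            }
        }
        // Phase 2: tell every participant the outcome.
        for (SketchParticipant p : participants) {
            if (allAgree) {
                p.commit();
            } else {
                p.abort();
            }
        }
        return allAgree;
    }
}

// A participant that votes as configured and records what it was told.
class RecordingParticipant implements SketchParticipant {
    private final Vote vote;
    String outcome = "pending";

    RecordingParticipant(Vote vote) {
        this.vote = vote;
    }

    public Vote prepare() { return vote; }
    public void commit() { outcome = "committed"; }
    public void abort() { outcome = "aborted"; }
}
```

In this model, a bookstore acting as the transaction client would call commit() on the manager once the credit card and shipping participants have joined. A bookstore that received the transaction from someone else would only join and wait for the outcome.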

You might have noticed an interesting twist here: In some situations, you want to ensure that the customer can actually print or display confirmation. For instance, if the printer or display fails, you'd rather the transaction be aborted. This might also be the case for an airline ticket sale or the filing of a tax return. The challenge with these real-world activities is that it is very difficult to undo them. If, after the confirmation has printed, the credit card service decides to abort the transaction, then the printed confirmation becomes invalid. But it's already in the customer's hands!

The only solution here is to ensure, as much as possible, the success of online activities first, and only then perform the offline actions associated with the transaction. That is why we only saved the confirmation during the transaction, and left it up to the customer to print it at his convenience. When you need physical proof to be part of the transaction, you probably need to print a cancellation note as well when you abort it. Of course, printing that note can fail, too.

This is one area where Jini-enabled devices will simplify life: printers, cell phones, email systems, and storage devices can all become transaction participants along with business-specific services. If you need that ticket to print out, that confirmation number to display on your cell phone, or that email message to be delivered, the transaction will not complete until these physical actions succeed. (Of course, this can also backfire: if you ask your coffee machine and toaster to transactionally prepare a breakfast, when your toaster burns the bread, the coffee machine might feel obligated to undo your coffee. That's an example of a situation in which you shouldn't use transactions!)

In the final part of this article, we'll look at some of the failure conditions that plague real-life networks, and what transactions can do about them.

Undecided Voters, Deadlocks, and Other Partial Failure Evils

Undecided voters are a problem not only for presidential candidates, but also for the 2PC protocol. When the transaction manager calls prepare() on a participant, it expects to receive a PREPARED, ABORTED, or NOTCHANGED vote. However, distributed transaction messages must travel through the network, which is inherently unreliable. Thus, the transaction manager might never receive a vote from one or more participants. In addition, one or more of the services might crash during the transaction. For these reasons the 2PC protocol cannot completely guarantee a transaction's commitment (it's sometimes called a weak commitment protocol).

Jini solves the problem of weak commitment with leases. Because a Transaction is a leased resource in Jini, its lease sooner or later expires. When that happens, the transaction manager causes the transaction to abort, calling abort() on all participants it can still contact.
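To see how a lease bounds the lifetime of an undecided transaction, consider the following plain-Java sketch. It is a model under simplifying assumptions, not the Jini leasing API: LeasedTxn, sweep(), and the explicit clock argument are all hypothetical, and a real manager would also notify the participants it can still reach.

```java
// Hypothetical model of a leased transaction; not the Jini lease API.
// Time is passed in explicitly so the behavior is easy to follow.
class LeasedTxn {
    enum State { ACTIVE, COMMITTED, ABORTED }

    private final long expiresAtMillis;
    private State state = State.ACTIVE;

    LeasedTxn(long nowMillis, long leaseMillis) {
        this.expiresAtMillis = nowMillis + leaseMillis;
    }

    /** Called periodically by the (hypothetical) manager to expire leases. */
    State sweep(long nowMillis) {
        if (state == State.ACTIVE && nowMillis >= expiresAtMillis) {
            // A real manager would call abort() on every reachable participant here.
            state = State.ABORTED;
        }
        return state;
    }

    /** A commit succeeds only while the lease is still valid. */
    boolean commit(long nowMillis) {
        if (sweep(nowMillis) != State.ACTIVE) {
            return false;
        }
        state = State.COMMITTED;
        return true;
    }

    State state() {
        return state;
    }
}
```

With a one-minute lease, as in the bookstore example, a transaction whose votes never arrive is aborted by the first sweep after 60 seconds, so no participant is left holding locks indefinitely.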

Orphaned transactions are those that are guaranteed to abort. When a participant has already returned a vote, and is waiting for the manager's call to commit() or abort(), it can inquire about the transaction's current state by calling getState() on the transaction. If the transaction replies PREPARED or COMMITTED, the participant can then commit the work done during the transaction. On the other hand, if the manager returns ABORTED, the participant must then exit the transaction by calling abort(). If it cannot contact the transaction manager for a while, then it might decide that the manager crashed, and abort the transaction as well.

When several participants in a transaction compete for the same resources, deadlocks might occur. Recall that during the transaction, all participants must ensure the transaction guarantees (ACID). For example, the transaction isolation requirement mandates that the credit card service should place a lock on the credit card account for the transaction's duration. During this time no other services can access the account. If several services inside that transaction need access to the credit card account, then they need to somehow coordinate their activities so they are not all waiting for each other.
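A deadlock of this kind is simply a cycle of services waiting on one another. The sketch below illustrates that idea with a wait-for graph, a standard technique from the database literature rather than anything Jini provides. WaitForGraph and its methods are hypothetical names, and the sketch assumes some observer could see all the waits in one place, which a decentralized network of services generally cannot.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical wait-for graph: an edge A -> B means
// "service A is waiting for a lock that service B holds".
class WaitForGraph {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    void addWait(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    /** Returns true if some set of services is waiting in a cycle. */
    boolean hasDeadlock() {
        Set<String> done = new HashSet<>();
        for (String start : waitsFor.keySet()) {
            if (onCycle(start, new HashSet<>(), done)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; 'path' holds the services on the current chain of waits,
    // 'done' holds services already shown not to lead into a cycle.
    private boolean onCycle(String node, Set<String> path, Set<String> done) {
        if (path.contains(node)) {
            return true;   // we came back to a service already on the chain
        }
        if (done.contains(node)) {
            return false;
        }
        path.add(node);
        for (String next : waitsFor.getOrDefault(node, new HashSet<>())) {
            if (onCycle(next, path, done)) {
                return true;
            }
        }
        path.remove(node);
        done.add(node);
        return false;
    }
}
```

If the bookstore waits for the credit card service, which waits for a shipping service, nothing is wrong yet. The moment the shipping service in turn waits for the bookstore, the graph contains a cycle and none of the three can proceed.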

Thus the problems of concurrency control are magnified by the service-oriented Web. The more services that interact on the Web this way, the more chances there are for serious deadlocks. Without lease expirations causing deadlocked Jini transactions to eventually abort, deadlocks could bring the whole service-oriented Web to a halt.

Figure 6. Deadlocked services

In the absence of a central concurrency-control mechanism, one way for transactions to avoid deadlocks is to relax the isolation level, allowing some changes to become visible outside the transaction while the transaction is still in progress. For instance, by being able to read the account balance, other services can possibly determine whether a charge on the account will succeed when they eventually receive a "write" lock on it. Many real-life transaction-processing systems operate with less than full isolation levels to achieve increased transaction throughput.
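The following plain-Java sketch shows what such a relaxed read might look like for the credit card account. It is an illustration only: SketchAccount and its methods are hypothetical, amounts are in cents, and the model leaves out the write lock itself, showing just that an uncommitted charge is visible to readers outside the transaction.

```java
// Hypothetical account that exposes uncommitted state to readers.
// Not part of Jini; a sketch of a relaxed isolation level.
class SketchAccount {
    private long committedCents;
    private long pendingChargeCents; // charge made under a still-open transaction

    SketchAccount(long openingBalanceCents) {
        this.committedCents = openingBalanceCents;
    }

    // Write path: in a full system only the write-lock holder would call these.
    synchronized void chargeUnderTransaction(long cents) {
        pendingChargeCents += cents;
    }

    synchronized void commit() {
        committedCents -= pendingChargeCents;
        pendingChargeCents = 0;
    }

    synchronized void abort() {
        pendingChargeCents = 0;
    }

    synchronized long committedBalanceCents() {
        return committedCents;
    }

    // Relaxed read: reflects the pending charge even though it may still be undone.
    synchronized long provisionalBalanceCents() {
        return committedCents - pendingChargeCents;
    }

    /** A hint only; the answer can change before the caller obtains the write lock. */
    synchronized boolean chargeLikelyToSucceed(long cents) {
        return provisionalBalanceCents() >= cents;
    }
}
```

The price of the relaxed read is that the answer is only a hint. If the open transaction aborts, the provisional balance goes back up, so a service that gave up based on the earlier reading did so unnecessarily.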

The data management community has developed an entire repertoire of techniques and tricks to deal with this and related issues. Transactional services teach us that, increasingly, data management problems are becoming problems of distributed computing, and, likewise, distributed computing problems are becoming those of data management. This realization invites us to pursue a more interdisciplinary approach so as to bring about better-informed solutions to these exciting challenges ahead of us.

Words of Caution

Let me conclude this article with two notes of caution. First, while transactions are a useful tool to make a computation reliable, there is no magic to their effectiveness. Each service must ensure that it abides by its part of the guarantees the transaction is supposed to provide. How a service might do that is the subject of my next installment in this series.

Second, distributed transactions are expensive in terms of computational resources. They require a transaction manager and many messages to carry out the two-phase commit protocol. In addition, implementing a transaction participant that conforms to the default semantics is a significant undertaking, as you will see next month. However, when you do need guaranteed reliability for a distributed computation, there is no alternative to transactions.

Resources

Read Frank Sommers's complete "Survival of the Fittest Jini Services" series:
The Jini Transaction Specification:
http://www.sun.com/jini/specs/jini1.1html/txn-spec.html
Transaction Processing: Concepts and Techniques, by Jim Gray and Andreas Reuter (Morgan Kaufmann, 1993; ISBN: 1558601902) is the most comprehensive book on transactions. With its thousand-plus pages chock-full of brilliant insights, it is not for the faint of heart, but is a very rewarding read. Gray and Reuter themselves invented many of the concepts presented in the book:
http://www.mkp.com/books_catalog/catalog.asp?ISBN=1-55860-190-2
Philip Bernstein's Transaction Processing for the Systems Professional (Morgan Kaufmann, 1996; ISBN: 1558604154) is an excellent introduction to transaction processing. It also gives a grand tour of all the significant transaction processing systems in use today:
http://www.mkp.com/books_catalog/catalog.asp?ISBN=1-55860-415-4
You can also download the full text of Philip Bernstein, Vassos Hadzilacos, and Nathan Goodman's Concurrency Control and Recovery in Database Systems from:
http://research.microsoft.com/pubs/ccontrol/
Gerhard Weikum and Gottfried Vossen's Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery (Morgan Kaufmann, 2001; ISBN: 1558605088) focuses on recent advances in transaction models and their formal analysis. It also discusses crash recovery techniques for object systems, which is very relevant to high-availability Jini services:
http://www.mkp.com/books_catalog/catalog.asp?ISBN=1-55860-508-8
Advances in the field of distributed and multidatabase management systems may prove helpful in gaining a better understanding of the interaction between multiple Jini services. Tamer Ozsu and Patrick Valduriez's Principles of Distributed Database Systems (Prentice Hall, 1999; ISBN: 0136597076) is excellent, and has a chapter devoted to distributed transaction processing:
http://vig.prenhall.com/catalog/academic/product/1,4096,0136597076.html,00.html
To better understand the issues of controlling concurrency in distributed systems, Nancy Lynch's Distributed Algorithms (Morgan Kaufmann, 1996; ISBN: 1558603484) is an excellent reference. It has a section on consensus algorithms (where many parts of a distributed system have to reach some sort of an agreement, and do this quickly, without eating up a lot of bandwidth), an entire chapter on atomic objects, and a discussion on partial failure (and what to do about it):
http://theory.lcs.mit.edu/tds/distalgs.html
My example of the generals' problem was inspired by Leslie Lamport's "Byzantine Generals" problem, which is described in his fascinating paper from 1982. Lamport's more difficult problem is, How can the generals reach an agreement if there is a traitor among them? The following Websites have information about the paper:
- http://www.research.compaq.com/SRC/personal/lamport/pubs/pubs.html#trans
- http://www.cs.cornell.edu/cs614-sp98/notes/byzantine.html
If you thought that Jini transactions were a complex beast, imagine when pieces of these transactions start to physically move about! Mobility brings a new set of challenges to distributed transactions. Evaggelia Pitoura and George Samaras's Data Management for Mobile Computing (Kluwer Academic, 1997; ISBN: 0792380533) is a good introduction:
http://www.wkap.nl/book.htm/0-7923-8053-3
Pitoura's Website has a wealth of information about the impact of mobility on distributed systems:
http://zeus.cs.uoi.gr/~pitoura/pub.html
If you are pressed for time, I wrote up a short outline for a tutorial on mobile transactions:
http://www.autospaces.com/people/fsommers/mobile_transactions.pdf
Panos Chrysanthis's homepage:
http://www.cs.pitt.edu/~panos/publications/all.html
Margaret Dunham's homepage (articles from the late 1990s; especially relevant is "A Mobile Transaction Model that Captures Both the Data and Movement Behavior"):
http://www.seas.smu.edu/~mhd/pubs.html
Tomasz Imielinski's mobile computing project page:
http://www.cs.rutgers.edu/dataman/
A complete listing of Jiniology articles:
http://www.javaworld.com/javaworld/topicalindex/jw-ti-jiniology.html
Bill Venners's Jini FAQ:
http://www.artima.com
The Jini community Website has pointers to all things related to Jini:
http://www.jini.org
The Sun Jini page contains success stories, whitepapers, and so forth on deploying Jini services in business environments:
http://www.sun.com/jini

"Survival of the Fittest Jini Services, Part II" by Frank Sommers was originally published by JavaWorld (www.javaworld.com), copyright IDG, July 2001. Reprinted with permission.
http://www.javaworld.com/javaworld/jw-07-2001/jw-0720-jiniology.html

About the author

Frank Sommers is founder and CEO of AutoSpaces.com, a startup focusing on bringing Jini technology to the automotive software market. He also serves as vice president of technology at Nowcom Corp., an outsourcing firm based in Los Angeles. He has been programming in Java since 1995, after attending the first public demonstration of the language on the Sun Microsystems campus in November of that year. His interests include parallel and distributed computing, the discovery and representation of knowledge in databases, as well as the philosophical foundations of computing. When not thinking about computers, he composes and plays piano, studies the symphonies of Gustav Mahler, or explores the writings of Aristotle and Ayn Rand. Sommers would like to thank Jungwon Jin, aka Nugu, for her tireless support and unfailing belief.