The Artima Developer Community

Designing Distributed Systems
A Conversation with Ken Arnold, Part III
by Bill Venners
October 23, 2002


So what are those recovery strategies? J2EE (Java 2 Platform, Enterprise Edition) and many distributed systems use transactions. Transactions say, "I don't know if you received it, so I am forcing the system to act as if you didn't." I will abort the transaction. Then if you are down, you'll come up a week from now and you'll be told, "Forget about that. It never happened." And you will.

Transactions are easy to understand: I don't know if things failed, so I make sure they failed and I start over again. That is a legitimate, straightforward way to deal with failure. It is not a cheap way, however.
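The idea of turning an unknown outcome into a known failure can be sketched in a few lines. This is a minimal illustration, not J2EE code: the `Receiver` class, the `send` function, and the `delivered` and `ack_arrived` flags are all hypothetical names invented here to simulate a lost acknowledgment.

```python
class Receiver:
    """Hypothetical remote party that stages work until told the outcome."""

    def __init__(self):
        self.applied = []    # work that really happened
        self.staged = {}     # txn_id -> work awaiting an outcome

    def receive(self, txn_id, work):
        self.staged[txn_id] = work

    def commit(self, txn_id):
        self.applied.append(self.staged.pop(txn_id))

    def abort(self, txn_id):
        # "Forget about that. It never happened."
        self.staged.pop(txn_id, None)


def send(receiver, txn_id, work, delivered, ack_arrived):
    """Send work; if the acknowledgment is missing, force a known failure."""
    if delivered:
        receiver.receive(txn_id, work)
    if delivered and ack_arrived:
        receiver.commit(txn_id)
        return "committed"
    # The sender cannot tell "never delivered" from "delivered, ack lost",
    # so it aborts, and both cases end in the same state.
    receiver.abort(txn_id)
    return "aborted"
```

Whether the message was lost or only its acknowledgment was, the receiver ends up with nothing staged and nothing applied, which is what lets the sender start over safely.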

Transactions tend to require multiple players, usually at least one more player than the number of transaction participants, including the client. And even if you can optimize out the extra player, there are still more messages that say, "Am I ready to go forward? Do you want to go forward? Do you think we should go forward? Yes? Then I think it's time to go forward." All of those messages have to happen.
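To make the message overhead concrete, here is a rough sketch of a textbook two-phase commit run by a separate coordinator (the "extra player"). The `Participant` class and `two_phase_commit` function are hypothetical, and counting four messages per participant (prepare, vote, decision, acknowledgment) is an assumption of this simplified model, not a measurement of any real transaction manager.

```python
class Participant:
    """Hypothetical transaction participant."""

    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: "Are you ready to go forward?"
        self.state = "prepared" if self.will_commit else "aborted"
        return self.will_commit

    def finish(self, commit):
        # Phase 2: "Then it's time to go forward" (or not).
        self.state = "committed" if commit else "aborted"


def two_phase_commit(participants):
    """Coordinator logic; returns (decision, number of messages exchanged)."""
    messages = 0
    votes = []
    for p in participants:
        votes.append(p.prepare())
        messages += 2    # prepare request + vote reply
    decision = all(votes)
    for p in participants:
        p.finish(decision)
        messages += 2    # decision + acknowledgment
    return decision, messages
```

In this model, three participants cost twelve messages on top of the work itself, and a single "no" vote aborts everyone.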

And even with a two-phase commit, there are some small windows that can leave you in ambiguous states. A human being eventually has to intervene and say, "You know, that thing did go away and it's never coming back. So don't wait." Say you have three participants in a transaction. Two of them agree to go forward and are waiting to be told to go. But the third one crashes at an inopportune time before it has a chance to vote, so the first two are stuck. There is a small window there. I think it has been proven that no matter how many phases you add, you can't make that window go away. You can only narrow it slightly.
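The stuck state in the three-participant example can be modeled as follows. This is a simplified sketch under stated assumptions: the `Participant` class, the `"in_doubt"` state name, and `operator_resolve` are hypothetical, and the model has no timeouts, so a participant that has voted yes waits until something outside it supplies the outcome.

```python
class Participant:
    """Hypothetical participant that cannot decide alone after voting yes."""

    def __init__(self, name):
        self.name = name
        self.state = "active"

    def vote_yes(self):
        # Having promised to commit, it must now wait for the outcome.
        self.state = "in_doubt"

    def receive_decision(self, commit):
        self.state = "committed" if commit else "aborted"

    def can_proceed(self):
        return self.state in ("committed", "aborted")


def operator_resolve(participants, commit=False):
    """Manual intervention: a human tells in-doubt participants the outcome."""
    resolved = []
    for p in participants:
        if p.state == "in_doubt":
            p.receive_decision(commit)
            resolved.append(p.name)
    return resolved


a, b, c = Participant("a"), Participant("b"), Participant("c")
a.vote_yes()
b.vote_yes()
c.state = "crashed"    # crashes before it has a chance to vote

stuck_before = [p.name for p in (a, b) if not p.can_proceed()]
resolved = operator_resolve([a, b, c], commit=False)
```

Here `a` and `b` both appear in `stuck_before`, and they only move on once `operator_resolve` plays the part of the human saying "don't wait."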

So the transaction approach isn't perfect, although those kinds of problems happen rarely. Maybe instead of one in ten to the thirteenth, the probability is one in ten to the thirtieth. Maybe you can ignore it, I don't know. But that window is certainly a worry.

The main point about transactions is that they have overhead. You have to create the transaction and you have to abort it. One of the things that a J2EE container does for you is hide a lot of that from you. Most things just know that there's a transaction around them. If somebody thinks it should be aborted, it will be aborted. But most things don't have to participate very directly in aborting the transaction. That makes it simpler.
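The way a container hides transaction handling can be approximated with a small wrapper. This is only an analogy in Python, not the J2EE API: `Transaction`, `set_rollback_only`, and `managed_transaction` are hypothetical names, and the `log` list stands in for whatever the container would really do at begin, commit, and abort.

```python
from contextlib import contextmanager


class Transaction:
    """Hypothetical transaction handle passed to business code."""

    def __init__(self):
        self.status = "active"
        self.rollback_only = False

    def set_rollback_only(self):
        # Any code inside the transaction may vote to abort it.
        self.rollback_only = True


@contextmanager
def managed_transaction(log):
    """Plays the container: begins, then commits or aborts on the way out."""
    txn = Transaction()
    log.append("begin")
    try:
        yield txn
    except Exception:
        txn.status = "aborted"
        log.append("abort")
        raise
    if txn.rollback_only:
        txn.status = "aborted"
        log.append("abort")
    else:
        txn.status = "committed"
        log.append("commit")
```

Code inside the `with managed_transaction(log) as txn:` block only does its work and, at most, calls `txn.set_rollback_only()`. Creating the transaction and carrying out the abort stay in the wrapper.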

