Cool Tools and Other Stuff
Messaging is the Right Way to Build a Distributed System
by Eric Armstrong
July 26, 2005

Summary
A message-based design is fundamentally the right way to think about building a distributed system, as opposed to code sharing, remote procedure calls, and the like. This article explains why.

I read an article last year that said the complexity of J2EE resulted from the basic architectural designs it's intended for--architectures that rely on distributed code, where you pass objects around the network and invoke methods on objects that reside on other systems. The article's premise was that the best way to build a distributed system is to use messaging--to use a protocol-based design, rather than an API-based design. If I ever find that article again, I'll post a pointer to it. It was a good one. It got me thinking...

I knew J2EE was complex. For me, it was too complex. The simplifications and ease-of-use improvements coming up in the near future will hide a lot of that complexity so the developer doesn't have to deal with it (see J2EE Ease of Use)--but the complexity will still exist. So I've been wondering for a while now: What is it that makes a loosely-coupled, message-based design so much better?

I recently spent some time with Bill Venners and Frank Sommers of Artima. At the time, they were discussing that very question. Something about the conversation made my thoughts gel, and things sort of clicked into place. This post shares those thoughts.

To my mind, the factors that tend to make a message-based design clearly superior most of the time are:

Transparency
Testability
Immunity & Evolution
Interoperability & Evolution
Stateless Scalability

Transparency #

When you're sending messages, it's easy to inspect them--so it's easy to see which messages that are being passed. That makes debugging a whole lot easier.

Testability #

To test a module, you send it a message. You can then inspect the message coming back to compare the results. You can do your unit testing properly, in other words, testing each module in isolation. In an object-sharing system, on the other hand, you have two choices. One choice is to wait until everything is running to start testing--which means you have a host of bugs waiting to be found, all of which are deep in the code and complex. That's horrible.

The other choice is to build a unit testing framework that lets you plug in a real module, surrounding it with a framework and feeding it mock objects you can use to test with. Setting all that up can be nearly as much work as the application itself.

Immunity and Evolution #

You can add additional content to an XML message at will. The receiving code doesn't even need to know about it. So you can upgrade the sending end of the system without necessarily affecting the receiving end. When you're passing objects around on the other hand, that's frquently impossible to do--unless the "objects" are simply messages in disguise as, for example, a hash map that contains key/value pairs.

You can even "immunize" the system against drastic change by providing a wrapper that directs messages in the old format to an older version of the module, and messages in a newer format to the new version of the module. That kind of implementation makes the system even more "loosely coupled", which lets you upgrade the system organically, one part of a time. That's an important aspect of a system design, whether you're a wholesaler with 120 clients scattered around the globe, or a corporate resource that's providing services to a dozen different departments. When you update your main system to provide new capabilities, you want let your clients to continue to access the old capabilities until they get around to upgrading. That gives your system the ability to evolve in a somewhat organic fashion.

In a distributed-object system, on the other hand, you're pretty well forced to upgrade everyone on the same timetable. Then you throw a giant switch and hope the new system works as well as the old one did--and that people will like it. (With the message-based system, they could always go back to the old way of doing things, if they needed to. With the distributed-object implementation, they don't get that choice.)

Interoperability and Evolution #

If you're passing XML messages around--which are essentially a glorifed version of ASCII text--there is no requirement whatever the different parts of the system be implemented in the same language, on the same operating system, or even on the same platform. There is no need particular need to make sure that the systems are "compatible", beyond a minimal ability to communicate. That's interoperability.

And if you decide you want to use a different language, operating system, or platform, you're free to do so. You can convert part of the system and stop there, or do the whole enchilada eventually--one module at a time. Once again, you're free to evolve the system in directions you choose.

Stateless Scalability #

Object-construction is relatively expensive. And object by it's very nature preserves state. So a distributed-object design will tend to use state-smart objects. But objects consume resources, and when you have a lot of state-smart objects, most of the time they're just sitting around waiting to be poked so they can take the next step, whatever that turns out to be. On top of that, you have the problem of distributed garbage collection. Garbage collection by itself is hard enough. Distributed garbage collection just has to be a nightmare.

If you're passing messages, on the other hand--for instance, XML messages, the "state" information the receiving object needs amounts to a fistful of extra bytes. So the objects doing the work on the receiving end don't have to record any state information. That's statelessness.

Although transmission time is marginally longer with such a design, the overall load on the system is drastically reduced. So the system tends to keep working when the transaction volume is immense. That's scalability.

Conclusion #

For a distributed system, a protocol-based, message-based design is almost uniformly an improvement over a distributed-object design, for good reason: transparency, testability, immunity and evolution, as well as stateless scalability. The NetBeans collaboration APIs described in JavaOne 2005, Day 3: Share the News! show how clean and simple a distributed system can be when it's message based.

Resources #

J2EE Ease of Use
http://www.artima.com/weblogs/viewpost.jsp?thread=117184#J2EEeaseOfUse
NetBeans collaboration APIs: JavaOne 2005, Day 3: Share the News!
http://www.artima.com/weblogs/viewpost.jsp?thread=117079

Talk Back!

Have an opinion? Readers have already posted 50 comments about this weblog entry. Why not add yours?

RSS Feed

If you'd like to be notified whenever Eric Armstrong adds a new entry to his weblog, subscribe to his RSS feed.

Digg |

del.icio.us |

About the Blogger

Eric Armstrong has been programming and writing professionally since before there were personal computers. His production experience includes artificial intelligence (AI) programs, system libraries, real-time programs, and business applications in a variety of languages. He works as a writer and software consultant in the San Francisco Bay Area. He wrote The JBuilder2 Bible and authored the Java/XML programming tutorial available at http://java.sun.com. Eric is also involved in efforts to design knowledge-based collaboration systems.


	Web Artima.com