The Artima Developer Community
Sponsored Link

Weblogs Forum
Messaging is the Right Way to Build a Distributed System

50 replies on 4 pages. Most recent reply: Aug 22, 2006 4:12 AM by D. A.

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 50 replies on 4 pages [ 1 2 3 4 | » ]
Eric Armstrong

Posts: 207
Nickname: cooltools
Registered: Apr, 2003

Messaging is the Right Way to Build a Distributed System (View in Weblogs)
Posted: Jul 25, 2005 6:00 PM
Reply to this message Reply
Summary
A message-based design is fundamentally the right way to think about building a distributed system, as opposed to code sharing, remote procedure calls, and the like. This article explains why.
Advertisement

I read an article last year that said the complexity of J2EE resulted from the basic architectural designs it's intended for--architectures that rely on distributed code, where you pass objects around the network and invoke methods on objects that reside on other systems. The article's premise was that the best way to build a distributed system is to use messaging--to use a protocol-based design, rather than an API-based design. If I ever find that article again, I'll post a pointer to it. It was a good one. It got me thinking...

I knew J2EE was complex. For me, it was too complex. The simplifications and ease-of-use improvements coming up in the near future will hide a lot of that complexity so the developer doesn't have to deal with it (see J2EE Ease of Use)--but the complexity will still exist. So I've been wondering for a while now: What is it that makes a loosely-coupled, message-based design so much better?

I recently spent some time with Bill Venners and Frank Sommers of Artima. At the time, they were discussing that very question. Something about the conversation made my thoughts gel, and things sort of clicked into place. This post shares those thoughts.

To my mind, the factors that tend to make a message-based design clearly superior most of the time are:

Transparency #

When you're sending messages, it's easy to inspect them--so it's easy to see which messages that are being passed. That makes debugging a whole lot easier.

Testability #

To test a module, you send it a message. You can then inspect the message coming back to compare the results. You can do your unit testing properly, in other words, testing each module in isolation. In an object-sharing system, on the other hand, you have two choices. One choice is to wait until everything is running to start testing--which means you have a host of bugs waiting to be found, all of which are deep in the code and complex. That's horrible.

The other choice is to build a unit testing framework that lets you plug in a real module, surrounding it with a framework and feeding it mock objects you can use to test with. Setting all that up can be nearly as much work as the application itself.

Immunity and Evolution #

You can add additional content to an XML message at will. The receiving code doesn't even need to know about it. So you can upgrade the sending end of the system without necessarily affecting the receiving end. When you're passing objects around on the other hand, that's frquently impossible to do--unless the "objects" are simply messages in disguise as, for example, a hash map that contains key/value pairs.

You can even "immunize" the system against drastic change by providing a wrapper that directs messages in the old format to an older version of the module, and messages in a newer format to the new version of the module. That kind of implementation makes the system even more "loosely coupled", which lets you upgrade the system organically, one part of a time. That's an important aspect of a system design, whether you're a wholesaler with 120 clients scattered around the globe, or a corporate resource that's providing services to a dozen different departments. When you update your main system to provide new capabilities, you want let your clients to continue to access the old capabilities until they get around to upgrading. That gives your system the ability to evolve in a somewhat organic fashion.

In a distributed-object system, on the other hand, you're pretty well forced to upgrade everyone on the same timetable. Then you throw a giant switch and hope the new system works as well as the old one did--and that people will like it. (With the message-based system, they could always go back to the old way of doing things, if they needed to. With the distributed-object implementation, they don't get that choice.)

Interoperability and Evolution #

If you're passing XML messages around--which are essentially a glorifed version of ASCII text--there is no requirement whatever the different parts of the system be implemented in the same language, on the same operating system, or even on the same platform. There is no need particular need to make sure that the systems are "compatible", beyond a minimal ability to communicate. That's interoperability.

And if you decide you want to use a different language, operating system, or platform, you're free to do so. You can convert part of the system and stop there, or do the whole enchilada eventually--one module at a time. Once again, you're free to evolve the system in directions you choose.

Stateless Scalability #

Object-construction is relatively expensive. And object by it's very nature preserves state. So a distributed-object design will tend to use state-smart objects. But objects consume resources, and when you have a lot of state-smart objects, most of the time they're just sitting around waiting to be poked so they can take the next step, whatever that turns out to be. On top of that, you have the problem of distributed garbage collection. Garbage collection by itself is hard enough. Distributed garbage collection just has to be a nightmare.

If you're passing messages, on the other hand--for instance, XML messages, the "state" information the receiving object needs amounts to a fistful of extra bytes. So the objects doing the work on the receiving end don't have to record any state information. That's statelessness.

Although transmission time is marginally longer with such a design, the overall load on the system is drastically reduced. So the system tends to keep working when the transaction volume is immense. That's scalability.

Conclusion #

For a distributed system, a protocol-based, message-based design is almost uniformly an improvement over a distributed-object design, for good reason: transparency, testability, immunity and evolution, as well as stateless scalability. The NetBeans collaboration APIs described in JavaOne 2005, Day 3: Share the News! show how clean and simple a distributed system can be when it's message based.

Resources #


Peter Booth

Posts: 62
Nickname: alohashirt
Registered: Aug, 2004

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 25, 2005 7:09 PM
Reply to this message Reply
Agreed there's an appeal to the transparency of text based messages. But XML really isn't that human readable, and the downside of this is that it's a lowest common denominator, IDL re-envisioned approach.

Yes a very real downside of the remote object approach is that it couples
deployment of client and server. But the alternative is an approach that discourages refactoring / simplifying of of service APIs yet service APIs are in many ways the most important and the ones that we really should be able to refactor as we develop our understanding of the problem domain.

I think the XML messaging approach is wonderful for Amazon, or Ebay, where the service author has thousands of service users with disparate technology platforms. But when I come across this between systems in the same organization it is frequently more about firming up organizational boundaries than meeting any business or technical requirement. Conway's Law continues to be true, alas, http://www.jargon.net/jargonfile/c/ConwaysLaw.html

Eric Armstrong

Posts: 207
Nickname: cooltools
Registered: Apr, 2003

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 26, 2005 5:30 AM
Reply to this message Reply
> Yes a very real downside of the remote object approach is
> that it couples
> deployment of client and server. But the alternative is an
> approach that discourages refactoring / simplifying of of
> service APIs yet service APIs are in many ways the most
> important and the ones that we really should be able to
> refactor as we develop our understanding of the problem
> domain.
>
Interesting point. The next version of J2EE will allow refactoring across the application--including Java beans, since they become Plain Old Java Objects (POJOs) in that version. As you indicate, there is benefit that can be derived from that.

It seems obvious that there must be some components of a system which *should* be strongly coupled. The alternative is a SmallTalk approach where everything is truly a "message". It's something like a stereo system. Any one module is a single unit (tightly coupled), but the modules are wired together (loosely coupled). Figuring out the module boundaries is the hard part. After that, it's messaging between modules and invocations within them.

As for XML--actually, it *is* human readable. That is its real strength. That's what makes it preferable to binary document formats (which tend to be proprietary if only because they're impenetrable). It rapidly superceded the binary messaging formats that preceded it, largely because it is transparent both to humans (readable) and computers (parsable).

XML's one severe deficiency from the standpoint of generalized document editing is the inability to identify different entities as either "inline" or"structural", and to specify style sheet properties for inline entities. Were that ommission rectified, XML would soon replace plain text.

BTW: One of the interesting things in the Eclipse world that I never got around to reporting from JavaOne 2005 was the SSE intitiative--the Standard for Structured Editing. Personally, I think that's an idea who's time is overdue, and I'm glad to see things moving in that direction.

Jared Richardson

Posts: 1031
Nickname: jaredr
Registered: Jun, 2005

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 26, 2005 6:22 AM
Reply to this message Reply
>As for XML--actually, it *is* human readable.

Agreed. I've found that most people who dislike XML tend to write what I'm calling "inline XML" as opposed to properly formatted XML. They write the equivalent of obfuscated Perl code and then complain that it's not readable.

Great post.

todd hoff

Posts: 22
Nickname: thoff
Registered: Jan, 2004

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 26, 2005 6:46 AM
Reply to this message Reply
Protocols are the right way to build a distibutes system, but APIs are the right way to use a distrubted system. The first thing people will do is build an API over the protocol to make it easier for programmers to use. So I do both. Make the protocol message based AND provide a programmer API.

Steve Green

Posts: 1
Nickname: stevie6
Registered: Jul, 2005

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 26, 2005 8:22 AM
Reply to this message Reply
While I agree that Messaging is a very useful way to access remote functionality (services seems to the the buzzword at the moment) I think that using messaging brings its own problems.

The main problem that I come across is to do with transactions. I have been involved in a lot of systems that used messaging (both as a designer/architect of these systems and as a reviewer of other people's designs, and, a while ago now as a developer) and the bit that people always have difficulty understanding is how to deal properly with failure.

If you have a synchronous call (rpc or whatever) then it will either work or it won't (assuming no bugs - but that's another story). In Java it will throw an exception if it doesn't work. In the old days of C the return value would tell you. If it doesn't work then you have a choice - you can either retry or you can put the object that did the call into some sort of error state and leave it for someone or something else to sort out.

If however you are using an asynchronous messaging system then the number of failure scenarios are much greater: Maybe the message didn't get sent, maybe it got sent but you don't know it, maybe it got sent but not received, and so on and so on right through the loop from sender to receiver back to sender. To deal with this in a sensible way you really need a state machine so that your system can deal with resends, replies it didn't expect, duplicate replies etc. And of course the receiving end (the service you are calling) needs one too. This is fine if your object making the call has a state machine, or is part of a workflow anyway (which it often will be) but it is an overhead that needs to be considered if not.

Another thing to consider is if you are using JMS as your messaging service then you have all the problems involved with using an api to a remote service anyway.

I'm not saying that Messaging is a bad thing and shouldn't be used. As I said before, I have both designed and reviewed systems that make extensive use of messaging. All I'm saying is getting it right isn't easy and making developers understand the complexities and pitfalls is not easy either.

Maarten Hazewinkel

Posts: 32
Nickname: terkans
Registered: Jan, 2005

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 26, 2005 9:03 AM
Reply to this message Reply
> If however you are using an asynchronous messaging system
> then the number of failure scenarios are much greater:
> Maybe the message didn't get sent, maybe it got sent but
> you don't know it, maybe it got sent but not received, and
> so on and so on right through the loop from sender to
> receiver back to sender. To deal with this in a sensible
> way you really need a state machine so that your system
> can deal with resends, replies it didn't expect, duplicate
> replies etc. And of course the receiving end (the service
> you are calling) needs one too. This is fine if your
> object making the call has a state machine, or is part of
> a workflow anyway (which it often will be) but it is an
> overhead that needs to be considered if not.

This is eased greatly by using transactional message services. If they accept a message, you can count on it being delivered eventually. You don't know exactly when, especially in the face of severe system failures, but unless your message store database gets totally fubar-ed, it will be delivered.
And if one of your databases does get wiped, that will leave you in the deep in any system architecture.

Eric Armstrong

Posts: 207
Nickname: cooltools
Registered: Apr, 2003

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 26, 2005 10:36 AM
Reply to this message Reply
> Protocols are the right way to build a distibutes system,
> but APIs are the right way to use a distrubted system. The
> first thing people will do is build an API over the
> protocol to make it easier for programmers to use. So I do
> both. Make the protocol message based AND provide a
> programmer API.

>
What a fascinating idea. APIs and protocols, without
actually distributing any objects. Brilliant.

Eric Armstrong

Posts: 207
Nickname: cooltools
Registered: Apr, 2003

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 26, 2005 10:38 AM
Reply to this message Reply
> ...using an asynchronous messaging system
> then the number of failure scenarios are much greater:
> Maybe the message didn't get sent, maybe it got sent but
> you don't know it, maybe it got sent but not received, and
> so on and so on right through the loop from sender to
> receiver back to sender.
>
Todd Hoff's suggestion fits right in, here. The wrapper around the protocol does any receipt-verification and retrying that's required, and eventually generates an exception when it gives up.

Eric Armstrong

Posts: 207
Nickname: cooltools
Registered: Apr, 2003

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 26, 2005 10:40 AM
Reply to this message Reply
> > If however you are using an asynchronous messaging
> system
> > then the number of failure scenarios are much greater:
>
> This is eased greatly by using transactional message
> services. If they accept a message, you can count on it
> being delivered eventually...
>
Aha. Which have you used, or look interesting enough to recommend?

Peter Booth

Posts: 62
Nickname: alohashirt
Registered: Aug, 2004

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 26, 2005 8:39 PM
Reply to this message Reply
> > Protocols are the right way to build a distibutes
> system,
> > but APIs are the right way to use a distrubted system.
> The
> > first thing people will do is build an API over the
> > protocol to make it easier for programmers to use. So I
> do
> > both. Make the protocol message based AND provide a
> > programmer API.
>
> >
> What a fascinating idea. APIs and protocols, without
> actually distributing any objects. Brilliant.

Isn't that the first law of distributed programming? Avoid distribution where possible. I have seen many more "over distributed" systems than I have under-distributed perhaps because some believe that an extra tier will magically "add scalability."

Claudio Lyra

Posts: 2
Nickname: clyra
Registered: Jul, 2005

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 27, 2005 1:02 AM
Reply to this message Reply
Hi,

I really liked the concepts of immunity and organic evolution. I never heard those words before (I think I have to visit the ServerSide.com web site more frequently) but the ideas encapsulated on them make a lot of sense. Thanks for elevating my vocabulary!

Also, I also found great when you put that “some objects are in fact messages in disguise”, like hash maps that contains key/value pairs. My experience at work is exactly with a kind of system like this, where all modules communicate by means of hash tables. You debug every logical unit (or service) just by confronting the input and output “document” objects. In other words, you don’t need any “environment debugger” – a tool able to plug in the development engine and read the entity objects. Instead, any way available to intercept the messages is good enough! Just catch the messages (for real messages, how you do it depends on the underlying communication protocol; for messages in disguise, you can serialize the objects) and inspect them. No need of mocks objects or container frameworks!

The only remark I have to add is regarding your comment: “In a distributed-object system, on the other hand, you're pretty well forced to upgrade everyone on the same timetable”. Even for an API-based middleware, where the objects are “fat”, as long as the interfaces are respected, you still can write wrappers to mold the objects to the old format.

While I was writing down this comment I thought about what I was writing and I think I maybe realize your point: it is pretty easy to be in a situation where you are forced to alter the interfaces for a given fat object. There is where the wrappers go to hell. On the other hand, looking from this angle, messaging is a superior architectural choice.

Regards,
Claudio Lyra
bits and bytes cruncher, a.k.a. Software Engineer

Claudio Lyra

Posts: 2
Nickname: clyra
Registered: Jul, 2005

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 27, 2005 1:27 AM
Reply to this message Reply
After posting my own message I went through the other messages and I found very interesting points.

Two things:
a) First, on the concerns with transactionality, retries, logging and the alike: messaging does need an "application server". It does not come for free, either you build your own or you choose one from a vendor. The benefits of messaging will show up after that. In a corporate scenario (Conway's law), to keep an inventory of existing API libraries or message formats are both similarly difficult. I don't know a golden formula to solve it.

b) To develop both an API and a protocol based layers seems to be overkill. Anyway, each case is a case, and it might be necessary in specific situations. Anyway, it sounds more like the job for a "middleware vendor" than for regular "IT department" teams to do.

finally {
Someone has mentioned simplicity: it reminded of the Occam's razor: when faced with options choose the simplest one.
It also came up to my mind one of the XP tenets: don't add bells and whitles unless extremely necessarily. Keep the solution at the minimum.
}

Regards

José Ghislain Quenum

Posts: 2
Nickname: princeq128
Registered: May, 2005

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 27, 2005 2:42 AM
Reply to this message Reply
Using protocols instead of APIs is an easier way to publish contracts between a collection of objects.

For example, protocols are one of the techniques people use to design interactions between agents in a multi-agent system and it really scales to distributed and open systems.

There are still theoretical issues to address but using protocols will better do than API since you deal with autonomous entities whether active objects or agents

Eric Armstrong

Posts: 207
Nickname: cooltools
Registered: Apr, 2003

Re: Messaging is the Right Way to Build a Distributed System Posted: Jul 27, 2005 7:24 AM
Reply to this message Reply
> finally {
> Someone has mentioned simplicity: it reminded of the
> the Occam's razor: when faced with options choose the
> simplest one.
> It also came up to my mind one of the XP tenets: don't
> n't add bells and whitles unless extremely necessarily.
> Keep the solution at the minimum.
>
Right you are. I owe the agile guys big time for making a mantra out of this one. If I had back all the time I've spent coding "somebody might need this" features that never got used--and for which the hardware no longer exists even to run the program--I'd have been able to do something useful with my life...

Flat View: This topic has 50 replies on 4 pages [ 1  2  3  4 | » ]
Topic: Messaging is the Right Way to Build a Distributed System Previous Topic   Next Topic Topic: Upcoming Appearances

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use