Turn, and Face the Strange
Is Distribution really that bad?
by Calum Shaw-Mackay
April 14, 2004

Summary
In my inaugural weblog entry here on Artima, I question whether the logic about the First Law of Distributed Computing is really that relevant now.

Ha! I thought I'd throw out a really left-field, contentious issue for my first weblog.

I googled "First law of distributed" and came up with a few hits, as I suspected, for Martin Fowler's definition. If I have to be the only person to argue around this (not necessarily disagree, well, not in all cases), then so be it.

The first issue that springs to mind is: if you call something the "First", you tend to expect it to be followed by a "Second", "Next" or "Last". I couldn't find anything on those, so perhaps it should be renamed the "Only Law of Distributed Computing".

Pedantry aside, my main issue is whether distribution is always that bad. You can understand where the law comes from, when the qualification goes along the lines of "if you don't have to distribute, don't, because it costs".

I really do understand this when you're writing a system that has to do hundreds of transactions per second, but if you're not, is distribution such a big problem?

Facts of Life

My point comes from the fact that mankind has had to deal with distribution and its associated costs for millennia; we've never had everything that we need, during the entirety of our lives, right next to us.

If your car is running low on petrol, you go and get some more; it's highly unlikely that every person who has a car has a pump to fill it up just sitting there in their garden. When we need food, we might walk to the shop and buy some; this implies distribution and an associated cost, the energy consumed walking to the shop and back. In ages past, if you didn't hunt you didn't eat, yet the act of hunting consumed resources.

Most businesses have suppliers that are many miles away, perhaps even on other continents, and may serve customers that are also miles away; distribution doesn't stop them.

In other words, I believe distribution, and the cost of it, to actually be a natural entity, so why in computing do we tirelessly fight against it? As with many things in life, it all comes down to a matter of perspective.

The general consensus about why distributed objects are a Bad Thing is that it's more efficient to do things locally. Why take 20ms to do something between machines when it will only take 20ns to do it locally in a single VM?

Well, to go back to my previous analogies, would you want your house to actually be filled with every possible amenity you need in your life? Aisles of food from your local supermarket, coupled with the obligatory petrol pump in your back garden, and your desk and PC from your office? You wouldn't have enough space to have the contents of your entire life sat at the side of you.

Now back to the computing world: I might be able to run my entire software architecture out of one VM, and hey, it might be slow as hell, but... at least it's local.

Another point that always makes me wonder is that most of the people who quote the First Law quote it on the Internet, the biggest public distribution network in the world. Why? Because they want their opinion heard; thus distribution. But voicing this comment is not without cost: they have to log on to a site, load the pages, write their comment in an HTML form, submit it, etc.

So the point of distribution should not be "if it's more efficient to do it locally, then don't distribute it"; it should be "if it is more beneficial to distribute, then do it", or "if the cost of distributing is less than the benefit gained, what's the problem?"

You see, from my standpoint of doing systems integration for the majority of my work, I just can't move my database, workflow and mainframe on to one box, and happily let them co-exist.

If it takes the human eye between 300 and 400 milliseconds to blink, do you think it really matters to me that a Jini service call via RMI may take 20ms to complete? If a user is willing to wait 500ms for a response, I could make 25 such remote calls, one after another, before that user becomes even slightly aware of the cost of the function being performed.
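The arithmetic behind that figure can be written down as a small sketch. The class and method names here (LatencyBudget, callsWithinBudget) are made up for illustration, and the sketch assumes every call takes the same fixed time and that calls run strictly one after another, which real networks will not guarantee.

```java
import java.time.Duration;

/** Hypothetical helper: how many sequential calls fit in a response-time budget. */
final class LatencyBudget {

    private LatencyBudget() {}

    /**
     * Number of back-to-back calls that fit inside the budget.
     * Assumes a fixed cost per call and no parallelism (a simplification).
     */
    static long callsWithinBudget(Duration budget, Duration perCall) {
        if (perCall.isZero() || perCall.isNegative()) {
            throw new IllegalArgumentException("perCall must be positive");
        }
        return budget.toNanos() / perCall.toNanos();
    }

    public static void main(String[] args) {
        Duration budget = Duration.ofMillis(500); // what the user will wait (figure from the text)
        Duration remote = Duration.ofMillis(20);  // one remote call (figure from the text)
        System.out.println(callsWithinBudget(budget, remote)); // prints 25
    }
}
```

Plugging in the 20ns local figure quoted earlier gives 25 million calls in the same 500ms, which is the whole point: the question is not which number is bigger, but whether the smaller one is already enough for the job.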

Facts of Reality

Distribution, both in distributed systems and distributed objects, is a fact of system reality, whether we like it or not. And many more people work in a world where processes and systems have to integrate, than those that can build green-field systems or where everything is in one box.

Ambiguity is an undercurrent of linguistics. For example: "Doing something remotely is orders of magnitude slower than doing something in-process." Oh, that tells me it's slower. It's like the old argument, "C++ is x% faster than Java". Give me the real-time numbers for my given situation, because only that will tell me whether it really matters.

Performance is of course an issue, but only if it is in fact an issue in the scenario; don't just use the "Performance" card to dismiss the benefits of distribution offhandedly.
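Getting real numbers for a given situation need not be elaborate. What follows is a minimal sketch, not a proper benchmark: CallTimer and medianNanos are hypothetical names, the Thread.sleep call is only a stand-in for whatever local or remote operation you actually care about, and JIT warm-up and network variance are ignored.

```java
import java.util.Arrays;

/** Hypothetical micro-harness: time an operation several times, report the median. */
final class CallTimer {

    private CallTimer() {}

    /** Runs the operation `runs` times and returns the median elapsed time in nanoseconds. */
    static long medianNanos(Runnable operation, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            operation.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2]; // middle sample (upper median when runs is even)
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the real call being argued about.
        Runnable standIn = () -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        long median = medianNanos(standIn, 11);
        System.out.printf("median: %.2f ms%n", median / 1_000_000.0);
    }
}
```

The median is used rather than the mean so that one slow outlier does not dominate the answer. Compare the result against what the user will actually tolerate before deciding that distribution is too expensive.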

What is happening now?

I never really get the idea that distribution is only okay if you have a web browser that talks to an application server (running all your middle-tier) that connects to a database. I see talk of Service Oriented Architectures mainly with Web Services, which is okay for others; I've been doing SOA with Jini since 2000, and it hasn't seen me wrong yet, but your mileage may vary.

But a key differentiator with SOA, and grids, is that it is distributed (it doesn't have to be, but that kind of defeats the purpose): each application that provides a service can be on a different box, and the registry that allows connections to these services is also distributed (in general).

So if the next generation of applications and architectures being touted are distributed, what's made people change their minds? And seeing that they are changing their minds, don't you think that people need to actually think about distribution in a rational manner, rather than just dismissing it out of hand?

About the Blogger

Calum Shaw-Mackay is an architect on Java and Jini systems, working in the UK. His interests lie in distributed computing, adaptability, and abstraction. Calum has been using Jini for longer than he would care to mention. His main area for taking the blame (some people would call it 'expertise') is systems integration and distributed frameworks, and he is an advocate of using Jini's unique strengths to build adaptable enterprise systems. His opinions are his own. He's tried to get other people to take his opinions off him, but they just won't.

