Full contact Java Programming
Scalability Is Not An Architectural Capability Anymore
by Josh Long
February 2, 2009

Summary
In a SOA world, "dynamic scalability" is an architectural concern that dictates the terms of the design to which it is ascribed.

When is scalability more than just an architectural "capability" and a fundamental aspect of the design?

In the new SOA world, scalability must be addressed at the outset. No vendor's all-enabling configuration screen will cross this architectural chasm for you. This is because SOA architectures, in particular grid architectures, work well as architectures completely contrary to current day methodologies. This is mainly centered around the aversion of enterprise "state." So-called share-nothing architectures and modern day languages allude to this requirement by making it darned uncomfortable or outright impossible to share context state in any way other than message passing.

Languages like Scala and Haskell and Erlang reflect this: in those languages programming by side-effects is very cumbersome. Some architectures take this to the extreme, demoting the semaphore required to implement a transactional lock a luxury, rather than a necessity. Witness the "BASE" architectural style, wherein architectures are built to be "eventually consistent," essentially embracing - with some fallbacks in place - inconsistent views of state as part of the day-to-day workings of an architecture.

Architecting for Scalability is a drastic departure from "normal" development. Naturally, there's always going to be some cohesion, even in the most loosely coupled of components. This cohesion is a factor of the friction incurred in communicating with other components.

Case in point:

Grid processing is efficient except to the extent that network latency causes chatter. Say for example the reduce step in Map/Reduce.
Network file systems can be mind numbingly fast SSDD or RAID, they're only as fast as the mesh that bridges them, which might be something like fiber or Gigabit.
Data clusters are only as fast as their replication mechanism. Terracotta has an edge here, as it broadcasts Java Virtual Machine-level memory deltas, instead of invalidating entire object graphs and forcing a reread.
You can't test a proper grid using the same metrics as before. At best, you can extrapolate. It's almost impossible (certainly infeasible) to effectively "stage" a grid infrastructure "reproducing 1,000 nodes. What happens when Production expands to 10,000 nodes?

It is - and has been - true that you should architect for the worst load your application is likely to incur. If this is right during the holidays, then architect for the holiday rush. If this is at the New Year when business starts again, then architect for that load. The performance profile of your application may be very different under those circumstances. Cloud infrastructure like Amazon's EC2 can help with this by mitigating the cost of expanding infrastructure. To use things like EC2, you need to architect for almost zero-cohesion among your components. It should be as easy - with a trivial degradation in performance per node added - to double your capacity. This quality - of being able to expand with almost no impact on the architecture - is called "dynamic scalability."

The enterprise world has long dealt with the notion of dynamic scalability on scales so small as to be unrecognizable compared to today's grid topologies. One example of this is with messaging. The classic hub-spoke architecture of, for example, a JMS Topic whose messages are popped off and processed by clients is a perfect example. Scalability is achieved - to an extent - by merely adding new clients. There seems to be a general movement towards multi-threading. This is a positive - if shortsighted - step. It is a microcosm of the same driving forces behind the distributed computing advancements of tomorrow: Scalability is not an architectural "capability" anymore.

Talk Back!

Have an opinion? Readers have already posted 2 comments about this weblog entry. Why not add yours?

RSS Feed

If you'd like to be notified whenever Josh Long adds a new entry to his weblog, subscribe to his RSS feed.

Digg |

del.icio.us |

About the Blogger

Josh Long (http://www.joshlong.com) is an enterprise architect, consultant, and author. When he's not hacking on code, he can be found at the local Java User Group or at the local coffee shop. Josh likes solutions that push the boundries of the technologies that enable them. His interests include scalability, BPM, grid processing, mobile computing and so-called "smart" systems.


	Web Artima.com