Summary
In an SOA world, "dynamic scalability" is an architectural concern that dictates the terms of the design to which it is ascribed.
When does scalability stop being just an architectural "capability" and become a fundamental aspect of the design?
In the new SOA world, scalability must be addressed at the outset. No vendor's all-enabling configuration screen will cross this architectural chasm for you. This is because SOA architectures, and grid architectures in particular, work well precisely because they run contrary to current-day methodologies. The difference centers on an aversion to shared enterprise "state." So-called share-nothing architectures and modern languages allude to this requirement by making it darned uncomfortable, or outright impossible, to share context state in any way other than message passing.
Languages like Scala, Haskell, and Erlang reflect this: in those languages, programming by side effects is very cumbersome. Some architectures take this to the extreme, demoting the semaphore required to implement a transactional lock to a luxury rather than a necessity. Witness the "BASE" architectural style, wherein architectures are built to be "eventually consistent," essentially embracing - with some fallbacks in place - inconsistent views of state as part of the day-to-day workings of an architecture.
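To make that concrete, here is a minimal Scala sketch of the style those languages encourage: state lives in immutable values, and every change arrives as a message that is folded into a new value rather than applied as an in-place mutation. The Account/Deposit/Withdraw names are purely illustrative, not something from this article.

```scala
// State is never mutated; each command produces a brand-new value.
final case class Account(balance: BigDecimal)

sealed trait Command
final case class Deposit(amount: BigDecimal)  extends Command
final case class Withdraw(amount: BigDecimal) extends Command

object StateByMessages {
  // A pure transition function: old state + message => new state.
  def update(state: Account, cmd: Command): Account = cmd match {
    case Deposit(a)  => state.copy(balance = state.balance + a)
    case Withdraw(a) => state.copy(balance = state.balance - a)
  }

  def main(args: Array[String]): Unit = {
    val messages = List(Deposit(BigDecimal(100)), Withdraw(BigDecimal(30)), Deposit(BigDecimal(5)))
    // No semaphores or shared locks: replaying the same messages on any
    // node yields the same state, which is what makes "eventually
    // consistent" designs workable.
    println(messages.foldLeft(Account(BigDecimal(0)))(update)) // Account(75)
  }
}
```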
Architecting for scalability is a drastic departure from "normal" development. Naturally, there's always going to be some cohesion, even in the most loosely coupled of components. That cohesion is a function of the friction incurred in communicating with other components.
Case in point:
Grid processing is efficient except to the extent that chatty communication over the network incurs latency. Take, for example, the reduce step in Map/Reduce (a sketch follows this list).
Network file systems can be built on mind-numbingly fast SSD or RAID storage, but they're only as fast as the mesh that bridges them, which might be something like fiber or Gigabit Ethernet.
Data clusters are only as fast as their replication mechanism. Terracotta has an edge here, as it broadcasts Java Virtual Machine-level memory deltas, instead of invalidating entire object graphs and forcing a reread.
You can't test a proper grid using the same metrics as before. At best, you can extrapolate. It's almost impossible (certainly infeasible) to effectively "stage" a grid infrastructure reproducing 1,000 nodes. What happens when Production expands to 10,000 nodes?
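Since both the Map/Reduce point and the testing point above turn on where the network gets involved, here is a deliberately tiny, single-process Scala sketch of word counting in the Map/Reduce shape. The partition data and names are hypothetical; the comments mark the step that, on a real grid, becomes the cross-node shuffle you cannot cheaply stage or measure in advance.

```scala
// Single-process stand-in for a grid: each inner list plays one node's partition.
object WordCountSketch {
  val partitions: List[List[String]] = List(
    List("grid nodes share nothing"),
    List("nodes pass messages", "messages carry state")
  )

  // Map phase: purely local to a partition, so it scales by adding nodes.
  def mapPhase(lines: List[String]): List[(String, Int)] =
    lines.flatMap(_.split("\\s+")).map(word => (word, 1))

  // Shuffle + reduce phase: every occurrence of a key must be brought
  // together across partitions; on a real grid this is the network-chatty step.
  def reducePhase(mapped: List[(String, Int)]): Map[String, Int] =
    mapped.groupBy(_._1).map { case (word, pairs) => word -> pairs.map(_._2).sum }

  def main(args: Array[String]): Unit = {
    val mapped  = partitions.flatMap(mapPhase) // local work on each "node"
    val reduced = reducePhase(mapped)          // cross-node aggregation
    reduced.foreach { case (word, count) => println(s"$word -> $count") }
  }
}
```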
It is - and has been - true that you should architect for the worst load your application is likely to incur. If that peak comes during the holidays, then architect for the holiday rush. If it comes at the New Year, when business starts up again, then architect for that load. The performance profile of your application may be very different under those circumstances. Cloud infrastructure like Amazon's EC2 can help by mitigating the cost of expanding infrastructure. To use things like EC2, though, you need to architect for almost zero cohesion among your components: it should be as easy as adding nodes - with only a trivial degradation in performance per node added - to double your capacity. This quality - being able to expand with almost no impact on the architecture - is called "dynamic scalability."
The enterprise world has long dealt with the notion of dynamic scalability, albeit on scales so small as to be unrecognizable compared to today's grid topologies. Messaging is one example: in the classic hub-and-spoke architecture of, say, a JMS Topic whose messages are popped off and processed by clients, scalability is achieved - to an extent - merely by adding new clients. There is also a general movement towards multi-threading. This is a positive - if shortsighted - step, and a microcosm of the same driving forces behind the distributed computing advancements of tomorrow: scalability is not just an architectural "capability" anymore.
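As a concrete illustration of that hub-and-spoke pattern, here is a minimal Scala sketch of a JMS client whose capacity grows simply by starting more copies of it. The JNDI names ("jms/ConnectionFactory", "jms/WorkQueue") are assumptions, and the sketch uses a queue of competing consumers rather than a topic so that added clients actually share the load; with a topic, each subscriber would see every message.

```scala
import javax.jms.{Connection, ConnectionFactory, Destination, Message,
                  MessageListener, Session, TextMessage}
import javax.naming.InitialContext

object CompetingConsumer {
  def main(args: Array[String]): Unit = {
    // Look up provider-specific objects by (assumed) JNDI names.
    val jndi    = new InitialContext()
    val factory = jndi.lookup("jms/ConnectionFactory").asInstanceOf[ConnectionFactory]
    val queue   = jndi.lookup("jms/WorkQueue").asInstanceOf[Destination]

    val connection: Connection = factory.createConnection()
    val session: Session       = connection.createSession(false, Session.AUTO_ACKNOWLEDGE)
    val consumer               = session.createConsumer(queue)

    // Every process running this main() is one more consumer competing for
    // messages; "scaling" is just starting additional copies of it.
    consumer.setMessageListener(new MessageListener {
      override def onMessage(message: Message): Unit = message match {
        case text: TextMessage => println(s"processed: ${text.getText}")
        case other             => println(s"skipping non-text message: $other")
      }
    })

    connection.start() // begin delivery; the listener runs on a provider thread
  }
}
```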
Grid processing is efficient except to the extent that chatty communication over the network incurs latency. Take, for example, the reduce step in Map/Reduce.
Very true. I'd take this a step further and question whether Map/Reduce is even a decent abstraction for distributed programming to begin with. To me, while it may be workable for processing large files in a global file system, it seems too low-level to be used as a general-purpose abstraction.
Data clusters are only as fast as their replication mechanism.
No. Looking at data in a clustered environment in the same way that you look at a client/server database model is a losing proposition to start with. In a large-scale environment, if you have to move the data, you lose. (Hence why Map/Reduce exists in the first place, for example.)
Terracotta has an edge here, as it broadcasts Java Virtual Machine-level memory deltas, instead of invalidating entire object graphs and forcing a reread.
Again, you're talking about client/server technology. Terracotta is a server (basically a "remote heap") that a whole bunch of JVMs attach to in order to coordinate their work. Suffice it to say that a shared-everything heap is not a scalable solution.
You can't test a proper grid using the same metrics as before. At best, you can extrapolate. It's almost impossible (certainly infeasible) to effectively "stage" a grid infrastructure reproducing 1,000 nodes. What happens when Production expands to 10,000 nodes?
Manageability of even a few hundred nodes is extremely difficult with traditional tools. I work on the Oracle Coherence product, which powers many of the largest ecommerce and other transactional web sites, so I get to see a lot of deployment environments in the thousands and tens of thousands of servers, and management of these large-scale environments is still the biggest challenge out there. Various vendors will tell you stories about infrastructure-management nirvana, but they're flat-out lying (and the bigger the claims, the more likely that they don't have any actual / significant deployments).
It is - and has been - true that you should architect for the worst load your application is likely to incur.
While that is true, you also have to pick your bottlenecks. As I tell people, you either pick your bottlenecks or they pick you. In other words, when you turn up the load, you're going to hit a bottleneck, so make sure you design so that the bottleneck occurs in the type of resource that you can actually scale.
For EC2 (as one example), that means that you need to bottleneck on virtual CPU and RAM, since the network is relatively unpredictable, yet you can add a large number of VMs quite easily.