From JavaOne 2009: Load-Testing Clustered Applications
Summary: Clustered applications scale in part by relying on the clustering environment to distribute workload. As a result, load-testing a clustered application must also test how well the clustering infrastructure handles growing workloads. Virtualization can make that task simpler, since virtual server instances allow you to mimic a large cluster environment on a handful of physical nodes. However, as Ari Zilka, co-founder of Terracotta, points out in this interview with Artima, stateful applications need special attention during load-testing, since the clustering infrastructure also needs to scale the management of application state. He discusses various techniques to load-test a stateful application on a cluster, and highlights features of the open-source Terracotta clustering environment that make such load-testing simpler. Finally, he talks about heavily disk-bound workloads, and how Terracotta FX can make them scale better on a cluster.
Frank
Posts: 135 / Nickname: fsommers / Registered: January 19, 2002 7:24 AM
From JavaOne 2009: Load-Testing Clustered Applications
June 10, 2009 7:00 AM      
In this interview, Terracotta co-founder Ari Zilka discusses various techniques to load-test a stateful application on a cluster, and highlights features of the open-source Terracotta clustering environment that make such load-testing even simpler:

http://www.artima.com/lejava/articles/javaone_2009_ari_zilka.html

What do you think of Terracotta's approach to load-testing clustered applications?
Cameron
Posts: 26 / Nickname: cpurdy / Registered: December 23, 2004 0:16 AM
Re: From JavaOne 2009: Load-Testing Clustered Applications
June 11, 2009 7:16 AM      
Considering some of the positive things that they've accomplished, Terracotta's need to bad-mouth others ("FUD") is always a bit surprising. Or maybe it's just Zilka ..

At any rate, regardless of vendor claims, anyone who is foolish enough to _assume_ that the performance and throughput that they obtain with two nodes will automatically scale linearly to 100 nodes deserves exactly what they get .. ;-)

Peace,

Cameron Purdy | Oracle
http://coherence.oracle.com/
Frank
Posts: 135 / Nickname: fsommers / Registered: January 19, 2002 7:24 AM
Re: From JavaOne 2009: Load-Testing Clustered Applications
June 11, 2009 10:21 AM      
Hi Cameron,

I would be interested in how others handle heavily I/O-bound jobs in a clustered environment: What if most of your queries, for instance, result in cache misses and need to hit the disk?

Also, in this interview, Ari's example was a 100GB database. In another Artima interview, Gil Tene from Azul suggests that you load your 100GB dataset into the memory of a single JVM, which can be accomplished using their special-purpose JVM.

I have two questions about this:

1. How would you approach this problem if you had a TB dataset? And if you had a petabyte of data?

Loading an entire dataset in RAM doesn't seem to take advantage of memory hierarchies, but simply pushes the problem into the most expensive type of memory. What would be the role of flash memory (which is cheaper than RAM), for instance, in this sort of application?


2. If the data is in an RDBMS, most RDBMS products are pretty good at caching in memory lots of data already (just as they're very good at taking advantage of multicore CPUs in querying that data, which is a topic for another discussion).

I haven't yet worked with a 100GB dataset, but I recently worked with about 70GB. I was able to configure the RDBMS (in that case, MS SQL Server, but I'm sure the same can be done with Oracle, DB2, etc.) to service queries mostly from RAM, and I got extremely good performance. Of course, I still incurred an O/R mapping overhead, but the application performed very well under heavy load. And I didn't have to worry about VM garbage collection, etc.

That, of course, wouldn't work with a TB or more of data, since that much RAM would be prohibitively expensive.

Currently, what are the rules of thumb in terms of when to consider caching data external to the database (at what data size, etc.)? And when, or when not, should one simply use a parallel RDBMS and configure it to cache lots of data? (I'm paraphrasing the question Stonebraker, DeWitt, et al. posed in a recent SIGMOD paper.)


Any suggestions would be appreciated.
Jags
Posts: 1 / Nickname: jramnara / Registered: June 15, 2009 8:09 AM
Re: From JavaOne 2009: Load-Testing Clustered Applications
June 15, 2009 1:26 PM      
Hi Frank,
Your question is addressed to Cameron, but I am sure you don't mind others chiming in :-)

>
> I have two questions about this:
>
> 1. How would you approach this problem if you had a TB
> dataset? And if you had a petabyte of data?
>
Obviously, even a TB in memory is cost-prohibitive, let alone a petabyte. The single biggest practical challenge you run into when dealing with a very large data set is that your deployment has to cope with the possibility that the entire cluster goes down. Can you imagine recovering a TB from some DB into your likely large cluster? The so-called continuous availability story becomes a myth.

In GemFire (https://community.gemstone.com/display/gemfire/GemFire+Enterprise), we use a "shared nothing" disk storage model where each node in the grid can also persist to disk with LRU eviction, so that the most frequently used data is always in memory. This works well irrespective of how much memory you have or can afford, and you can dynamically add more capacity (memory, CPU) on the fly and rebalance to take advantage of it.
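To make the LRU idea concrete (a toy sketch only, not GemFire's actual implementation), java.util.LinkedHashMap in access order gives you the eviction policy in a few lines; the eviction hook is where a real overflow-to-disk store would persist the evicted entry:

import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache sketch: a LinkedHashMap in access order evicts the
// least-recently-used entry once capacity is exceeded. In an
// overflow-to-disk design, the eviction hook would write the evicted
// entry to the node's local disk store instead of dropping it.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Hypothetical hook: persistToDisk(eldest) would go here.
        return size() > capacity;
    }
}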

> Loading an entire dataset in RAM doesn't seem to take
> advantage of memory hierarchies, but simply pushes the
> problem into the most expensive type of memory. What would
> be the role of flash memory (which is cheaper than RAM),
> for instance, in this sort of application?
>
There is now talk of DDR2-addressable flash memory, but for now, if the file system on each node in the cluster has flash available, then the overflow to disk and the access on a miss can hopefully be accelerated.

GemFire uses a somewhat novel approach to avoid disk seek times: we simply append all modifications to disk files and asynchronously coalesce those files. We can do this on each node in your cluster, dramatically increasing your throughput. You can read more about this in this white paper: https://community.gemstone.com/display/gemfire/EDF+Technical+White+Paper
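The append-only idea itself is easy to sketch (again, a toy illustration, not our actual storage engine): every modification is written sequentially to the end of a file, so writes never pay a seek, and a background task later compacts the files down to the latest value per key:

import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Append-only log sketch: modifications are appended sequentially
// (length-prefixed key and value), so the disk never seeks on writes.
// A background task would periodically coalesce these files, keeping
// only the most recent value for each key.
public class AppendOnlyLog implements AutoCloseable {
    private final FileOutputStream out;

    public AppendOnlyLog(String path) throws IOException {
        out = new FileOutputStream(path, true); // open in append mode
    }

    public synchronized void append(String key, byte[] value) throws IOException {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        out.write(intToBytes(k.length));
        out.write(k);
        out.write(intToBytes(value.length));
        out.write(value);
    }

    private static byte[] intToBytes(int n) {
        return new byte[] { (byte) (n >>> 24), (byte) (n >>> 16),
                            (byte) (n >>> 8), (byte) n };
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}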


>
> 2. If the data is in an RDBMS, most RDBMS products are
> pretty good at caching in memory lots of data already
> (just as they're very good at taking advantage of
> multicore CPUs in querying that data, which is a topic for
> another discussion).
>
They can all cache the data, but the design is still optimized for disk I/O, and the in-memory data structures don't necessarily compare well with what a concurrent hash map of objects can provide.

Just take Oracle, for instance. Isn't it still true that you make at least two hops before you can get to the data sitting in the SGA (shared memory)?
Cameron
Posts: 26 / Nickname: cpurdy / Registered: December 23, 2004 0:16 AM
Re: From JavaOne 2009: Load-Testing Clustered Applications
June 17, 2009 7:17 PM      
Hi Frank,

> I would be interested in how others handle heavily
> I/O-bound jobs in a clustered environment: What if most
> of your queries, for instance, result in cache misses and
> need to hit the disk?

I'd love to give you a simple answer, but the truth is that there are so many different possible answers depending on the scenario. For example, for applications that don't have to show transactionally up-to-date data, it's far better to cache common query result sets (something you can easily do without any commercial product and without any clustering).
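For illustration, caching a query result set can be as simple as a time-to-live map in front of whatever runs the real query (a minimal sketch; the TTL value and loader function are just placeholders):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// TTL-based query-result cache sketch: results are served from memory
// until they expire, trading a bounded staleness window for far fewer
// database round trips. Concurrent misses may load twice; that is an
// acceptable simplification for a sketch.
public class QueryResultCache<R> {
    private static final class Entry<R> {
        final R result;
        final long expiresAt;
        Entry(R result, long expiresAt) {
            this.result = result;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry<R>> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public QueryResultCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public R get(String query, Function<String, R> runQuery) {
        Entry<R> e = cache.get(query);
        if (e == null || System.currentTimeMillis() > e.expiresAt) {
            R fresh = runQuery.apply(query); // miss or expired: hit the database
            cache.put(query, new Entry<>(fresh, System.currentTimeMillis() + ttlMillis));
            return fresh;
        }
        return e.result; // hit: no database involved
    }
}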

Products like Oracle Coherence or Terracotta come into play when you have to coordinate among multiple servers, e.g. app servers. Coherence allows the app servers to talk directly to each other (peer to peer) in order to coordinate very efficiently. In Terracotta's case, each app server establishes a client/server connection to a separate "Terracotta server" (which uses Oracle Berkeley DB for its disk storage and I/O ;-), and thus those app servers all have to do the coordination through that central server.
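For what it's worth, the basic Coherence usage pattern from an app server looks like this (a minimal sketch; the cache name is hypothetical, and a real deployment would have a cache configuration behind it):

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

// Each JVM that calls CacheFactory joins the cluster as a peer; the
// NamedCache API hides the partitioning and peer-to-peer coordination.
public class CoherenceExample {
    public static void main(String[] args) {
        NamedCache orders = CacheFactory.getCache("orders"); // hypothetical cache name
        orders.put("order-1", "42 widgets");
        System.out.println(orders.get("order-1"));
        CacheFactory.shutdown(); // leave the cluster cleanly
    }
}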

> Also, in this interview, Ari's example was a 100GB
> database.

Depending on your budget, a 100GB database is easily cacheable. On the other hand, depending on your load, it may run fine without much caching (or tuning ;-) at all ....

> In another Artima interview, Gil Tene from Azul suggests
> that you load your 100GB dataset into the memory of a
> single JVM, which can be accomplished using their special-
> purpose JVM.

Yes, it's pretty interesting technology, and there are certain applications which basically require Azul because they have very large heaps and cannot tolerate long GCs.

> 1. How would you approach this problem if you had a TB
> dataset? And if you had a petabyte of data?

Terabyte data sets can fit in memory using data grid technology (e.g. Coherence), but that will likely be an expensive choice (about 35 servers, including full redundancy), so you'd only do that if the benefits of having a terabyte in-memory data grid were worth it. Banks use this type of architecture for systems that have lots of data and have to provide fast results (e.g. risk calcs, end-of-day processing, trading systems, etc.).
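(Back-of-the-envelope on that figure, and the per-node numbers are just assumptions: with roughly 64GB of RAM per commodity server and about 60GB of it usable for cache data, 1TB of primary data plus a full backup copy is 2TB in total, and 2TB / 60GB per node is about 34 nodes, which is how you land in the neighborhood of 35 servers.)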

There are many other ways to deal with TB data sets without putting them all into RAM, but obviously I spend most of my time helping customers put everything in RAM, so I'm hardly the database expert. The real question is, does the application have a data access pattern that is so scattered and so heavy that it all has to be in RAM? If not, then a database (likely coupled with a cache in front of the database) should likely be able to handle the load fairly easily.

I'll have to skip the petabyte question, because it's simply outside of my area of expertise. While I've worked with really big databases before, they often used different approaches for OLTP (e.g. Oracle) versus OLAP, and it seemed like there was still quite a bit of technical specialization and point products used to make these systems work well.

> Loading an entire dataset in RAM doesn't seem to take
> advantage of memory hierarchies, but simply pushes the
> problem into the most expensive type of memory. What
> would be the role of flash memory (which is cheaper
> than RAM), for instance, in this sort of application?

Flash memory seems to be an effective way to speed up databases, that's for sure. I'm not sure why it isn't more widely used at this point ....

> 2. If the data is in an RDBMS, most RDBMS products are
> pretty good at caching in memory lots of data already

Yes, but that's far different from the concept of a distributed cache. In a distributed cache, for example, when you ask for an object, you probably get it locally 99.9% of the time, and only have to go over the network 0.1% of the time to ask for it, and then you have it. With a database cache, when you ask for the same object, it goes to an ORM, which issues a query or three, which goes to the database, which pulls data from cache, which sends it back to the ORM, which glues together objects from it. The number of steps that simply don't exist in an application that makes effective use of distributed caching is mind-boggling, so even if the database caches everything and the cache has absolutely zero latency, you will often still lose (compared to a distributed cache) if you have to go to the database.
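In code, the read-path difference is roughly this (a toy sketch; a real near cache also has to invalidate or refresh local entries when the distributed tier changes):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Near-cache sketch: reads are served from a local map almost all of
// the time; only a local miss pays the network cost of going to the
// distributed tier. Invalidation is omitted for brevity.
public class NearCache<K, V> {
    private final Map<K, V> local = new ConcurrentHashMap<>();
    private final Function<K, V> remote; // stand-in for the distributed cache lookup

    public NearCache(Function<K, V> remote) {
        this.remote = remote;
    }

    public V get(K key) {
        // computeIfAbsent goes over the network only on a local miss
        return local.computeIfAbsent(key, remote);
    }
}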

> Currently, what are the rules of thumb in terms of when
> to consider caching data external to the database (at
> what data size, etc.)?

It's not about size, it's about whether the application can cache the data at all. If the application isn't able to cache it for security, transactional, or whatever reason, then the rule of thumb is to not cache ;-)

If the application can cache, then it should, because (just like layered caches in hardware) the closer you can cache the data to the processing (where the data is being used), the better the performance and scalability.

Regarding size, it really depends on the use case. Generic caching often begins to show a benefit with small cache sizes, and increases its effectiveness as the cache size grows up to the size of the entire data set (typically with diminishing returns, of course). It's a cost/benefit analysis at that point.

On the other hand, some use cases require the entire data set to be in memory in order to work -- things like analytics or querying from the cache. In these cases, the question is whether a cache-based architecture makes sense at all (as described above).

Note that since I work in a relatively small, high-end niche (data grids) with customers that all need to cache basically everything in RAM, my sample set is badly skewed.

I hope that helps to answer your questions, although I realize that it's not a cut-and-dried answer. IMHO, anyone who tries to pitch a one-size-fits-all solution for this problem domain is either an idiot or a liar, and there are at least a few candidates ;-)

Peace,

Cameron Purdy | Oracle Coherence
http://coherence.oracle.com/