In a recent blog entry, Ted Dziuba argues that the HTTP protocol's inherent scalability is lost as soon as a Web application's request must interact with a database. The solution he outlines for extending scalability to the database layer is the one Google employs in its user account management system.
While the stateless HTTP protocol facilitates the horizontal scaling of Web applications, many of the benefits of a stateless application are lost when that application must access a database. In a recent blog entry, Web Apps: Why does Statelessness Stop at the Database?, Ted Dziuba characterizes the problem as follows:
As a user query moves up the stack of services, [the database] is where the statelessness stops. If a configuration has a few FastCGI nodes, each with a few processes, then each process maintains its own connection to the database. Not only is one database server handling all of the load, but it is also a single point of failure. Losing a FastCGI machine is not that big of a deal, but losing your SQL server means lights out.
He suggests that you can architect your database layer to be "stateless," too, which is what Dziuba's employer, Google, has done with its user account management:
You can load balance [your database] and set up some master/slave replication, but it makes more sense to keep the whole stack stateless. How is this done? With a stateless database of course.
At Google, we use a stateless database to run our user accounts authentication system. Namely, it’s BerkeleyDB HA. Many different products depend on Google Accounts: GMail, Google Groups, Personalized Home, Orkut, Writely, and anything else that requires a user to log in (the list goes on). The statelessness of BerkeleyDB HA lets Google Accounts achieve remarkably low latency, even under heavy load. As you could probably guess from the name, it provides high availability as well...
When we design tables, the typical mapping is that each table represents [an] object, with foreign keys to other objects. Stateless databases can be shard[ed] similarly. The natural division is to keep replicas of different tables on different backends. This way, you won’t have a copy of the entire database on each app server, but each app server can get the objects it needs easily. Furthermore, you can add more replicas for objects that get hit frequently, further reducing latency....
Each application server will have a replica [of] the data, but if you can shard your data by object type as presented in the previous section, space won’t be an issue except in extreme cases.
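The sharding scheme the excerpt describes — keeping replicas of different tables on different backends, with extra replicas for hot object types — can be sketched as a simple routing table. This is a minimal illustration, not Google's implementation; the object types, backend addresses, and `backend_for` helper are all hypothetical:

```python
import random

# Hypothetical replica map: each object type (table) lives on its own
# set of backends, so no single node holds the entire database.
# Frequently hit types (e.g. "session") get more replicas.
REPLICAS = {
    "user":    ["db-user-1:5432", "db-user-2:5432"],
    "group":   ["db-group-1:5432"],
    "session": ["db-sess-1:5432", "db-sess-2:5432", "db-sess-3:5432"],
}

def backend_for(object_type: str) -> str:
    """Pick any replica for a read; load spreads across the replica set."""
    return random.choice(REPLICAS[object_type])
```

Adding capacity for a hot object type is then just appending another address to its replica list, without touching the app servers that read other object types.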
The article is misleading. This has nothing to do with statelessness. If no state were necessary, we would not need a database to begin with.
The issue discussed here is that in a web stack with current technology, everything but the database can be trivially parallelized by adding more cheap machines. What we need is a cheap and efficient shared-nothing parallel database. Google is using a custom solution built on BerkeleyDB HA, but I am sure there are other ways to approach this. Does anybody know how well Oracle’s solution performs if I wanted to distribute a database across, say, 200 cheap PCs?
It's blog entries like this that make my life miserable. I end up in a meeting with operations or management and they just can't fathom why they can't run their booking engine in multiple datacenters...now. "But Google does it, I read this blog post, they say it's super easy", the unspoken implication being that I must suck, and that if they just had some of those Google geniuses instead of me, their life would be sweet.

There have always been techniques for scaling beyond a single database, but they inevitably incur some level of additional complexity (additional code is complexity, additional configuration is complexity, additional products are complexity) in the tiers above the database, and more often than not mean relaxing ACID and opening yourself up to potential collisions. A deep understanding of the nature of your data and the operations required on it is the key to scaling your database. If you have that, then you can make intelligent decisions about which path to take, be it partitioning, replication, caching or clustering (Oracle claims to have true clustering, but I've heard conflicting reports on the reality). Yes, there are some technologies that can make it more or less easy, but make no mistake: you are *always* making some tradeoff.
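The tradeoff the comment above describes shows up even in the simplest of those techniques. A hash-partitioning sketch (names and node list are hypothetical) makes the point: routing is a few lines of code, but it quietly bakes an assumption into the tier above the database:

```python
import hashlib

# Hypothetical shard list; in practice these would be connection strings.
NODES = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(key: str) -> str:
    """A stable hash of the key selects the shard: same key, same shard."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]
```

The catch is the modulo: add or remove a node and most keys map to a different shard, so "just add more cheap machines" now means rebalancing data, or moving to a scheme like consistent hashing — exactly the kind of complexity that lives above the database rather than in it.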
> Or even better, the ability to avoid going to a database altogether for 99% of cases .. what if what you needed were already in memory in the form you needed it in? ;-)
Sure, caching will get you a long way and is what should be done first. But I don’t know if caching works so well for information that is very personalized: passwords, personalized layouts, shopping cart contents, etc.
> Sure, caching will get you a long way and is what should be done first. But I don’t know if caching works so well for information that is very personalized: passwords, personalized layouts, shopping cart contents, etc.
I wasn't only referring to caching. Many of our customers use main-memory architectures for transactional operational data (ODS).
Personalization is actually one of the "big wins" for this type of architecture, since the usage patterns tend to be very tight (e.g. high affinity within a distributed environment, highly coupled access patterns, etc.)
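The "high affinity" point above can be illustrated with a small sketch: personalized data keyed by user, held in process memory on the node that owns that user, so all of a user's requests land on the same node and never leave memory. The class, node count, and routing function here are hypothetical, not any vendor's actual product:

```python
class InMemoryStore:
    """Per-user operational data held entirely in process memory."""

    def __init__(self):
        self._data = {}  # user_id -> {key -> value}

    def put(self, user_id, key, value):
        self._data.setdefault(user_id, {})[key] = value

    def get(self, user_id, key, default=None):
        return self._data.get(user_id, {}).get(key, default)

NODE_COUNT = 4  # hypothetical cluster size

def owning_node(user_id: int) -> int:
    """Affinity routing: every request for a user hits the same node,
    so that node's in-memory copy is authoritative for that user."""
    return user_id % NODE_COUNT
```

Because one user's cart or layout is never read by another user, this kind of data partitions cleanly by user id, which is why personalization is a good fit for a distributed main-memory architecture.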
> Yes there are some technologies that can make > it more or less easy, but make no mistake, you are > *always* making some tradeoff.
Absolutely, and a bit more realism in our industry (e.g. "the grass is always greener in the newest framework" and "people who don't work for us must be smarter") would do us good.
Complexity, to your point, is probably the greatest threat to the growth of our industry. We are getting to the point as an industry in which a large number of our aggregate cycles are spent dealing with entropy. I think that this has happened for two reasons: First, the growth of the industry has slowed and perhaps reversed (at least in the west), which means that the total number of cycles is not growing and may be shrinking. Second, we failed to target entropy and manage its growth, so the investments made through the recent dot-com boom (etc.) are requiring many more cycles on average to maintain than the systems that came before them.
There are bright points of hope as well, such as the reduction in the number of applications that a typical client has to install / maintain, thanks to advancements in "rich client" web technologies, self-updating applications, cross-platform (e.g. Java) technology, install-on-demand extensions, etc.