In a recent blog entry, Ted Dziuba argues that the HTTP protocol's inherent scalability is lost as soon as a Web application's request must interact with a database. The solution he outlines for extending scalability to the database layer is the one Google employs in its user account management system.
While the stateless HTTP protocol facilitates the horizontal scaling of Web applications, many of the benefits of a stateless application are lost when that application must access a database. In a recent blog entry, Web Apps: Why does Statelessness Stop at the Database?, Ted Dziuba characterizes the problem as follows:
As a user query moves up the stack of services, [the database] is where the statelessness stops. If a configuration has a few FastCGI nodes, each with a few processes, then each process maintains its own connection to the database. Not only is one database server handling all of the load, but it is also a single point of failure. Losing a FastCGI machine is not that big of a deal, but losing your SQL server means lights out.
He suggests that you can architect your database layer to be "stateless," too, which is what Dziuba's employer, Google, has done with its user account management:
You can load balance [your database] and set up some master/slave replication, but it makes more sense to keep the whole stack stateless. How is this done? With a stateless database of course.
At Google, we use a stateless database to run our user accounts authentication system. Namely, it’s BerkeleyDB HA. Many different products depend on Google Accounts: GMail, Google Groups, Personalized Home, Orkut, Writely, and anything else that requires a user to log in (the list goes on). The statelessness of BerkeleyDB HA lets Google Accounts achieve remarkably low latency, even under heavy load. As you could probably guess from the name, it provides high availability as well...
When we design tables, the typical mapping is that each table represents [an] object, with foreign keys to other objects. Stateless databases can be shard[ed] similarly. The natural division is to keep replicas of different tables on different backends. This way, you won’t have a copy of the entire database on each app server, but each app server can get the objects it needs easily. Furthermore, you can add more replicas for objects that get hit frequently, further reducing latency....
Each application server will have a replica [of] the data, but if you can shard your data by object type as presented in the previous section, space won’t be an issue except in extreme cases.
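The sharding scheme the excerpt describes — keeping replicas of different tables on different backends, with extra replicas for hot object types — can be sketched as a simple routing table. This is a minimal illustration, not Google's implementation; the object types, backend addresses, and `backend_for` helper are all hypothetical:

```python
import random

# Hypothetical replica map: each object type (table) lives on its own
# set of backends, so no single node holds the entire database.
# Frequently hit types (e.g. "session") get more replicas.
REPLICAS = {
    "user":    ["db-user-1:5432", "db-user-2:5432"],
    "group":   ["db-group-1:5432"],
    "session": ["db-sess-1:5432", "db-sess-2:5432", "db-sess-3:5432"],
}

def backend_for(object_type: str) -> str:
    """Pick any replica for a read; load spreads across the replica set."""
    return random.choice(REPLICAS[object_type])
```

Adding capacity for a hot object type is then just appending another address to its replica list, without touching the app servers that read other object types.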
The article is misleading. This has nothing to do with statelessness. If no state were necessary, we would not need a database to begin with.
The issue discussed here is that in a web stack with current technology, everything but the database can be trivially parallelized by adding more cheap machines. What we need is a cheap and efficient shared-nothing parallel database. Google is using a custom solution built on BerkeleyDB HA, but I am sure there are other ways to approach this. Does anybody know how well Oracle’s solution performs if I wanted to distribute a database across, say, 200 cheap PCs?
It's blog entries like this that make my life miserable. I end up in a meeting with operations or management and they just can't fathom why they can't run their booking engine in multiple datacenters...now. "But Google does it, I read this blog post, they say it's super easy", the unspoken implication being that I must suck, and that if they just had some of those Google geniuses instead of me, their life would be sweet.

There have always been techniques for scaling beyond a single database, but they inevitably incur some level of additional complexity (additional code is complexity, additional configuration is complexity, additional products are complexity) in the tiers above the database, and more often than not mean relaxing ACID and opening yourself up to potential collisions. A deep understanding of the nature of your data and the operations required on it is the key to scaling your database. If you have that, then you can make intelligent decisions about which path to take, be it partitioning, replication, caching or clustering (Oracle claims to have true clustering, but I've heard conflicting reports on the reality). Yes, there are some technologies that can make it more or less easy, but make no mistake: you are *always* making some tradeoff.
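The tradeoff the comment above describes shows up even in the simplest of those techniques. A hash-partitioning sketch (names and node list are hypothetical) makes the point: routing is a few lines of code, but it quietly bakes an assumption into the tier above the database:

```python
import hashlib

# Hypothetical shard list; in practice these would be connection strings.
NODES = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(key: str) -> str:
    """A stable hash of the key selects the shard: same key, same shard."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]
```

The catch is the modulo: add or remove a node and most keys map to a different shard, so "just add more cheap machines" now means rebalancing data, or moving to a scheme like consistent hashing — exactly the kind of complexity that lives above the database rather than in it.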
> Or even better, the ability to avoid going to a database altogether for 99% of cases .. what if what you needed were already in memory in the form you needed it in? ;-)
Sure, caching will get you a long way and is what should be done first. But I don’t know if caching works so well for information that is very personalized: passwords, personalized layouts, shopping cart contents, etc.
> Sure, caching will get you a long way and is what should be done first. But I don’t know if caching works so well for information that is very personalized: passwords, personalized layouts, shopping cart contents, etc.
I wasn't only referring to caching. Many of our customers use main-memory architectures for transactional operational data (ODS).
Personalization is actually one of the "big wins" for this type of architecture, since the usage patterns tend to be very tight (e.g. high affinity within a distributed environment, highly coupled access patterns, etc.)
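The "high affinity" point above can be illustrated with a small sketch: personalized data keyed by user, held in process memory on the node that owns that user, so all of a user's requests land on the same node and never leave memory. The class, node count, and routing function here are hypothetical, not any vendor's actual product:

```python
class InMemoryStore:
    """Per-user operational data held entirely in process memory."""

    def __init__(self):
        self._data = {}  # user_id -> {key -> value}

    def put(self, user_id, key, value):
        self._data.setdefault(user_id, {})[key] = value

    def get(self, user_id, key, default=None):
        return self._data.get(user_id, {}).get(key, default)

NODE_COUNT = 4  # hypothetical cluster size

def owning_node(user_id: int) -> int:
    """Affinity routing: every request for a user hits the same node,
    so that node's in-memory copy is authoritative for that user."""
    return user_id % NODE_COUNT
```

Because one user's cart or layout is never read by another user, this kind of data partitions cleanly by user id, which is why personalization is a good fit for a distributed main-memory architecture.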
> Yes there are some technologies that can make > it more or less easy, but make no mistake, you are > *always* making some tradeoff.
Absolutely, and a bit more realism in our industry (e.g. "the grass is always greener in the newest framework" and "people who don't work for us must be smarter") would do us good.
Complexity, to your point, is probably the greatest threat to the growth of our industry. We are getting to the point as an industry in which a large number of our aggregate cycles are spent dealing with entropy. I think that this has happened for two reasons: First, the growth of the industry has slowed and perhaps reversed (at least in the west), which means that the total number of cycles is not growing and may be shrinking. Second, we failed to target entropy and manage its growth, so the investments made through the recent dot-com boom (etc.) are requiring many more cycles on average to maintain than the systems that came before them.
There are bright points of hope as well, such as the reduction in the number of applications that a typical client has to install / maintain, thanks to advancements in "rich client" web technologies, self-updating applications, cross-platform (e.g. Java) technology, install-on-demand extensions, etc.