CouchDB is a new, open-source storage engine for document-oriented applications. It offers a schema-free approach to storing objects that can be described as collections of name/value pairs, along with a REST/JSON interface, replication, indexing, and querying. Jacob Kaplan-Moss reviews CouchDB in a recent blog post.
The project's documentation describes CouchDB as:
Unlike SQL databases which are designed to store and report on highly structured, interrelated data, CouchDB is designed to store and report on large amounts of semi-structured, document oriented data. CouchDB greatly simplifies the development of document oriented applications, which make up the bulk of collaborative web applications...
In an SQL database, as needs evolve the schema and storage of the existing data must be updated. This often causes problems as new needs arise that simply weren't anticipated in the initial database designs, and makes distributed "upgrades" a problem for every host that needs to go through a schema update...
With CouchDB, no schema is enforced, so new document types with new meaning can be safely added alongside the old. The view engine is designed to easily handle new document types and disparate but similar documents.
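To make the schema-free idea concrete, here is a minimal in-memory sketch in Python. It is not CouchDB's view engine; the documents, field names, and the functions `titles_by_id` and `build_view` are all hypothetical. It only illustrates how a map-style function can index documents of differing shapes, including a type added later, without any schema change.

```python
# Illustrative in-memory sketch only; this is not CouchDB's view engine.
# All document fields and function names here are hypothetical.

documents = [
    {"_id": "post-1", "type": "post", "title": "Hello", "author": "ann"},
    {"_id": "comment-1", "type": "comment", "post": "post-1", "author": "bob"},
    # A newer document type added later, with fields the older ones lack.
    {"_id": "event-1", "type": "event", "title": "Meetup", "venue": "Hall A"},
]


def titles_by_id(doc):
    """Map-style function: yield (key, value) pairs for any document with a title."""
    if "title" in doc:
        yield doc["_id"], doc["title"]


def build_view(docs, map_fn):
    """Apply map_fn to every document and return the emitted rows sorted by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows)


view_rows = build_view(documents, titles_by_id)
print(view_rows)
```

The comment document has no title, so it is simply skipped rather than rejected, while the newer "event" type shows up in the same index as the older "post" type.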
What makes CouchDB especially suitable for Web applications is its built-in replication mechanism:
CouchDB is built from the start with a consistent vision of a distributed document database system. Unlike cumbersome attempts to bolt distributed features on top of the same legacy models and databases, it is the result of careful ground-up design, engineering and integration. The document, view, security and replication models, the special purpose query language, the efficient and robust disk layout are all carefully integrated for a reliable and efficient system.
In a recent blog post, Jacob Kaplan-Moss, a lead developer of Django, tests CouchDB, noting that:
Anything I can poke at with curl is pretty damn cool... Wow, I can just chuck arbitrary JSON objects up at this thing and it’ll store it. No setup, no schemas, no nothing. This is relaxing…
I hadn’t expected CouchDB to look this polished so soon in the game. The web interface is truly awesome, and naturally implemented directly against the regular API.
Nice, there’s already a couchdb-python (built with httplib2, natch). The latest release installed with easy_install doesn’t seem to work, but SVN trunk does... I wonder what it would take to make CouchDB into a backend for Django models? Seems there’s a much lower impedance mismatch between a document database and an object one — a model instance maps much better to a document than to a tuple.
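The point about a model instance mapping naturally to a document can be sketched with the Python standard library alone. The example below assumes a document is stored with an HTTP PUT to a database/document URL; the server address, the database name `blog`, the `Article` model, and the helper `build_store_request` are hypothetical, and this is neither couchdb-python nor a Django backend. The request is only built, not sent, so the sketch runs without a server.

```python
import json
import urllib.request
from dataclasses import asdict, dataclass, field


@dataclass
class Article:
    """Hypothetical model; the field names are made up for illustration."""
    slug: str
    title: str
    tags: list = field(default_factory=list)


def build_store_request(base_url, database, article):
    """Build (but do not send) a PUT request carrying the article as JSON."""
    body = json.dumps(asdict(article)).encode("utf-8")
    url = f"{base_url}/{database}/{article.slug}"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


article = Article(slug="hello-couchdb", title="Hello, CouchDB", tags=["couchdb", "json"])
# Assumed local server address and database name.
request = build_store_request("http://localhost:5984", "blog", article)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it to a running server. The model's fields become the document's name/value pairs directly, with no table definition in between.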
What do you think of CouchDB and its schema-free approach to document storage?
> What do you think of CouchDB and its schema-free approach to document storage?
It's an off-the-shelf weakly structured data store for storing weakly structured data. There's nothing pretentious about that. However, "document database server" is a bit too generic. Some documents are highly structured, such as a Sharps Safe Disposables form a hospital might use to assure sharps are safely disposed of. You wouldn't use CouchDB to store that sort of "document".
Schema-free. Ugh - the data modeller in me gets the heebie-jeebies. Give me structure, give me constraints! But then the world is a messy place, full of fuzzy data, so supporting ad-hoc and arbitrary data, and the applications you could build from it, also gets me quite excited.
At first I thought perhaps folks were getting overexcited about CouchDB because they were drinking too much JSON Kool-Aid. But the more I played with CouchDB, the more excited I got, until I had no choice but to write my own CouchDB blog posting ...
@Kevin Teague
> The fact that you can build rich internet applications using CouchDB as a data store means that you don't even need an application server between you and your database - very interesting stuff indeed.
Interesting - but not a new concept. There's been a SourceForge project in existence for about 3 years (although only up on SourceForge for a little over a year now) that provides the same functionality for strongly structured data stores - that is why I made my comments above. Calling CouchDB a "document database server" is a misnomer because some documents have business rules implicit to them - but as a data modeler you know that. CouchDB has no way to defend itself from data integrity violations. CouchDB and Andromeda address different needs, though.