The Artima Developer Community

Artima Developer Spotlight Forum
Exploring CouchDB

Frank Sommers

Posts: 2642
Nickname: fsommers
Registered: Jan, 2002

Exploring CouchDB Posted: Mar 31, 2009 8:11 PM

The need for document-centric databases has produced a host of new database management products, most of them centered on XML. Some are extensions to existing DBMSs, such as the full-fledged XQuery support in IBM's DB2 and Microsoft's SQL Server, or Oracle's Berkeley DB XML.

In addition to XML, however, newer document formats are emerging, such as JSON, the (most recent) lingua franca of Web-based document exchange. One of JSON's advantages is that a JSON document can easily be evaluated to a JavaScript object, a powerful feature for Ajax applications.
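The same convenience carries over to other languages. As a minimal sketch (in Python rather than JavaScript, and with a made-up document), a JSON text can be parsed straight into a native data structure and serialized back:

```python
import json

# Hypothetical document text, as it might arrive over HTTP.
raw = '{"title": "Exploring CouchDB", "tags": ["json", "rest"], "replies": 0}'

doc = json.loads(raw)      # JSON object -> Python dict
doc["replies"] += 1        # fields are ordinary dict entries
round_trip = json.dumps(doc, sort_keys=True)  # back to JSON text
```

No mapping layer or schema declaration sits between the wire format and the in-memory object, which is the property the paragraph above attributes to JSON in Ajax applications.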

The open-source Apache CouchDB project defines a native database for JSON objects. What's equally interesting is that CouchDB provides a REST interface to interact with its data, writes Joe Lennon in a recent IBM developerWorks article, Exploring CouchDB:

The term "Couch" is an acronym for "Cluster Of Unreliable Commodity Hardware," reflecting the goal of CouchDB being extremely scalable, offering high availability and reliability, even while running on hardware that is typically prone to failure. CouchDB was originally written in C++, but in April 2008, the project moved to the Erlang OTP platform for its emphasis on fault tolerance...

CouchDB is built on a powerful B-tree storage engine, which is responsible for keeping the data in CouchDB sorted and provides a mechanism for searching, inserting, and deleting in logarithmic amortized time. CouchDB uses this engine for all internal data, documents, and views.
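To make the REST interface mentioned above a little more concrete, here is a rough sketch of how database and document operations map onto HTTP verbs and URLs. It only builds the requests and does not send them. The host and port assume a local CouchDB listening on its default port, and the database name, document id, and helper function names are invented for illustration.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5984"  # assumption: local CouchDB, default port


def create_database(db: str) -> Request:
    """PUT /{db} asks the server to create a database."""
    return Request(f"{BASE_URL}/{db}", method="PUT")


def save_document(db: str, doc_id: str, doc: dict) -> Request:
    """PUT /{db}/{doc_id} stores a JSON document under the given id."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        f"{BASE_URL}/{db}/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def fetch_document(db: str, doc_id: str) -> Request:
    """GET /{db}/{doc_id} retrieves the stored document as JSON."""
    return Request(f"{BASE_URL}/{db}/{doc_id}", method="GET")


# Hypothetical database and document names.
req = save_document("articles", "exploring-couchdb", {"title": "Exploring CouchDB"})
```

Passing one of these objects to `urllib.request.urlopen` would send it, which of course requires a running server.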

Because CouchDB is designed to store JSON documents, CouchDB databases don't define a traditional relational schema:

A document-oriented database is ... made up of a series of self-contained documents. This means that all of the data for the document in question is stored in the document itself, not in a related table as it would be in a relational database. In fact, there are no tables, rows, columns or relationships in a document-oriented database at all. This means that they are schema-free; no strict schema needs to be defined in advance of actually using the database. If a document needs to add a new field, it can simply include that field, without adversely affecting other documents in the database. This also means that documents do not have to store empty data values for fields they do not have a value for.
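A small sketch of what schema-free means in practice, using plain Python dictionaries as stand-ins for stored documents (the ids, titles, and field names are hypothetical):

```python
# Two documents in the same "database"; neither follows a declared schema.
docs = [
    {"_id": "post-1", "title": "First post", "author": "alice"},
    {"_id": "post-2", "title": "Second post", "tags": ["ajax", "json"]},
]

# Adding a field to one document leaves the other untouched.
docs[0]["posted"] = "2009-03-31"

# Readers handle a missing field explicitly; nothing stores an empty
# placeholder for fields a document does not have.
tags_by_id = {d["_id"]: d.get("tags", []) for d in docs}
```

The trade-off is that code reading the documents has to decide what a missing field means, as the `d.get("tags", [])` default does here.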

Instead of relations and primary keys, document identity in CouchDB is defined with UUIDs:

CouchDB does not come with an auto-increment or sequence feature. Instead, it assigns a Universally Unique Identifier (UUID) to each and every document, making it almost impossible for another database to accidentally select the same unique identifier.
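For illustration only, this is how a client could generate identifiers of that kind itself with Python's standard `uuid` module. The helper name and the choice of random (version 4) UUIDs in hex form are assumptions of this sketch, not something the article specifies.

```python
import uuid


def new_document_id() -> str:
    """Return a random 128-bit identifier as 32 hexadecimal characters."""
    return uuid.uuid4().hex


# Generate a batch of ids independently, with no shared counter or sequence.
ids = {new_document_id() for _ in range(1000)}
```

Because no central counter is involved, separate databases can create documents independently and still be very unlikely to pick the same identifier.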

Perhaps the most interesting aspect of CouchDB is how it builds database views using the popular Map/Reduce programming model:

Because of the schema-free manner in which the database is structured, CouchDB is dependent on the use of views to create arbitrary relationships between documents, and to provide aggregation and reporting features. The results of these views are computed using Map/Reduce, a model for processing and generating large data sets using distributed computing.

The Map/Reduce functions in CouchDB produce key/value pairs, allowing CouchDB to insert them into the B-tree engine, sorted by their keys. This allows for ultra-efficient lookups by key and enhances the performance of operations within the B-tree. In addition, this also means that the data can be partitioned over many nodes without interfering with the ability to query each node individually.
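The idea in the two quoted paragraphs can be sketched in a few lines of Python. This is a single-process toy, not CouchDB's implementation: a sorted list with binary search stands in for the B-tree, and the documents, the `map_tags` function, and the helper names are all hypothetical.

```python
from bisect import bisect_left, insort

# Hypothetical documents; only some of them carry a "tags" field.
docs = [
    {"_id": "a", "tags": ["json", "rest"]},
    {"_id": "b", "tags": ["json"]},
    {"_id": "c", "title": "no tags here"},
]


def map_tags(doc):
    """Map step: emit one (key, value) pair per tag in a document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def build_view(documents, map_fn):
    """Run the map function over every document, keeping rows sorted by key."""
    rows = []
    for doc in documents:
        for key, value in map_fn(doc):
            insort(rows, (key, doc["_id"], value))
    return rows


def query(rows, key, reduce_fn=sum):
    """Locate the first row for a key by binary search, then reduce its values."""
    start = bisect_left(rows, (key,))
    values = []
    for row_key, _doc_id, value in rows[start:]:
        if row_key != key:
            break  # rows are sorted, so the matching run has ended
        values.append(value)
    return reduce_fn(values)


rows = build_view(docs, map_tags)
json_count = query(rows, "json")
```

Because the rows are kept in key order, finding where a key starts costs a binary search rather than a scan, which is the property the quote attributes to storing view results in the B-tree.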

Lennon's article also describes CouchDB's MVCC concurrency model, document metadata, and versioning.
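As a loose illustration of the revision-checked updates behind an MVCC-style model, the toy store below accepts a write only if the writer names the revision it last read. The class and exception names are invented, and integer revision counters are a simplification chosen for this sketch; it is not a description of CouchDB's actual revision format or storage.

```python
class ConflictError(Exception):
    """Raised when a writer's revision is stale."""


class TinyStore:
    """In-memory stand-in for a document store with revision checks."""

    def __init__(self):
        self._docs = {}  # doc id -> (revision number, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return rev, dict(body)

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        if rev != current_rev:
            raise ConflictError(f"{doc_id}: expected rev {current_rev}, got {rev}")
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


store = TinyStore()
rev1 = store.put("post-1", {"title": "Draft"})            # create
rev2 = store.put("post-1", {"title": "Final"}, rev=rev1)  # update from current rev
try:
    store.put("post-1", {"title": "Stale edit"}, rev=rev1)  # rev1 is now outdated
    conflict = False
except ConflictError:
    conflict = True
```

Readers are never blocked in this scheme; a writer working from an outdated revision is simply told to re-read and retry.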

What do you think of CouchDB's document-centric model?

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use