This article is sponsored by the Java Community Process.
Until the Java Content Repository (JCR) API, content repository vendors each provided their own access protocols and APIs. The JCR API defines a vendor-neutral way to access repositories from Java applications, much as JDBC defines a unified API for accessing relational databases.
The JCR API is unique not only in its ambitious goal, but also because it was one of the first JCP JSRs developed as an open-source project from its inception. The specifications, the reference implementation, and the Technology Compatibility Kit (TCK) are all licensed under an Apache-style license, thanks to the licensing flexibility of JCP 2.6.
"We realized from the beginning that to have a credible [Java] standard, you have to have a credible open-source implementation," says Day Software's Nuescheler, who leads the expert group. "As far as I know, we're the first JSR [to] license this way from the beginning. We found the flexibility of JCP 2.6 very helpful."
"We didn't want just a dead reference implementation, and thought of ways to leverage the open-source licensing and the open-source community spirit around the JSR," adds Nuescheler. "We looked for a vehicle for developing [the] reference implementation. Apache is one of [the] best open-source brands you can get, and is one of the most credible infrastructure providers, especially in the Java world. So we started to move the reference implementation into the [Apache] incubator project."
The resulting Apache Jackrabbit project provides the codebase for the JCR API's reference implementation and TCK. "Beyond that, [Jackrabbit] is Apache's content repository tool," says Nuescheler.
The Jackrabbit codebase contains not only the JCR API reference implementation, but also a fully functional repository and several contributed libraries for tasks such as accessing a remote repository via RMI. There is even a JDBC persistence manager that allows a relational database to be plugged in as the persistent store, and an object-relational mapping tool that allows Hibernate applications to use the repository.
In addition to storing repository objects, Jackrabbit's repository integrates with Apache Lucene so that repository objects can be found via full-text search. Objects can also be retrieved via SQL queries or via direct, path-based queries using XPath. The Jackrabbit repository can be installed in a J2EE container or run as a standalone application.
The JCR API defines a very simple hierarchical data model. That model consists of Workspaces, and each workspace can contain one or more Items. An Item is either a Node or a Property. While a Node can have child nodes, a Property cannot. Instead, a Property is a name-value pair that contains the "real" data items associated with a node. A workspace contains one root node. Workspace items are related in a hierarchical fashion, with properties being the leaves.
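The hierarchy described above can be sketched in a few lines of plain Java. This is only a toy in-memory model: ToyWorkspace, ToyItem, ToyNode, and ToyProperty are hypothetical names invented for illustration, and they are not the javax.jcr interfaces or anything taken from Jackrabbit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy in-memory model of the workspace/node/property hierarchy.
// Illustrative stand-ins only; these are NOT the javax.jcr interfaces.
class ToyWorkspace {
    // An item is either a node or a property.
    interface ToyItem {
        String name();
    }

    // A property is a leaf: a name-value pair that cannot have children.
    static final class ToyProperty implements ToyItem {
        private final String name;
        private final String value;

        ToyProperty(String name, String value) {
            this.name = name;
            this.value = value;
        }

        public String name() { return name; }
        String value() { return value; }
    }

    // A node may hold both child nodes and properties.
    static final class ToyNode implements ToyItem {
        private final String name;
        private final Map<String, ToyNode> children = new LinkedHashMap<>();
        private final Map<String, ToyProperty> properties = new LinkedHashMap<>();

        ToyNode(String name) { this.name = name; }

        public String name() { return name; }

        // Creates a child node and returns it so calls can be chained.
        ToyNode addNode(String childName) {
            ToyNode child = new ToyNode(childName);
            children.put(childName, child);
            return child;
        }

        void setProperty(String propertyName, String value) {
            properties.put(propertyName, new ToyProperty(propertyName, value));
        }

        // Both lookups return null when nothing is stored under the name.
        ToyNode getNode(String childName) { return children.get(childName); }
        ToyProperty getProperty(String propertyName) { return properties.get(propertyName); }
    }

    // A workspace contains exactly one root node.
    private final ToyNode root = new ToyNode("");

    ToyNode getRootNode() { return root; }

    public static void main(String[] args) {
        ToyWorkspace workspace = new ToyWorkspace();
        ToyNode article = workspace.getRootNode().addNode("articles").addNode("first");
        article.setProperty("title", "Hello, repository");
        System.out.println(article.getProperty("title").value());
    }
}
```

Keeping child nodes and properties in separate maps mirrors the rule that only nodes can have children, while properties always sit at the leaves.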
Each node is defined by a node type. A node's primary node type defines what names, properties, and child nodes a node may have. The set of node types in a repository is a major part of a repository's information model. Although the analogy is not technically exact, node types are intuitively similar to classes, with nodes as their instances. The portion of the information model that consists of node types is then analogous to the object model of an application (or even the schema of a database).
An interesting node type is the mixin type. While a node can have only one primary type, it can have any number of mixin types. Mixin types define additional properties for a node, and are somewhat similar to the ability of an object to implement multiple interfaces.
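To make the interface analogy concrete, here is a minimal sketch of a node that carries exactly one primary type and any number of mixin types. TypedNodeSketch is a hypothetical class written for this example, not part of the JCR API, and the type names are used only as sample strings (nt:unstructured is assumed here as an example of a primary type name).

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy illustration of one primary type plus any number of mixin types.
// Hypothetical class; it does not use or reproduce the javax.jcr API.
class TypedNodeSketch {
    // Exactly one primary type, fixed when the node is created.
    private final String primaryType;
    // Zero or more mixin types, added over the node's lifetime.
    private final Set<String> mixinTypes = new LinkedHashSet<>();

    TypedNodeSketch(String primaryType) {
        this.primaryType = primaryType;
    }

    void addMixin(String mixinName) {
        mixinTypes.add(mixinName);
    }

    // True if the name matches the primary type or any mixin type.
    boolean isNodeType(String typeName) {
        return primaryType.equals(typeName) || mixinTypes.contains(typeName);
    }

    public static void main(String[] args) {
        TypedNodeSketch node = new TypedNodeSketch("nt:unstructured");
        node.addMixin("mix:referenceable");
        node.addMixin("mix:versionable");
        System.out.println(node.isNodeType("mix:versionable"));
        System.out.println(node.isNodeType("some:otherType"));
    }
}
```

The sketch ignores type inheritance and the constraints a real node type places on properties and child nodes; it only shows the "one primary, many mixins" shape.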
The Jackrabbit repository defines a handful of built-in node types. One of the more interesting built-in mixin types is mix:referenceable. A referenceable node is one that exposes a Universally Unique Identifier (UUID) property, jcr:uuid, for the purpose of allowing the node to be referenced by that ID from other nodes. When automatically assigned to a referenceable node on creation, a UUID is unique within the workspace, across multiple workspaces, and even across multiple repositories. Such references allow the Jackrabbit repository engine to enforce relationships between repository items.
Another mixin node type is mix:versionable. All versionable nodes are also referenceable, and it is through references that the repository keeps version histories. When a new version of a versionable node is saved, the repository automatically adds that new version to the node's version history. Thus, an application that uses Jackrabbit for data access need not implement versioning code.
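The idea of UUID-based references and automatic version histories can be illustrated with a small toy repository. VersioningSketch and VersionableNode are hypothetical names, and the behavior is an assumption modeled on the description above; the mechanics of a real JCR repository are more involved than this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Toy repository: every node is referenceable by UUID and keeps a version history.
// Hypothetical classes for illustration; this is not how Jackrabbit is implemented.
class VersioningSketch {
    static final class VersionableNode {
        // Assigned automatically when the node is created.
        private final String uuid = UUID.randomUUID().toString();
        // Saved versions, oldest first.
        private final List<String> history = new ArrayList<>();
        private String content = "";

        String getUuid() { return uuid; }
        String getContent() { return content; }
        void setContent(String newContent) { content = newContent; }

        // Read-only view so callers cannot rewrite history.
        List<String> getVersionHistory() {
            return Collections.unmodifiableList(history);
        }
    }

    private final Map<String, VersionableNode> nodesByUuid = new HashMap<>();

    VersionableNode createNode() {
        VersionableNode node = new VersionableNode();
        nodesByUuid.put(node.getUuid(), node);
        return node;
    }

    // Saving records the current content as a new version,
    // so the calling application writes no versioning code of its own.
    void save(VersionableNode node) {
        node.history.add(node.content);
    }

    // Direct lookup by identifier, independent of any hierarchy.
    VersionableNode getNodeByUuid(String uuid) {
        return nodesByUuid.get(uuid);
    }

    public static void main(String[] args) {
        VersioningSketch repository = new VersioningSketch();
        VersionableNode node = repository.createNode();
        node.setContent("first draft");
        repository.save(node);
        node.setContent("second draft");
        repository.save(node);
        System.out.println(repository.getNodeByUuid(node.getUuid()).getVersionHistory());
    }
}
```

The application code in main only sets content and saves; the history accumulates as a side effect of saving, which is the division of labor the paragraph above describes.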
Once a node is saved in a workspace, that node can be referenced either via the repository hierarchy or directly via the node's UUID. Thus, the JCR API can support both hierarchical repositories, such as those based on an object graph, and data models that do not lend themselves to hierarchical structures, such as those more akin to a relational model. In addition to direct referencing, Jackrabbit supports node discovery based on node type, as well as SQL and XPath searches. Such searches return iterators over the search results that support not only sequential access but also "skipping" ahead.
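An iterator that supports skipping ahead, as mentioned above, might look roughly like the following. SkippingIteratorSketch is a hypothetical class over an in-memory list, written only to illustrate the idea; it is not the iterator type that the JCR API or Jackrabbit returns.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Toy result iterator that supports sequential access and skipping ahead.
// Hypothetical class; a real repository would not hold all results in a list.
class SkippingIteratorSketch<T> implements Iterator<T> {
    private final List<T> results;
    private int position = 0;

    SkippingIteratorSketch(List<T> results) {
        this.results = results;
    }

    @Override
    public boolean hasNext() {
        return position < results.size();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return results.get(position++);
    }

    // Moves past the next 'count' results without returning them.
    void skip(int count) {
        if (count < 0 || count > results.size() - position) {
            throw new NoSuchElementException("cannot skip " + count + " results");
        }
        position += count;
    }

    public static void main(String[] args) {
        SkippingIteratorSketch<String> hits =
                new SkippingIteratorSketch<>(Arrays.asList("a", "b", "c", "d"));
        hits.skip(2); // jump over "a" and "b"
        while (hits.hasNext()) {
            System.out.println(hits.next());
        }
    }
}
```

Skipping is useful for paging through large result sets, since a caller can jump to the first result of a later page without touching the earlier ones.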
The default Jackrabbit repository is based on the file system. However, Jackrabbit provides a JDBC persistence manager that delegates data storage to a relational database. Like any JCR-compliant repository, Jackrabbit can be accessed through protocols such as WebDAV or RMI. Examples of the different repository access modes are included in the Jackrabbit source distribution.