This article is sponsored by the Java Community Process.
Upcoming Features in JDBC 4
by Frank Sommers
September 2, 2005

Summary
JDBC 4 is the forthcoming release of the Java Database Connectivity API. Currently in Early Draft Review in the JCP (JSR 221), JDBC 4 is a major new release with a strong focus on ease-of-use and programmer productivity. The new JDBC version also introduces support for SQL 2003 data types, including SQL's native XML type. This article surveys the key JDBC 4 features.

A large portion of data recorded on magnetic media is stored in some form of a relational database. When the Java Database Connectivity API (JDBC) debuted in 1997, it made much of that data available to any Java application. JDBC presented data stores in an object-oriented API that was easy to learn and use, and it became an immediate success.

Version 4 of the JDCB API is currently in Early Draft Review in the JCP (JSR 221)[1]. While prior JDBC versions focused on incremental improvements providing functionality to J2EE application servers, JDBC 4's key priority is to make it easier for developers to work directly with the JDBC API. Thus, the latest JDBC spec is a return to the API's roots, which was to provide a set of simple abstractions for developers who needed to work with relational databases. Much of JDBC 4's ease-of-development features are a direct result of new capabilities available in J2SE 1.5.

In addition to ease-of-use, JDBC 4 introduces several SQL 2003 features, including a Java-specific mapping for the new XML SQL type. The new standard also defines support for national character sets, and aims to enable JDBC implementations to perform better by exposing to tools more information about the state of database connections and statements. Once approved as final standard, JDBC 4 will become a standard part of J2SE 1.6, and will supersede JDBC 3. This article reviews those JDBC 4 features most likely to benefit developers.

Better Pools

While JDBC did more than any other API to turn Java into the leading enterprise platform, J2EE emerged in part to make working with enterprise data more palatable to developers than what was possible with JDBC alone. Indeed, JDBC's designers never intended their API to serve as the sole bridge to enterprise databases from Java programs, but rather as a call-level interface (CLI), on par with the X/Open standard, that other, higher-level APIs would build on.

There are two key areas of JDBC complexity that J2EE vendors have invested into solving for their customers: performance and ease of use. Because JDBC data access occurs through the Connection interface, optimizing the performance of that Connection and its associated artifacts is key to higher JDBC performance. A common optimization technique reuses database connections and objects representing SQL query or update statements. For such connection and statement pools to work, JDBC had to evolve and standardize how a runtime infrastructure interacts with connections and statements.

While JDBC provided abstractions for both connection and statement pooling, those abstractions operated on a fairly coarse level. For instance, JDBC 3 connection pools merely substituted a pool of connections for a single connection. Individual connections inside the pool were hard to manage, especially if connections inside the pool became stale over time. That gave rise to a situation where the entire database appeared unresponsive but, in reality, the database server had no trouble, only the connection pool may have run out of resources due to stale connections. The only solution was to reinitialize the pool, sometimes by restarting the database server.

Because a stale connection does not necessarily mean a closed connection (which might be garbage collected), connections that became unusable often took up those valuable connection resources. A new method in JDBC 4's Connection class, isValid(), allows a client to query the database driver if a connection is still valid. This allows a more intelligent management of connection pools by clients.

While one database connection is indistinguishable from another, the same does not hold for statements: Some SQL statement are more frequently used than others in any application domain. Prior to JDBC 4, there was no way to tell such statements apart in a statement pooling environment. While every JDBC 4 PreparedStatement is poolable by default, the new poolable property allows an application to offer a hint to the statement pooler as to whether a statement should be pooled. This allows an application developer, or a development tool, to better manage statement-pooling resources, offering pooling preference to frequently accessed statements.

Because a database is a shared resource, many Java applications require concurrent access to the same data store. As each of those applications connect to the database and execute statements on a connection, connections from some applications might take up more resources than others, sometimes unfairly bogging down database performance. JDBC 4 allows an application to associate metadata with a database connection via the setClientInfo() method, which consumes a name/string value pair, or a Properties object. Diagnostic tools can subsequently retrieve the name and characteristics of applications causing database problems.

Connection to a database requires that a suitable JDBC database driver be loaded in the client's VM. In the early days of JDBC, it was common to load a suitable driver via Class.forName(), passing in the name of the class implementing the JDBC Driver interface. The DriverManager class later offered a more flexible means of managing JDBC drivers in a client application. For a driver to become available, DriverManager's registerDriver() had to be invoked with the driver's class name. Alternatively, you could specify the drivers to load via the jdbc.drivers system property. When DriverManager initializes, it attempts to load the drivers associated with that property.

JDBC 4 adds the J2SE Service Provider mechanism as another means to specify database drivers. For this to work, driver JAR files must include the file META-INF/services/java.sql.driver. That file must contain a single line with the name of the JDBC driver's implementation of the Driver interface. Invoking getConnection() on DriverManager will load a driver so packaged, if needed. Upon loading the driver, an instance of the driver is created, and then registerDriver() is invoked to make that driver available to clients.

SQL 2003 support

After a less-than-enthusiastic reception of object-relational database features in the late 90s, the relational database research community has focused on accommodating new data types used in recent applications, and on making access to vast data stores faster and more convenient. The result of that effort is SQL 2003, the latest SQL standard [2]. SQL 2003 is not simply an incremental upgrade to the database standard, but also introduces many new features in response to database users' needs.

JDBC 4 provides support for those SQL 2003 features most likely to find their way into commercial products in the near future. In addition to the XML data types discussed in the next section, JDBC introduces support for ROWID, a data type that had been in use in database products even before it became part of the SQL standard. The best way to understand ROWID is as a reference, or address, unique for each table row. For example, you can use a ROWID as a parameter in a PreparedStatement:

Connection c = myDataSource.getConnection();
PreparedStatement st = c.prepareStatement("insert into siteusers (userid, username) values 
(?, ?);
st.setRowId(1, rowId1);

The above code would set the user's userid to the value referred to by the rowId1 ROWID, which you could have obtained from a previous select statement. The ResultSet and PreparedStatement classes were updated in JDBC 4 to handle ROWIDs.

In addition to ROWID, JDBC 4 provides mappings for the following SQL 2003 data types:

SQL 2004 type JDBC 4 mapping
BLOB java.sql.Blob
CLOB java.sql.Clob
NCLOB (Character Large Object using the National Character Set) java.sql.NClob
ARRAY java.sql.Array
Structured types java.sql.Struct
REF (reference to SQL structured type in the database) java.sql.Ref
DISTINCT Type to which base type is mapped, e. g., Based on SQL's NUMERIC type maps to java.math.BigDecimal.
DATALINK java.net.URL

As databases become increasingly capable of storing large amounts of binary and textual data, JDBC 4 offers extended support for the CLOB (character large object) and BLOB (binary large object) SQL data types. In order to make it easier to work with large binary or text objects that could be inserted into a database, the Connection, Statement and ResultSet interfaces in JDBC 4 offer methods for the creation, insertion, and retrieval of CLOBs and BLOBs, respectively.

When creating a new BLOB or CLOB object through Connection's appropriate create() method, the resulting object does not contain the actual binary or character data. Rather, you would add the "real" data to the CLOB or BLOB objects:

Connection c = myDataSource.getConnection();
Blob myBlob = c.createBlob();
OutputStream outStream = myBlob.setBinaryStream(0);

Having obtained the BLOB's OutputStream, you can write your actual data, such as an image or a binary document, to that stream. The parameter to getBinaryStream() denotes the byte offset where the new OutputStream starts writing its data. Once you've created and populated a BLOB or CLOB, you can use those objects in PreparedStatements as well:

PreparedStatement st = 
   c.prepareStatement("update sitesusers set user_photo=? where userid=?");
st.setBlob(1, myBlob);

XML as a First-Class SQL Data Type

Perhaps the most significant enhancement in SQL 2003 is support for XML as a native database data type [3]. Many database products have offered native XML support prior to the emergence of SQL 2003, and in many ways the standard is a codification of common practice. But until JDBC 4, Java clients could access XML data elements only as BLOBs, CLOBs, or even TEXT types.

The best way to understand SQLXML, the new JDBC 4 mapping of the SQL XML type, is with an example. Suppose you store a user's blog posts in an XML document format, and you'd like to iterate over those posts and display them in a Web page. The following database table would hold this information:

create table user_has_blog(userid int, blog_entry xml);

Using JDBC 4's SQLXML data type, you could retrieve a user's blog entries as follows:

Connection c = myDataSource.getConnection();
PreparedStatement st = c.prepareStatement("select userid, blog_entry from user_has_blog");
ResultSet rs = st.executeQuery();
while (rs.next()) {
        SQLXML blog = st.getSQLXML("blog_entry");
        javax.xml.stream.XMLStreamReader reader = blog.createXMLStreamReader();
         //read data here
         blog.free();
}

SQLXML's createXMLStreamReader() returns a StAX stream reader, which is a low-level interface allowing access to an XML stream [4]. The Streaming API for XML (StAX) is currently being developed as JSR 173 [5]. See the reference at the end of this article about using StAX to read and write XML data. Whenever you've exhausted the XML stream, you need to invoke close() on SQLXML to free resources associated with the XML data stream.

While not strictly related to JDBC 4, databases that support SQL 2003's XML extensions also allow you to employ the new XMLELEMENT operator. XMLELEMENT is used to create an XML element. The first required argument to XMLELEMENT is the name of the element to be constructed, specified as an SQL NAME. An optional second argument can specify attributes for the newly created element. Arguments following that make up the content of the XML element. To illustrates the XMLELEMENT operator, consider the following table:

create table user (int userid, firstname varchar(128), lastname varchar(128))

The following SQL selects all users with their first and last names into XML User elements:

select e.userid, 
    XMLELEMENT(NAME 
      "user", e.firstname || e.lastname) as "result" from employee e;

The result of that query will be a two-column relation:

userid result
1 <user>George Washington</user>
2 <user>Thomas Jefferson</user>

Creating and inserting into the database a new SQLXML value is similar to working with BLOBs. You invoke Connection's createSQLXML() method, and then populate the resulting SQLXML's input stream with the XML content:

Connection c = myDataSource.getConnection();
PreparedStatement st = c.prepareStatement("insert into user_has_blog (userid, blog_entry) values (?, ?)");
SQLXML blogEntry = c.createSQLXML();
Writer writer = blogEntry.createXMLSteamWriter();
//write XML content to writer
st.setInt(1, 1); //User id
st.setSQLXML(2, blogEntry); //User's blog
//Commit transaction

Exceptional exceptions

Another area JDBC 4 addresses is error handling, one of the more complex chores when working with databases. Because databases are often remotely accessible resources, problems such as network failures, or the inability of the database management system itself to process a request, can cause exceptions when executing a database operation. In addition, incorrect SQL statements can cause exceptions.

Prior to JDBC 4, most JDBC operations generated a simple SQLException. While that exception contained some information about what went wrong, few developers bothered to use that information. Instead, when encountering an SQLException, most JDBC code tends to abort the transaction, and indicate the error to the caller. JDBC 4 introduces a finer-grained exception hierarchy via both chained exceptions and by dividing exceptions into transient and non-transient categories.

A transient SQL exception extends SQLTransientException, and indicates that a previously failed operation might be able to succeed if the operation is simply retried. For instance, failure to connect to a database is a transient exception, since retrying the connection may cause it to work a second time around. A non-transient exception extends SQLNonTransientException, and signals that retrying the same operation would fail again unless the root cause of the problem is remedied by the user. For instance, an erroneous SQL expression would be a non-transient exception. Both SQLTransientException and SQLNonTransientException are subclasses of SQLException.

JDBC 4 maps the various transient and non-transient exception types to an SQLState values. SQLState is one of the values in an SQLException and corresponds to error codes defined in the SQL standard. When the SQL standard does not define an error state for an exception condition, the default in JDBC 4 is to return an SQLException.

SQLTransientExceptions
SQLTransientConnectionException Indicate some change in the connection's communication connection layer. SQLState 08
SQLTimeoutException When timeout specified by a Statement is expired. No corresponding SQLState.
SQLTransactionRollbackException Indicate that the current Statement was automatically rolled back by the database. This caused by a transaction deadlock or serialization problems. SQLState 40
SQLNonTransientExceptions
SQLDataException Indicates data errors, such as invalid arguments to functions, etc. SQLState 22
SQLIntegrityConstraintViolationException A foreign key, primary key, or unique key constraint was violated. SQLState 23
SQLInvalidAuthorizationSpecException Authorization credentials presented during authorization were not valid. SQLState 28
SQLNonTransientConnectionException A non-transient version of the connection exception, indicating some change in the connection's communications layer. SQLState 08
SQLSyntaxErrorException Indicates that an in-progress query has violated syntax rules. SQLState 42

JDBC 4's exceptions are also chained exceptions, and accessing the root cause of an exception can quickly help determine the source of a problem. To support iteration over an exception chain, SQLException defines a new getNextException() method. Invoking this method returns either the next Exception in the exception chain, or null, when the root of the hierarchy is reached. SQLException also supports the getCause() method, which may return a non-SQLException subtype as well (since the cause of an SQLException may be a non SQL-related problem). The following code example illustrates iterating through an SQLException chain:

...
 } 
catch (SQLException e) {
 
while (ex != null) {
        Throwable t = ex.getCause();
        while (t != null) {
                t = t.getCause();
        }
        ex = ex.getNextException();
}

Since JDBC 4 supports the JDK 1.5 for-each construct, the above code can be written as follows:

} 
catch(SQLException e) {
        for (Throwable ex : e) {
                System.out.println("Exception encountered: ...")
        }
}

Type-safe querying and results

Every developer working with JDBC realizes at some point that the code to translate between the relational world of an SQL database and the object-oriented paradigm of a Java application can quickly start to dominate an enterprise Java project. Prior to version 4, the JDBC API offered little help in bridging those worlds. Instead, the EJB entity persistence model and, more recently, object-relational mapping standards and products, such as JDO and Hibernate, provided developers alternative ways to access relational data stores. While these APIs use JDBC under the covers, they hide JDBC from the developer. Instead, they present database records in terms of an application's domain model.

However, introducing O/R mapping into an application just to make it easier to work with relational databases incurs additional complexity that could be avoided by an improved JDBC API. JDBC 4 brings just such an improvement to the development experience with the introduction of annotations and generic types, both new features in JDK 1.5.

To illustrate JDBC's use of annotations, consider a simple User entity with these properties:

class User {
        int userID;
        String name;
        String department;
}

This class may be persisted in a database table created with the following DDL (data definition language):

create table user (int userid, name varchar(128), department varchar(128));

A common "feature" of many JDBC applications is a static class containing the application's SQL statements. For instance, an application accessing a database of users may have a "template" class containing the following query:

class MyQueries {
        public static final String SELECT_ALL_USERS = "select * from user";
}

While factoring all JDBC query statements to a separate class helps reduce clutter, this solution is not very object-oriented. At the same time, interacting with SQL data stores necessarily involves managing strings that represent SQL statements.

Annotations help bridge the gap between objects and strings. JDBC 4 defines a Query interface and associated annotations that lend the above development style a more object-oriented flavor. Query defines methods that are decorated with JDBC annotations corresponding to SQL queries and update statements.

When you invoke a method in a Query, the SQL statement associated with that Query method will be executed, and the results returned and bound to a data type you specify. Consider an annotations-based version of the above query template:

interface MyQueries extends BaseQuery {

        @Query(sql="select * from user")
        DataSet<User> getAllUsers();
}

This Query implementation is a subinterface of BaseQuery. The getAllUsers() method is bound by the @Query annotation to the SQL query statement. When the method is invoked, the runtime system executes the associated query string with the method, and returns the results as a DataSet.

A DataSet is a subinterface of java.util.List, and provides a type-safe view of the data returned from an SQL query. DataSet is a parameterized type: The parameter type is a class that describes the data returned from the query. DataSet acts a bit like a type-safe ResultSet, but it supports both connected and disconnected access to data.

Once a DataSet is returned from a query, you can iterate through its elements. The bit of magic in turning the Query interface into an actual JDBC query, executing that query, and then binding the resulting data to the object type specified as a parameter to DataSet, is accomplished by a set of runtime classes.

In this example, you could obtain an instance of MyQuery from the Connection class:

Connection c = myDataSource.getConnection();
MyQueries myQueries = c.createQueryObject(MyQuery.class);
DataSet<User> users = myQueries.getAllUsers();
for (User u: users) {
        System.out.println("User's name is: " + user.name;
}

You could also create a parameterized annotation element with Query:

interface MyQueries extends BaseQuery {

        @Query(sql="select * from user where department= {department}")
        DataSet<User> getDepartmentUsers(String department);
}

Parameters in the annotation must be specified with a string value inside {...} braces. The parameter's string value must match, in a case-sensitive way, the method parameter name corresponding to that parameter (department, in this case).

In addition, query annotations can contain not only queries, but also updates or deletes. For instance, to remove a user corresponding to a user name, you might use the following annotation:

interface MyQueries extends BaseQuery {

        @Update(sql="delete from user where name= {userName}")
        int deleteUser(String userName);
}

To update a user's department to a new value, you would use this annotation:

interface MyQueries extends BaseQuery {

        @Update(sql="update user set department={deparment} where name= {userName}")
        int updateDeparment(String userName, String department);
}

DataSet also allows you to modify or delete records in a database. For instance, having obtained a DataSet, you could insert a new user into the database via that DataSet:


Connection c = myDataSource.getConnection();
MyQueries q = c.createQueryObject(MyQueries.class);
DataSet<User> users = q.create();
User user = new User();
user.setUserID(1);
user.setName("Joe");
user.setDeparment("Accounting");
users.insert(user);

While such data binding at first appears similar to object-relational mapping tools, all that takes place here is that database columns are bound to an object representing the query results in a type-safe way. The focus remains on SQL, and JDBC 4 does not attempt to map between complex relational and object hierarchies.

Summary

JDBC 4 is currently in early draft review. If you care about the future of Java database access, this is your chance to download the spec and, if you find something in error, to comment during the public comment period. Such comments might actually be heard and appreciated, especially by the scores of developers relying on this upcoming version of the Java data access

Resources

[1] JDBC 4 API Specification, JSR 221
http://www.jcp.org/en/jsr/detail?id=221

[2] Eisenberg, A., K. Kulkarni, J. Melton, J. Michels, and F. Zemke. SQL:2003 Has Been Published In ACM SIGMOD Record, Vol. 33, No. 1, March 2004.
http://www.sigmod.org/sigmod/record/issues/0403/E.JimAndrew-standard.pdf

[3] Eisenberg, A., and J. Melton. SQL/XML is Making Good Progress In ACM SIGMOD Record, Vol. 32, No. 2, June 2002.
http://www.sigmod.org/sigmod/record/issues/0206/standard.pdf

[4] StAX XMLStreamReader JavaDoc
https://stax-utils.dev.java.net/nonav/javadoc/api/javax/xml/stream/XMLStreamReader.html

[5] Streaming API for XML (StAX), JSR 173
http://www.jcp.org/en/jsr/detail?id=173
http://dev2dev.bea.com/xml/stax.html

[See also] JDBC 4 Presentation (JavaOne 2005)
https://jdk.dev.java.net/nonav/J12005/jdbc.pdf

Talk back!

Have an opinion? Readers have already posted 8 comments about this article. Why not add yours?

About the author

Frank Sommers is an editor with Artima Developer. He is also founder and president of Autospaces, Inc., a company providing collaboration and workflow tools in the financial services industry.