Software is becoming an increasingly important consumer of Web content. When software and humans are both intended targets of a Web-accessible resource, should you provide separate URLs for that resource based on a client's preferred content type, or a single URL and rely on server-based content negotiation to serve up the appropriate content?
RSS feeds, search engines, REST resources, and Web services are all examples of software, or machines, becoming increasingly important consumers of Web-accessible content. A recent set of IBM developerWorks articles by Lee Feigenbaum and Elias Torres, A meaningful Web for humans and machines, discusses the notion of a parallel Web: Web content aimed at both machine and human consumption (Part I: How humans can share the wealth of the Web and Part II: Explore the parallel Web).
Thinking about the notion of a parallel Web is especially timely with the release of Rails 1.2. A key new feature in Rails 1.2 is support for RESTful Web resources and, along with that, the ability for the Rails framework to return different content for a URL, depending on a client's preferences.
For instance, when a URL is accessed via a Web browser, the client will likely prefer HTML output, so the Rails controller servicing the request will locate and render a template for HTML output. An RSS reader would likely prefer RSS or Atom-formatted output, and the Rails controller will then render a template producing that sort of output, if such a template is available. And a REST client will likely prefer XML-based output, which the same controller method can serve up as well.
The way a Rails controller decides what type of content to serve is via a hidden format parameter embedded in the request. Thus, appending ?format=xml to the URL of a plain HTML page backed by a more versatile controller method can cause an XML response to be returned.
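The idea of a format parameter selecting the representation can be sketched in plain Ruby, outside of Rails. This is only an illustration under stated assumptions: the RENDERERS table, the requested_format and show methods, and the example.com URLs are hypothetical names invented here, not the Rails API, and a real controller would render templates rather than build strings.

```ruby
require "uri"

# Hypothetical renderers standing in for Rails templates (illustrative only).
# Escaping of the title is omitted to keep the sketch short.
RENDERERS = {
  "html" => ->(article) { "<h1>#{article[:title]}</h1>" },
  "xml"  => ->(article) { "<article><title>#{article[:title]}</title></article>" }
}.freeze

# Read the format from the query string, defaulting to HTML when it is absent.
def requested_format(url)
  query = URI.parse(url).query
  params = query ? URI.decode_www_form(query).to_h : {}
  params.fetch("format", "html")
end

# One "controller method" that serves several representations of one resource.
def show(url, article)
  renderer = RENDERERS[requested_format(url)]
  return "406 Not Acceptable" unless renderer
  renderer.call(article)
end

article = { title: "Parallel Web" }
puts show("http://example.com/articles/1", article)
puts show("http://example.com/articles/1?format=xml", article)
```

The resource and the method stay the same; only the lookup in the renderer table changes with the parameter, which is what makes the single-URL approach look simple at first.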
Feigenbaum and Torres discuss two other methods that can achieve a similar goal of serving up different content formats based on a client's preferences: HTTP content negotiation and the HTML link tag:
Content negotiation is available through the HTTP protocol, [and is] the mechanism that allows user agents and proxies/gateways on the Internet to exchange hypermedia. This technique might be mapped mostly to a scenario where alternate representations are found at the same Web address. In HTML pages, the link element indicates a separate location containing an alternate representation of the page.
While content negotiation has been available since HTTP 1.0, it is still seldom used, according to the authors. In fact, Rails has provided support for content negotiation since the 1.1 release, although that mechanism is now supplanted by the use of the hidden format parameter. The link tag is used more often on the Web, according to the IBM authors, especially in providing RSS feed URLs for a Web page and in specifying CSS style sheets.
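To make the content negotiation alternative concrete, here is a minimal Ruby sketch of how a server might pick a representation from an HTTP Accept header. It is a simplification under stated assumptions: parse_accept and negotiate are hypothetical helper names, only exact media types and the */* wildcard are handled, and partial wildcards such as text/* and the specificity rules of the HTTP specification are ignored.

```ruby
# Parse an Accept header into [media_type, quality] pairs, best quality first.
def parse_accept(header)
  entries = header.split(",").each_with_index.map do |part, index|
    media_type, *params = part.strip.split(";").map(&:strip)
    q = params.map { |p| p.split("=", 2) }.find { |key, _| key == "q" }
    quality = q ? q[1].to_f : 1.0 # a missing q value means full preference
    [media_type, quality, index]
  end
  # Sort by descending quality; the index preserves header order for ties.
  entries.sort_by { |_, quality, index| [-quality, index] }
         .map { |media_type, quality, _| [media_type, quality] }
end

# Return the first available type the client accepts, or nil if none match.
def negotiate(accept_header, available)
  parse_accept(accept_header).each do |media_type, quality|
    next if quality.zero? # q=0 means "not acceptable"
    return available.first if media_type == "*/*"
    return media_type if available.include?(media_type)
  end
  nil
end

available = ["text/html", "application/xml", "application/atom+xml"]
puts negotiate("text/html,application/xml;q=0.9,*/*;q=0.8", available)
puts negotiate("application/atom+xml, text/html;q=0.5", available)
```

With this approach the URL carries no hint of the format; the choice depends entirely on what the client declares in its request headers.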
The relative merits of implementing content negotiation and alternate URLs are interesting, but more significant is that they represent two fundamentally different approaches to the parallel Web, forcing developers to make an important design decision: Whether to provide one URL for the same resource, or whether to provide separate URLs based on the type of response served.
My personal experience implementing a RESTful Rails app is that the one-URL approach, while at first appearing to simplify design, can in reality lead to more complex controller code. The reason has little to do with response formats, but rather with the reality that clients consuming those different formats often make their requests under very different assumptions, and in different contexts.
For example, in the Rails application I developed with the RESTful approach, I decided to take the XML-based representation for a real-world test drive by providing a Java applet as an alternate view of a resource. Processing the XML request and response was easy with Java, using the excellent Apache HttpClient. Requests made by the applet, however, were of course made on their own HTTP connection to the server, so session state associated with the browser's HTTP session was not available to the applet. That required additional controller code to match up the applet's context with that of the browser. Accounting for the different contexts from which clients were making requests led at first to ugly if-then branching in the controller method, which I then had to refactor to make more testable.
One could argue that REST is not supposed to rely on server-side state at all. But many browser-based applications do. So if the same controller method serves requests from regular browsers as well as from external clients, such as an applet or a standalone application, the controller method would have to forgo many benefits of state that make Web applications easy to develop (such as using a session to determine if a user is logged in).
I'm still on the lower slopes of learning the true use patterns of REST and Rails, so I may be missing some obvious wisdom here. But, at this point, it seems to me that providing separate URLs, and possibly controller methods, for requests requiring different types of content leads to simpler controller code.
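As a rough illustration of that point, the following plain-Ruby sketch gives each kind of client its own URL and handler, so that neither handler has to branch on the caller's context. Everything here is hypothetical and invented for the example: the SESSIONS and API_TOKENS tables, the /reports/1 paths, and the show_html, show_xml, and dispatch methods. It is not how Rails routes requests, only a model of the design choice.

```ruby
# Hypothetical in-memory stores; a real app would use its session and user layers.
SESSIONS   = { "sess-1" => "alice" }.freeze
API_TOKENS = { "token-9" => "alice" }.freeze
REPORT     = { title: "Q1" }.freeze

# Browser-facing handler: relies on server-side session state.
def show_html(request)
  user = SESSIONS[request[:session_id]]
  return [302, "/login"] unless user
  [200, "<h1>#{REPORT[:title]} for #{user}</h1>"]
end

# Machine-facing handler: no session; the caller identifies itself per request.
def show_xml(request)
  user = API_TOKENS[request[:token]]
  return [401, "<error>unauthorized</error>"] unless user
  [200, "<report owner=\"#{user}\"><title>#{REPORT[:title]}</title></report>"]
end

# Separate URLs map to separate handlers (paths are illustrative).
ROUTES = {
  "/reports/1"     => ->(request) { show_html(request) },
  "/reports/1.xml" => ->(request) { show_xml(request) }
}.freeze

def dispatch(path, request)
  handler = ROUTES[path]
  handler ? handler.call(request) : [404, "not found"]
end

p dispatch("/reports/1", { session_id: "sess-1" })
p dispatch("/reports/1.xml", { token: "token-9" })
```

Each handler can be tested on its own with the kind of request it expects, at the cost of having two entry points for one resource.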
What are your preferences for the parallel Web: Serve up different types of content at the same URL, or provide different URLs based on the content type of the response?
Frank Sommers is a Senior Editor with Artima Developer. Prior to joining Artima, Frank wrote the Jiniology and Web services columns for JavaWorld. Frank also serves as chief editor of the Web zine ClusterComputing.org, the IEEE Technical Committee on Scalable Computing's newsletter. Prior to that, he edited the Newsletter of the IEEE Task Force on Cluster Computing. Frank is also founder and president of Autospaces, a company dedicated to bringing service-oriented computing to the automotive software market.
Prior to Autospaces, Frank was vice president of technology and chief software architect at a Los Angeles system integration firm. In that capacity, he designed and developed that company's two main products: A financial underwriting system, and an insurance claims management expert system. Before assuming that position, he was a research fellow at the Center for Multiethnic and Transnational Studies at the University of Southern California, where he participated in a geographic information systems (GIS) project mapping the ethnic populations of the world and the diverse demography of southern California. Frank's interests include parallel and distributed computing, data management, programming languages, cluster and grid computing, and the theoretic foundations of computation. He is a member of the ACM and IEEE, and the American Musicological Society.