The Artima Developer Community

Ruby Code & Style
Modular Architectures with Ruby
by Jack Herrington
October 10, 2005

Adding More Biographic Detail

I also have a problem with the get_type method on the factory. In the long run I’m going to want more biographical information on each module: the author, the module version, the description, inputs, outputs, and so on.

Perhaps the easiest way to add biographical information to each module is a YAML-encoded string constant attached to each factory class. The RDF module below shows this:

class RDFParser < Parser
  def parse( xml )
    # Parse the XML up and return some known format
    return nil
  end
end

class RDFFactory < ParserFactory
INFO=<<INFO
type: RDF
author: Jack
description: An RDF parser
INFO

  def create()
    return RDFParser.new()
  end
end

I then add some code to the ParserFactory base class that reads the YAML and implements not only get_type but also get_author, get_description, and anything else I want:

require 'yaml'

class ParserFactory
  # ...
  def get_info()
    return YAML.load( self.class::INFO )
  end

  def get_type()
    return get_info()['type']
  end
  
  def get_author()
    return get_info()['author']
  end

  def get_description()
    return get_info()['description']
  end
  # ...
end

The code to get the constant from the subclass is pretty simple. The get_info method asks for the class of the current object with self.class and then reads that class’s INFO constant.
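
To see why the lookup goes through self.class, here is a minimal sketch built on a trimmed-down copy of the classes above. AtomFactory and the bare_lookup method are hypothetical additions made only for illustration. The sketch assumes nothing beyond the standard yaml library.

```ruby
require 'yaml'

# Trimmed-down copy of the article's base class, for illustration only.
class ParserFactory
  # self.class is the concrete subclass, so this reads that subclass's INFO.
  def get_info
    YAML.load( self.class::INFO )
  end

  def get_type
    get_info['type']
  end

  # Hypothetical method: a bare INFO is resolved against ParserFactory
  # itself, which defines no such constant, so Ruby raises NameError.
  # The error is rescued here only to make the point visible.
  def bare_lookup
    INFO
  rescue NameError
    nil
  end
end

class RDFFactory < ParserFactory
  INFO = <<~INFO
    type: RDF
    author: Jack
    description: An RDF parser
  INFO
end

# Hypothetical second module, added only to show the per-class lookup.
class AtomFactory < ParserFactory
  INFO = <<~INFO
    type: Atom
    author: Example Author
    description: A made-up Atom parser
  INFO
end

puts RDFFactory.new.get_type    # prints RDF
puts AtomFactory.new.get_type   # prints Atom
p RDFFactory.new.bare_lookup    # prints nil
```

Each factory instance reports the metadata of its own class, while the bare constant reference never reaches the subclass. A factory subclass that forgets to define INFO would raise NameError from get_info, so a host application may want to check for the constant when it loads a module.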

Getting the Job Done

Having gone through all of the effort to build a modular architecture that reads various feed formats, it only seems fitting to actually implement one of them.

First the test code needs to actually get some RSS data:

require "net/http"
require "parse_mods.rb"
require "rexml/document"

ParserFactory.load( "mods" )

rssp = ParserFactory.parser_for( "RSS" );

items = []

Net::HTTP.start( 'rss.cnn.com' ) { |http|
  rss = http.get( '/rss/cnn_topstories.rss' )
  doc = REXML::Document.new( rss.body )
  items = rssp.parse( doc )
}

items.each { |i|
  print "#{i.title}\n";
  print "#{i.link}\n\n";
}

This code starts by loading the modules and then gets a parser for RSS. It fetches the RSS feed from CNN and builds an REXML DOM from it. That DOM goes to the parser, which returns an array of OpenStruct objects that hold each item’s title, link, and description.

The code for the real parser module is below:

require 'ostruct'

class RSSParser < Parser
  def parse( xml )
    items = []
    xml.each_element( '//item' ) { |item|
      link = ""
      description = ""
      title = ""
      item.each_element( 'link' ) { |l| link = l.text.to_s }
      item.each_element( 'description' ) { |l| description = l.text.to_s }
      item.each_element( 'title' ) { |l| title = l.text.to_s }
      items << OpenStruct.new(
        :link => link,
        :description => description,
        :title => title )
    }

    return items
  end
end

class RSSFactory < ParserFactory
INFO=<<INFO
type: RSS
author: Jack
description: An RSS parser
INFO

  def create()
    return RSSParser.new()
  end
end

It’s pretty simple. The code first iterates through all of the item tags, then within each item tag it finds the link, title, and description tags. From these it creates an OpenStruct object (part of the standard Ruby installation) and adds it to the items array, which it returns.

The output on the day I wrote this article looks like this:

% ruby test.rb
Pumps begin draining New Orleans
http://www.cnn.com/rssclick/2005/US/09/05/katrina.impact/index.html?section=cnn_topstories

Violence rages in Iraq hotspots
http://www.cnn.com/rssclick/2005/WORLD/meast/09/05/iraq.main/index.html?section=cnn_topstories

Rehnquist to lie in repose at Supreme Court
http://www.cnn.com/rssclick/2005/POLITICS/09/05/rehnquist.funeral.ap/index.html?section=cnn_topstories

Castro: U.S. hasn't answered aid offer
http://www.cnn.com/rssclick/2005/WORLD/americas/09/05/katrina.cuba/index.html?section=cnn_topstories

Indonesia jet crash kills 147
http://www.cnn.com/rssclick/2005/WORLD/asiapcf/09/05/indonesia.plane.update.ap/index.html?section=cnn_topstories

Copter drops concrete on cable car in Austria
http://www.cnn.com/rssclick/2005/WORLD/europe/09/05/austria.cablecar/index.html?section=cnn_topstories

There are several ways you could extend this code. One option would be to make two passes over the modules. In the first pass you hand the REXML document to each parser to see whether it wants to handle it. In the second pass you hand the document to the parser that claims it can handle it properly. That way the application doesn’t have to know the format of any particular feed.
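
The two-pass idea could be sketched roughly as follows. Everything here is hypothetical: the can_parse? method, the parse_any helper, and the Feed struct are not part of the code in this article. Feed is a plain stand-in for the parsed document so that the sketch runs without a network connection or an XML library. A real host would hand over the REXML document and let each parser inspect its root element.

```ruby
# Hypothetical stand-in for a parsed feed document.
Feed = Struct.new( :root_name, :body )

class Parser
  # Pass one: a parser says whether it recognises the document.
  def can_parse?( _feed )
    false
  end

  # Pass two: the chosen parser does the real work.
  def parse( _feed )
    raise NotImplementedError
  end
end

class RSSParser < Parser
  def can_parse?( feed )
    feed.root_name == 'rss'        # RSS 2.0 documents have an <rss> root
  end

  def parse( feed )
    [ :rss, feed.body ]            # placeholder for real item extraction
  end
end

class RDFParser < Parser
  def can_parse?( feed )
    feed.root_name == 'rdf:RDF'    # RSS 1.0 documents have an <rdf:RDF> root
  end

  def parse( feed )
    [ :rdf, feed.body ]            # placeholder for real item extraction
  end
end

# Ask each parser in turn, then hand the document to the first taker.
# Returns nil when no parser claims the document.
def parse_any( feed, parsers )
  parser = parsers.find { |candidate| candidate.can_parse?( feed ) }
  return nil unless parser
  parser.parse( feed )
end

parsers = [ RSSParser.new, RDFParser.new ]
p parse_any( Feed.new( 'rss', 'cnn' ), parsers )       # prints [:rss, "cnn"]
p parse_any( Feed.new( 'rdf:RDF', 'blog' ), parsers )  # prints [:rdf, "blog"]
p parse_any( Feed.new( 'feed', 'atom' ), parsers )     # prints nil
```

Taking the first parser that answers yes keeps the sketch short. If several modules could claim the same document, the host would need some tie-breaking rule, such as load order or a priority declared by each module.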

Recommendations

Here are some tips for potential modular architecture builders:

Use the architecture for the application itself
Don’t just reserve the modular architecture for user-contributed modules. If the modular architecture extends user notifications, write all of the notification code as modules. This ensures that the API is thorough and tested.
For complex modules, use directories
If you expect that modules are going to be complex or have lots of associated assets, put the modules into their own directories. This will make it easier to maintain and version them.
Handle pathing
With dynamic languages pathing can be a problem. I recommend altering the path to add the directory that contains the module code before loading the module. That will allow the module to require in its own code. The module writer should never be expected to write all of their code in one file, or to handle their own include path.
Provide a callback object
If the relationship between the module and the host application is bi-directional then the host application should pass in a proxy object that provides access to the functionality required by the module. This will allow the application code to change form as long as the proxy object API remains the same. It also provides a clear contract between the module and the application which will allow other applications to re-use the same modules.
Version
Version both the modules and the API. The code shown here doesn’t do that since I wanted to keep it simple. But for any production code you should support version numbers and only attempt to work with modules that support the current version number or earlier versions.
Host applications should handle portability and pathing
The host application should handle any path manipulation or portability work for the modules. This will ensure that modules can run on any operating system or environment without additional code.
Keep it simple
The role and life-cycle of a module should be very well defined within the system. And that role should be fairly well constrained. It’s far better to have several module standards that work with various portions of the system rather than one über module that has access to everything. Such modules are too easily broken when the host application changes its functionality during an upgrade.
Be aware of the complexity cost
Creating a modular application opens up a world of possibility for your application. But that flexibility always comes at a complexity cost. Creating a full-featured module development environment means building quality APIs that are easy to understand and are flexible enough to handle most potential use cases. It also means putting in enough debugging and error handling support to make it easy to develop modules. All of that is time and effort and it’s worth ensuring that the system will be used before going through what it takes to develop it completely.
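
Three of these tips (pathing, the callback object, and versioning) can be combined in one small loader. The sketch below is only one possible shape, under stated assumptions: HOST_API_VERSION, HostProxy, ModuleRegistry, load_module_dir, and the init.rb entry-file convention are all hypothetical names, and none of them come from the code in this article. The demo writes a throwaway two-file module into a temporary directory so that the sketch runs on its own.

```ruby
require 'tmpdir'

HOST_API_VERSION = 2 # hypothetical version of the host's module API

# Hypothetical proxy: the only part of the host that a module ever sees.
class HostProxy
  attr_reader :messages

  def initialize
    @messages = []
  end

  def log( text )
    @messages << text
  end
end

# Hypothetical registry that a module's entry file calls when it is loaded.
module ModuleRegistry
  @entries = []

  class << self
    attr_reader :entries

    # Accept modules written for this API version or an earlier one.
    def register( name, api_version, &setup )
      return false if api_version > HOST_API_VERSION
      @entries << { :name => name, :setup => setup }
      true
    end
  end
end

# Put the module's directory on the load path while its entry file loads,
# so the module can require its own files without knowing where it lives.
def load_module_dir( dir, entry = 'init.rb' )
  $LOAD_PATH.unshift( dir )
  load File.join( dir, entry )
ensure
  $LOAD_PATH.delete( dir )
end

# Demo: build a throwaway two-file module, load it, then run its setup.
Dir.mktmpdir do |dir|
  File.write( File.join( dir, 'demo_helper.rb' ),
              "DEMO_GREETING = 'hello from helper'\n" )
  File.write( File.join( dir, 'init.rb' ), <<~RUBY )
    require 'demo_helper'
    ModuleRegistry.register( 'demo', 1 ) { |host| host.log( DEMO_GREETING ) }
  RUBY
  load_module_dir( dir )
end

proxy = HostProxy.new
ModuleRegistry.entries.each { |entry| entry[:setup].call( proxy ) }
puts proxy.messages   # prints hello from helper
```

The module's entry file requires its helper by name and never touches the load path itself. It also talks to the host only through the proxy. Because the directory is removed from the load path again after loading, a module written against this sketch has to require all of its files up front. A host that wants to allow lazy requires would leave the directory on the path instead.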

I could easily write several articles on recommendations for modular architectures alone. I’ve written a few modular architectures, and they have been more or less successful. I have also written modules for various modular architectures and have seen what works and what doesn’t. The common element in all successful modular architectures is thoughtfulness: in the design of the API, in the care used in creating it, and in mentoring those who use it.

Conclusion

Modular architectures provide an opportunity for your customers to extend your application for their environment. For complex or highly customizable applications this can be a primary requirement. Ruby's facilities for dynamic code loading make modular APIs convenient to write.

About the Author

Jack Herrington is the author of Code Generation in Action (Manning, 2002) and Podcasting Hacks (O’Reilly, 2005), as well as over thirty articles. He is currently working on PHP Hacks to be released later this year. He is a full-time engineer with over twenty years of experience. Not all of which has been with Ruby. ;-) He lives with his wife Lauren and daughter Megan in the San Francisco Bay Area. He feels odd writing about himself in the third person.
