I also have a problem with the get_type method on the factory. I think that in the long run I’m going to want more biographical information on each module. For example, the author, the module version, the description, inputs, outputs, etc.
Perhaps the easiest way to add biographical information to each module would be with a YAML-encoded constant string attached to each factory class, as the RDF module below shows:
class RDFParser < Parser
  def parse( xml )
    # Parse the XML up and return some known format
    return nil
  end
end

class RDFFactory < ParserFactory
  INFO=<<INFO
type: RDF
author: Jack
description: An RDF parser
INFO
  def create()
    return RDFParser.new()
  end
end
I then add some code to the ParserFactory base class that reads the YAML and implements not only get_type but also get_author, get_description, and anything else I want:
require 'yaml'

class ParserFactory
  ...
  def get_info()
    return YAML.load( self.class::INFO )
  end
  def get_type()
    return get_info()['type']
  end
  def get_author()
    return get_info()['author']
  end
  def get_description()
    return get_info()['description']
  end
  ...
end
The code to get the constant from the subclass is pretty simple. The get_info method gets the class of the current object and then looks up that class’s INFO constant.
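As a self-contained sketch of that lookup, the following defines a stripped-down base class and one subclass. The names SketchFactory and AtomSketchFactory and the YAML values are hypothetical and exist only for this illustration. The point is that self.class::INFO resolves against the subclass of the object that receives the call.

```ruby
require 'yaml'

# Stripped-down base class for illustration only (hypothetical name).
class SketchFactory
  # self.class is the concrete subclass, so ::INFO finds that subclass's constant.
  def get_info
    YAML.load(self.class::INFO)
  end

  def get_type
    get_info['type']
  end
end

# Hypothetical subclass carrying its own YAML description.
class AtomSketchFactory < SketchFactory
  INFO = <<INFO
type: Atom
author: Example
INFO
end

factory = AtomSketchFactory.new
puts factory.get_type
puts factory.get_info['author']
```

Because the base class never names a subclass, a new factory only has to define its own INFO constant to pick up every accessor.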
Having gone through all of the effort to build a modular architecture that reads various feed formats, it only seems fitting to actually implement one of them.
First the test code needs to actually get some RSS data:
require "net/http"
require "parse_mods.rb"
require "rexml/document"   # lower case: require paths are case-sensitive on most systems

ParserFactory.load( "mods" )
rssp = ParserFactory.parser_for( "RSS" )

items = []
Net::HTTP.start( 'rss.cnn.com' ) { |http|
  rss = http.get( '/rss/cnn_topstories.rss' )
  doc = REXML::Document.new( rss.body )
  items = rssp.parse( doc )
}

items.each { |i|
  print "#{i.title}\n"
  print "#{i.link}\n\n"
}
This code starts with loading the modules. The code then gets a parser for RSS. It loads the RSS from CNN and creates an REXML DOM model from it. That DOM model goes to the parser which creates an array of object structures that hold the title, link, and description.
The code for the real parser module is below:
require 'ostruct'

class RSSParser < Parser
  def parse( xml )
    items = []
    xml.each_element( '//item' ) { |item|
      link = ""
      description = ""
      title = ""
      item.each_element( 'link' ) { |l| link = l.text.to_s }
      item.each_element( 'description' ) { |l| description = l.text.to_s }
      item.each_element( 'title' ) { |l| title = l.text.to_s }
      items << OpenStruct.new(
        :link => link,
        :description => description,
        :title => title )
    }
    return items
  end
end
class RSSFactory < ParserFactory
  INFO=<<INFO
type: RSS
author: Jack
description: An RSS parser
INFO
  def create()
    return RSSParser.new()
  end
end
It’s pretty simple. The code first iterates through all of the item tags, then within each item tag it finds the link, title, and description tags. With each of these it creates an OpenStruct object (part of the standard Ruby installation) and adds it to an array of articles which it returns.
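To try the same traversal without a network connection, here is a minimal sketch that runs it against an inline feed. The feed text, the SAMPLE_FEED constant, the FeedItem struct, and the extract_items helper are all assumptions made for this example. A plain Struct stands in for OpenStruct.

```ruby
require 'rexml/document'

# Hypothetical two-item feed used only for this sketch.
SAMPLE_FEED = <<FEED
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
      <description>One</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>
FEED

# A plain Struct stands in for OpenStruct here.
FeedItem = Struct.new(:title, :link, :description)

# Same traversal as the parser above: find every item, then read its children.
def extract_items(doc)
  items = []
  doc.each_element('//item') do |item|
    # A missing child element leaves the field as an empty string.
    fields = %w[title link description].map do |name|
      text = ''
      item.each_element(name) { |el| text = el.text.to_s }
      text
    end
    items << FeedItem.new(*fields)
  end
  items
end

items = extract_items(REXML::Document.new(SAMPLE_FEED))
items.each { |i| puts "#{i.title} -> #{i.link}" }
```

The second item has no description tag, so its description comes back empty rather than raising an error, which matches the defaults the parser above sets before it looks at each tag.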
The output on the day I wrote this article looks like this:
% ruby test.rb
Pumps begin draining New Orleans
http://www.cnn.com/rssclick/2005/US/09/05/katrina.impact/index.html?section=cnn_topstories

Violence rages in Iraq hotspots
http://www.cnn.com/rssclick/2005/WORLD/meast/09/05/iraq.main/index.html?section=cnn_topstories

Rehnquist to lie in repose at Supreme Court
http://www.cnn.com/rssclick/2005/POLITICS/09/05/rehnquist.funeral.ap/index.html?section=cnn_topstories

Castro: U.S. hasn't answered aid offer
http://www.cnn.com/rssclick/2005/WORLD/americas/09/05/katrina.cuba/index.html?section=cnn_topstories

Indonesia jet crash kills 147
http://www.cnn.com/rssclick/2005/WORLD/asiapcf/09/05/indonesia.plane.update.ap/index.html?section=cnn_topstories

Copter drops concrete on cable car in Austria
http://www.cnn.com/rssclick/2005/WORLD/europe/09/05/austria.cablecar/index.html?section=cnn_topstories
There are several ways you could extend this code. One option would be a two-phase pass over the modules. In the first pass you hand the REXML document to each parser to see whether it wants to handle it. In the second pass you hand the document to the parser that thinks it can handle it properly. That way the application doesn’t actually have to know the format of any particular feed.
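A minimal sketch of that two-phase idea follows. The can_parse? method, the SniffingRSSParser and SniffingAtomParser classes, and the pick_parser helper are hypothetical names, not part of the code above. Checking the root element name is only one plausible way for a parser to decide.

```ruby
require 'rexml/document'

# Hypothetical parsers that can say whether a document looks like their format.
class SniffingRSSParser
  def can_parse?(doc)
    !doc.root.nil? && doc.root.name == 'rss'
  end

  # Returns just the item titles to keep the sketch short.
  def parse(doc)
    titles = []
    doc.each_element('//item/title') { |t| titles << t.text.to_s }
    titles
  end
end

class SniffingAtomParser
  def can_parse?(doc)
    !doc.root.nil? && doc.root.name == 'feed'
  end

  def parse(doc)
    []  # Actual Atom parsing is omitted in this sketch.
  end
end

# Phase one: ask each parser in turn and return the first one that accepts.
def pick_parser(parsers, doc)
  parsers.find { |parser| parser.can_parse?(doc) }
end

parsers = [SniffingAtomParser.new, SniffingRSSParser.new]
doc = REXML::Document.new(
  '<rss><channel><item><title>Hello</title></item></channel></rss>'
)

# Phase two: hand the document to whichever parser accepted it.
parser = pick_parser(parsers, doc)
puts parser.class
puts parser.parse(doc).inspect
```

pick_parser returns nil when no parser accepts the document, so a caller would need to decide what to do with an unrecognized feed.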
Here are some tips for potential modular architecture builders:
I could easily write several articles with just recommendations for modular architectures alone. I’ve written a few and they have been more or less successful. I have also written to various modular architectures and have seen what works and what doesn’t. The common element in all successful modular architectures is thoughtfulness. Thoughtfulness in the design of the API, as well as in the care used in creating it and in mentoring those that use the API.
Modular architectures provide an opportunity for your customers to extend your application for their environment. For complex or highly customizable applications this can be a primary requirement. Ruby's facilities for dynamic code loading make modular APIs convenient to write.
Have an opinion about modular APIs in Ruby? Discuss this article in the Articles Forum topic, Modular Architectures with Ruby.
Jack Herrington is the author of Code Generation in Action (Manning, 2002) and Podcasting Hacks (O’Reilly, 2005), as well as over thirty articles. He is currently working on PHP Hacks to be released later this year. He is a full-time engineer with over twenty years of experience. Not all of which has been with Ruby. ;-) He lives with his wife Lauren and daughter Megan in the San Francisco bay area. He feels odd writing about himself in the third person.