I received a request for a screencast on XPath earlier this week - I have one queued up for next week, but I thought I'd go through a small example in the meantime. I created a couple of Smalltalk examples based on the XPath stuff here. You'll want to load the XPath parcel from the Parcels directory first. Below you'll see the XML file they use:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
So here's the first example - using an XPath expression, you want to get all the book titles from the document. Here's the expression to use (going into XPath syntax is beyond the scope of this post):
string := '/bookstore/book/title'.
expression := XPathParser new parse: string as: #expression.
That will give you an XPath expression. Now assuming you've saved the XML in 'bookstore.xml', you use this code to get an XML document:
parser := XMLParser new.
parser validate: false.
doc := parser parse: 'bookstore.xml' asFilename contentsOfEntireFile readStream.
You can also hand the XML parser URIs, if you want to grab the file from a website. Now you're ready to apply the expression:
result := expression
    xpathValueFor: doc root
    variables: Dictionary new.
result sortedNodes do: [:each |
    Transcript show: each xpathStringData.
    Transcript cr].
The result should be an XPathNodeContext. Iterating over its nodes gives you each matching one (in this case, titles). For the example, I'm simply dumping them to the Transcript. This code will work with any valid XPath expression; for instance, this one:
string := '/bookstore/book[1]/title'.
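Since the same steps apply to any expression, you could also run a handful of them against one parsed document. What follows is only a rough sketch for a workspace, not something from the XPath parcel itself: it reuses the messages already shown in this post, and the third expression string is just an extra illustration I've added.

```smalltalk
| parser doc expressions |
"Parse the document once, exactly as above."
parser := XMLParser new.
parser validate: false.
doc := parser parse: 'bookstore.xml' asFilename contentsOfEntireFile readStream.

"The first two strings come from this post; the third is an added illustration."
expressions := #('/bookstore/book/title' '/bookstore/book[1]/title' '/bookstore/book/author').

expressions do: [:string |
    | expression result |
    expression := XPathParser new parse: string as: #expression.
    result := expression
        xpathValueFor: doc root
        variables: Dictionary new.
    "Print the expression, then each match indented beneath it."
    Transcript show: string; cr.
    result sortedNodes do: [:each |
        Transcript tab; show: each xpathStringData; cr]].
```

Parsing once and reusing doc keeps the loop cheap; only the expression changes on each pass.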
Now, it turns out that you can simplify this a little. If you go to the Public Store Repository and load XPath-Convenience, you can then do these two examples like this:
string := '/bookstore/book/title'.
expression := XPathParser new parse: string as: #expression.
parser := XMLParser new.
parser validate: false.
doc := parser parse: 'bookstore.xml' asFilename contentsOfEntireFile readStream.
results := doc xpathLocate: expression.
results do: [:each |
    Transcript show: each xpathStringData.
    Transcript cr].
and:
string := '/bookstore/book[1]/title'.
expression := XPathParser new parse: string as: #expression.
parser := XMLParser new.
parser validate: false.
doc := parser parse: 'bookstore.xml' asFilename contentsOfEntireFile readStream.
results := doc xpathLocate: expression.
results do: [:each |
    Transcript show: each xpathStringData.
    Transcript cr].
That cuts out a fair bit of repetitive code; the rest can easily be packaged up into your own convenience method in an application.
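As one possible way to do that packaging, here's a minimal sketch written as a workspace block, so it doesn't assume anything about your own classes. The name stringsAt is made up for illustration. It sticks to the messages shown earlier in this post (so it doesn't need XPath-Convenience), and it assumes that whatever sortedNodes answers responds to collect: as well as do:.

```smalltalk
| stringsAt titles |
"Hypothetical helper: parse the named file, evaluate the XPath string,
 and answer the string data of each matching node."
stringsAt := [:fileName :xpathString |
    | parser doc expression result |
    parser := XMLParser new.
    parser validate: false.
    doc := parser parse: fileName asFilename contentsOfEntireFile readStream.
    expression := XPathParser new parse: xpathString as: #expression.
    result := expression
        xpathValueFor: doc root
        variables: Dictionary new.
    "Assumption: sortedNodes answers a collection that understands collect:."
    result sortedNodes collect: [:each | each xpathStringData]].

"Usage: fetch the titles, then show them on the Transcript."
titles := stringsAt value: 'bookstore.xml' value: '/bookstore/book/title'.
titles do: [:each | Transcript show: each; cr].
```

In an application you'd probably turn the block into a method on one of your own classes, and parse the document once rather than on every call.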
The book[1] expression will give you the title of the first book in the document. That should be enough to get you started; I'll have a screencast covering this soon.