This post originated from an RSS feed registered with Scala Buzz
by David Bernard.
Original Post: Michael Galpin on Scala and XML, and some notes on xml.pull
Feed Title: Scala Blog
Feed URL: http://www.scala-blogs.org/feeds/posts/default?alt=rss
I wish he had also gone into the scala.xml.pull package, but then I really should write documentation for it myself instead of leaving this to others.
The point of pull parsing is, of course, to avoid building up an in-memory representation of the XML. We might want to avoid it because
the XML data is just too large, or
because we know that we are going to throw most of it away (garbage) and need the performance gain implied by not even allocating the garbage.
There are alternatives, for instance implementing the SAX-like MarkupHandler. However, MarkupHandler and SAX are examples of push parsing, and sometimes dealing with the sequence of events explicitly is more lucid than having some Handler class and managing the handler's state with lots of control variables.
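As a rough illustration of dealing with the event sequence explicitly, here is a minimal pull-style sketch. It uses the JDK's StAX reader (javax.xml.stream) as a stand-in pull API so that it needs no extra library; the names PullTitles, readText and titles, and the sample document in the usage note, are made up for this example. Note how "we are inside a title element" is expressed by which method is currently pulling events, not by a flag on a handler.

```scala
import java.io.StringReader
import javax.xml.stream.{XMLInputFactory, XMLStreamConstants, XMLStreamReader}

object PullTitles {
  // Pull events until the next end tag, collecting character data on the way.
  // Assumes the element contains text only (no nested elements).
  private def readText(r: XMLStreamReader): String = {
    val sb = new StringBuilder
    var done = false
    while (!done && r.hasNext) {
      r.next() match {
        case XMLStreamConstants.CHARACTERS  => sb.append(r.getText)
        case XMLStreamConstants.END_ELEMENT => done = true
        case _                              => // skip comments and the like
      }
    }
    sb.toString
  }

  // Walk the document once and keep only the text of <title> elements.
  // Everything else is pulled and dropped without building a tree.
  def titles(xml: String): List[String] = {
    val r = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml))
    val out = List.newBuilder[String]
    try {
      while (r.hasNext) {
        r.next() match {
          case XMLStreamConstants.START_ELEMENT if r.getLocalName == "title" =>
            out += readText(r)
          case _ => // not interesting, keep pulling
        }
      }
    } finally r.close()
    out.result()
  }
}
```

Calling PullTitles.titles on a string such as "<books><book><title>A</title><price>1</price></book></books>" should return only the title text. A push-style handler doing the same job would typically need a boolean field to remember whether the parser is currently inside a title.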
In the absence of more elaborate documentation, here is the example from the scaladoc comment in XMLEventReader.scala and from the test, in the hope that they provide some insight into what pull parsing is. I should really give a more thorough example, but then there are excellent articles on the net describing pull parsing (e.g. check out Elliotte Rusty Harold on StAX, from just a couple of years ago). That article also shows, IMHO, that pulling XML events is even more useful and readable when used with pattern matching. I will let somebody else drive that point home; here, for now, is the scaladoc.
A pull parser that offers to view an XML document as a series of events.
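import scala.io.Source
import scala.xml.pull.XMLEventReader

object reader {
  // The XML literal was lost in syndication; restored to match the comments below.
  val src = Source.fromString("<hello><world/></hello>")
  // Later releases of the library take the source directly: new XMLEventReader(src)
  val er = new XMLEventReader().initialize(src)

  def main(args: Array[String]) {
    Console.println(er.next) // print event for start tag hello
    Console.println(er.next) // print event for start tag world
    // ...
  }
}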
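The remark about pattern matching can be sketched with the same reader. The events in scala.xml.pull are case classes (EvElemStart, EvElemEnd, EvText and others), so they can be taken apart directly in a match. This is only a sketch under two assumptions: it uses the later constructor form new XMLEventReader(src), as found in scala-xml 1.x (the pull package was removed in scala-xml 2.0), and the names DescribeEvents, describe and events are invented for the example.

```scala
import scala.io.Source
import scala.xml.pull.{EvElemEnd, EvElemStart, EvText, XMLEvent, XMLEventReader}

object DescribeEvents {
  // Turn one pulled event into a short description by matching on its case class.
  def describe(ev: XMLEvent): String = ev match {
    case EvElemStart(_, label, _, _) => "start " + label
    case EvElemEnd(_, label)         => "end " + label
    case EvText(text)                => "text " + text
    case other                       => "other " + other // comments, entity refs, ...
  }

  // Pull every event from the document and describe it.
  // Assumes the scala-xml 1.x constructor that takes the Source directly.
  def events(xml: String): List[String] = {
    val er = new XMLEventReader(Source.fromString(xml))
    er.map(describe).toList
  }

  def main(args: Array[String]): Unit =
    events("<hello><world/></hello>").foreach(println)
}
```

The reader is an ordinary Iterator of events, so map, filter and friends work on it as usual, and the match stays the only place that needs to know about the individual event types.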