My example for xe was a straightforward port of the test code, but really in xe you would probably not be building your XML structures by hand very often. Usually you would make a class to do it for you. With a class, you can set up sensible default values, check values to make sure they are legal, and so on.
I coded up a couple of classes to demonstrate this. See below. Like my first example, this is tested code; if you download xe you can run this.
One other thing about xe: I'm proud of the way it handles reading in XML data. You create an XML data structure, and you call the .import_xml() method. This then reads in the XML data, and tries to match things up. Where there is a match, it puts the value in your data structure; if there is no match, it will add a member to your data structure and put the data in there. Basically, you just describe the XML data you expect, and if your description is good, it will magically Just Work. This is especially cool because you can describe things like an Atom feed that have a list of 0 or more identical elements, and that will work too!
The .import_xml() method is based on openAnything() by Mark Pilgrim. It accepts a file-like object, a filename, a URL, or a string.
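To make the match-or-add idea concrete, here is a minimal sketch using only the standard library. This is not xe's code or API: `Record` and `import_into` are hypothetical names made up for the illustration, and the sketch only looks at the direct children of the root element.

```python
import xml.etree.ElementTree as ET


class Record:
    """Hypothetical stand-in for a described XML structure (not xe's API)."""

    def __init__(self, **expected):
        # Members the caller expects to find, with their default values.
        self.__dict__.update(expected)


def import_into(record, xml_text):
    """Copy the root's child elements into `record`.

    Returns the names of members that had to be added because the
    description did not mention them.
    """
    root = ET.fromstring(xml_text)
    added = []
    for child in root:
        value = (child.text or "").strip()
        if not hasattr(record, child.tag):
            # No match: add a new member and put the data there.
            added.append(child.tag)
            setattr(record, child.tag, value)
        elif isinstance(getattr(record, child.tag), list):
            # Described as "0 or more identical elements": collect them all.
            getattr(record, child.tag).append(value)
        else:
            # Plain match: overwrite the default.
            setattr(record, child.tag, value)
    return added


rec = Record(title="", entry=[])
added = import_into(
    rec,
    "<feed><title>News</title><entry>a</entry><entry>b</entry>"
    "<rights>CC</rights></feed>",
)
```

After this runs, `rec.title` and `rec.entry` hold the matched data, and `added` names the one member (`rights`) that was not in the description.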
However, namespaces throw a monkey wrench right now. If you are reading in, say, an Atom feed, and you have bound "a" to the Atom namespace, then "a:title" should match with the "title" member in the data structure; right now that doesn't work at all.
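For what it's worth, one way the matching could be made namespace-tolerant is to compare on the local name only. The sketch below uses the standard library's ElementTree rather than xe, and `local_name` is a hypothetical helper. Note that it ignores the namespace entirely, so two elements with the same local name in different namespaces would collide; a stricter version would compare the URI as well.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace-uri}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


# "a" is bound to the Atom namespace, as in the case described above.
doc = (
    '<a:feed xmlns:a="http://www.w3.org/2005/Atom">'
    "<a:title>News</a:title>"
    "</a:feed>"
)
root = ET.fromstring(doc)

# ElementTree reports the qualified tag; the local name is what a
# "title" member in the data structure would be matched against.
names = [local_name(child.tag) for child in root]
```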
Here's the new sample code.
import xe

lst_valid_carriers = ["FDXE", "UPS", "USPS"]

class CarrierCode(xe.TextElement):
    def __init__(self, carr_code):
        if carr_code is None:
            # Sensible default: the first valid carrier in the list.
            carr_code = lst_valid_carriers[0]
        elif carr_code not in lst_valid_carriers:
            # Reject anything that is not a legal carrier code.
            s = ", ".join(lst_valid_carriers)
            raise ValueError("carrier code must be one of: " + s)
        xe.TextElement.__init__(self, "CarrierCode", carr_code)
> With a class, you can set up sensible default values, check values to make sure they are legal
It seems like a number of the contributors to this thread are at a level of "schema awareness" similar to mine before joining CSIRO a couple of years ago.
With schema-aware processing tools there is no reason for user code to be setting up defaults or checking values. OK, there might be a good reason to have code generating XML that writes defaults rather than relying on the schema's declared defaults, but certainly value type and range-checking should be in the schema.
> With schema-aware processing tools there is no reason for user code to be setting up defaults or checking values.
Is there a book or web page you recommend for learning more about this?
> OK, there might be a good reason to have code generating XML that writes defaults rather than relying on the schema's declared defaults
Example: a user's FedEx class that defaults to the user's account number and other user-specific shipping details.
> but certainly value type and range-checking should be in the schema.
It might also be faster to have the code "know" the values to check rather than having to parse the schema each time you run your program... especially for trivial programs. That's just a guess, though, and I could be wrong.
There's a lot of material about design patterns for XML Schemas on the wiki of XMML, an international collaboration (the working name XMML has since been changed to GeoSciML, partly due to confusion and a squatter on xmml.com).
I am not saying W3C XML Schema is particularly good, but it is sufficiently powerful and usable for rich data descriptions. One of the biggest headaches is that it allows more realistically flexible data descriptions than programming languages are easily able to deal with (that's one of the things I'm hoping to fix with CEDSimply).
(But please, no more "have you heard of XXX project, it's really wonderful". I'm sure it is, but unless it's something that is doing it the same way I am, it is not really relevant. If someone out there already has this idea and a more mature codebase, I'd be pleased to drop this and contribute to that.)
In fact, what I'm doing is collecting the classes defined within the XMLModel subclass, and instantiating them when an instance of the XMLModel is created, so all those sub classes become composite objects.
So you can create more than one instance and use them without conflict.
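Roughly, the mechanism looks like the sketch below. This is a simplified illustration rather than the actual code: `Field` and `Shipment` are hypothetical names invented for the example, and only `XMLModel` comes from the description above.

```python
class Field:
    """Hypothetical base class for the parts declared inside a model."""

    def __init__(self):
        self.value = None


class XMLModel:
    def __init__(self):
        # Walk the class attributes (inherited ones included) and give this
        # instance its own object for every nested Field subclass.
        cls = type(self)
        for name in dir(cls):
            attr = getattr(cls, name)
            if isinstance(attr, type) and issubclass(attr, Field):
                # The instance attribute shadows the class of the same name.
                setattr(self, name, attr())


class Shipment(XMLModel):
    class carrier(Field):
        pass

    class weight(Field):
        pass


first = Shipment()
second = Shipment()
first.carrier.value = "UPS"
second.carrier.value = "FDXE"  # does not disturb first.carrier
```

Because each nested class is instantiated per model instance, `first.carrier` and `second.carrier` are separate objects, while `Shipment.carrier` itself is still the class.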
> It might also be faster to have the code "know" the values to check rather than having to parse the schema each time you run your program... especially for trivial programs.
That sounds a bit like "it might be faster to have the code 'know' the values to check rather than relying on the database schema rules to enforce them." :-)
Schemas are referred to by instance documents, not incorporated. There's no reason why a processing system can't have a schema cached. You could have an architecture where validation against a schema was performed separately before calling user code.
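As a rough sketch of that architecture: the schema is parsed once and cached, validation runs as its own step, and user code only ever sees data that has already passed. To be clear about the assumptions, this is a toy that understands nothing but `xs:enumeration` restrictions, not a real W3C XML Schema validator; the schema text and the function names (`allowed_values`, `validate`, `user_code`) are all made up for the illustration.

```python
import functools
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema; a real system would fetch it from wherever the
# instance document refers to.
SCHEMA_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="CarrierCode">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="FDXE"/>
        <xs:enumeration value="UPS"/>
        <xs:enumeration value="USPS"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>"""


@functools.lru_cache(maxsize=None)
def allowed_values(schema_text):
    """Parse the schema once; map element names to their enumerations."""
    schema = ET.fromstring(schema_text)
    rules = {}
    for element in schema.iter(XS + "element"):
        values = [e.get("value") for e in element.iter(XS + "enumeration")]
        if values:
            rules[element.get("name")] = frozenset(values)
    return rules


def validate(xml_text, schema_text=SCHEMA_TEXT):
    """Raise ValueError if any element's text is outside its enumeration."""
    rules = allowed_values(schema_text)  # cached after the first call
    root = ET.fromstring(xml_text)
    for node in root.iter():
        allowed = rules.get(node.tag)
        if allowed is not None and (node.text or "").strip() not in allowed:
            raise ValueError("%s: illegal value %r" % (node.tag, node.text))
    return root


def user_code(root):
    # No checking here: validation already happened before this was called.
    return root.findtext("CarrierCode")


carrier = user_code(validate("<Ship><CarrierCode>UPS</CarrierCode></Ship>"))
```

A second call to `validate` reuses the cached rules instead of re-parsing the schema, which is the point about caching made above.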
I read somewhere that XML is just reinvented and *terribly* overengineered Lisp. In Lisp, code is data and data is code, so mixing them is a normal thing. Languages in XML? (Jelly, anyone?) Why do you think there are so many dialects of Lisp? :)
I'm guessing, but I think the reference is to the book "Pragmatic Project Automation" by Mike Clark (ISBN 0-9745140-3-9). It contains (on page 29) a page-long explanation by James Davidson (the author of Ant) about why he used XML in Ant. The article title is "The Creator of Ant Exorcizes One of His Demons". The full article ends with the following paragraph...
> ...
> Another example is Ant. The creator of this tool has since apologized for using XML
> ...
>
> Do you have a reference to that? I couldn't find such a public apology on the web.
The creator of Ant writes here about his regrets in (A) using XML and (B) not making Ant more powerful by incorporating enough language constructs. I agree wholeheartedly on both counts, and yet I'm not ready to undertake the project of creating a new build system, much as I would like to have a better one for my own use.