Bill Venners: Do you have any general guidelines for designing an XML schema, designing the data structure?
Elliotte Rusty Harold: The main thing I would say is: grow your documents organically. Try and model the actual content for which you're writing a schema, and see what sort of XML structures come out. Don't start by writing schemas. Start by writing example instance documents, and see what you get.
For example, if you're modeling invoices, pull out a few invoices. Ask yourself, "If I wrote this invoice in XML, what would it look like? That invoice, what would it look like?" If you have a large and representative enough collection of previous documents, in whatever format (paper or electronic), you can get a good start. Then you will gradually discover other documents coming into your system that don't really fit your design. They have a couple of extra fields. One document has two shipping addresses instead of one, so you figure out how to handle that in your schema. Another document has an address that's in the U.K. instead of in the United States, and that has a very different format. So you adjust the schema.
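As a rough illustration of starting from instance documents, the sketch below writes two invoices by hand and then inspects what structure came out. The element names (`invoice`, `shippingAddress`, `zip`, `postcode`, and so on), the sample data, and the helper functions are all hypothetical, invented for this example; they are not taken from the interview or from any real invoice format.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical instance documents; every element name here is an assumption.
US_INVOICE = """
<invoice>
  <customer>Example Corp</customer>
  <shippingAddress>
    <street>12 Main St</street>
    <city>Springfield</city>
    <state>IL</state>
    <zip>62701</zip>
  </shippingAddress>
  <total currency="USD">100.00</total>
</invoice>
"""

# A later document that does not fit the first design:
# two shipping addresses, both in U.K. format.
UK_INVOICE = """
<invoice>
  <customer>Example Ltd</customer>
  <shippingAddress>
    <street>1 High Street</street>
    <city>Oxford</city>
    <postcode>OX1 1AA</postcode>
    <country>UK</country>
  </shippingAddress>
  <shippingAddress>
    <street>2 Mill Lane</street>
    <city>Cambridge</city>
    <postcode>CB1 1AA</postcode>
    <country>UK</country>
  </shippingAddress>
  <total currency="GBP">80.00</total>
</invoice>
"""


def child_counts(xml_text):
    """Count how often each direct child element of the root occurs."""
    root = ET.fromstring(xml_text.strip())
    return Counter(child.tag for child in root)


def address_fields(xml_text):
    """List the sorted field names used inside each shippingAddress element."""
    root = ET.fromstring(xml_text.strip())
    return [sorted(field.tag for field in addr)
            for addr in root.findall("shippingAddress")]
```

Comparing `child_counts` and `address_fields` for the two documents surfaces exactly the kind of mismatch described above: a repeated `shippingAddress`, and address fields that differ between countries.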
If you grow your schemas organically, you gradually figure out how the
documents are likely to be structured. You don't write down in stone up front
that the documents must be structured like this, that all these elements must
be present, that these attributes must not be present if something else is
present, and so on. You let the actual information drive the design, rather than
letting the design constrain what documents you're willing to accept.
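One way to let the documents drive the design is to survey a pile of samples before committing to any constraints. The following is a minimal sketch under stated assumptions: the sample documents, the element names, and the `survey` and `suggest` helpers are hypothetical, and the suggested occurrence bounds are only a starting point that later documents may overturn.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample documents; element names are invented for illustration.
SAMPLES = [
    "<invoice><customer>A</customer><shippingAddress/><total>1</total></invoice>",
    "<invoice><customer>B</customer><shippingAddress/><shippingAddress/>"
    "<total>2</total></invoice>",
    "<invoice><customer>C</customer><shippingAddress/>"
    "<purchaseOrder>7</purchaseOrder><total>3</total></invoice>",
]


def survey(samples):
    """Return {tag: (documents containing tag, max occurrences in one document)}."""
    seen_in = Counter()
    max_occurs = Counter()
    for text in samples:
        counts = Counter(child.tag for child in ET.fromstring(text))
        for tag, n in counts.items():
            seen_in[tag] += 1
            max_occurs[tag] = max(max_occurs[tag], n)
    return {tag: (seen_in[tag], max_occurs[tag]) for tag in seen_in}


def suggest(samples):
    """Suggest (min, max) occurrence bounds from what the samples actually contain."""
    total = len(samples)
    bounds = {}
    for tag, (docs, most) in survey(samples).items():
        minimum = 1 if docs == total else 0        # missing somewhere: treat as optional
        maximum = "unbounded" if most > 1 else 1   # repeated somewhere: allow repeats
        bounds[tag] = (minimum, maximum)
    return bounds
```

Calling `suggest(SAMPLES)` gives one tentative set of bounds per element. Rerunning it as new documents arrive shows where an element that looked required turns out to be optional, or where a single element turns out to repeat.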
Resources
Elliotte Rusty Harold is author of Processing XML with Java: A Guide
to SAX, DOM, JDOM, JAXP, and TrAX, which is available on Amazon.com at:
http://www.amazon.com/exec/obidos/ASIN/020161622X/
XOM, Elliotte Rusty Harold's XML Object Model API:
http://www.cafeconleche.org/XOM/
Cafe au Lait: Elliotte Rusty Harold's site of Java News and Resources:
http://www.cafeaulait.org/
Cafe con Leche: Elliotte Rusty Harold's site of XML News and Resources:
http://www.cafeconleche.org/
JDOM:
http://www.jdom.org/
DOM4J:
http://www.dom4j.org/
SAX, the Simple API for XML Processing:
http://www.saxproject.org/
DOM, the W3C's Document Object Model API:
http://www.w3.org/DOM/
ElectricXML:
http://www.themindelectric.com/exml/
Sparta:
http://sparta-xml.sourceforge.net/
Common API for XML Pull Parsing:
http://www.xmlpull.org/
NekoPull:
http://www.apache.org/~andyc/neko/doc/pull/
Xerces Native Interface (XNI):
http://xml.apache.org/xerces2-j/xni.html
TrAX (Transformation API for XML):
http://xml.apache.org/xalan-j/trax.html
Jaxen (a Java XPath engine):
http://jaxen.org/
RELAX NG:
http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=relax-ng