The Artima Developer Community
Java Community News
Brett McLaughlin: What is XML Really Good For?

81 replies on 6 pages. Most recent reply: Mar 8, 2007 3:37 AM by Antti Tuomi

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 21, 2007 10:55 AM
> Honestly, have you ever considered other structured data
> formats for storing or transmitting data? ("structured"
> excludes .ini files and property files) Specifically, have
> you ever compared XML against ASN.1, JSON/YAML, s-expr,
> XDR, etc? Aside from the fact that there are 300 XML libs
> for every platform, XML never provides any technical
> benefit.

I can see your point but this is the exact technical benefit that matters to most people. It's one of those things that happened to be at the right place at the right time. It's unfortunate but a lot of technologies become dominant in this way.

> The reason is that it was designed as a Markup
> language (for text), not an Interchange language (for
> data).
>
> For example, with XML, it's ambiguous if you should put a
> value in an attribute or as text content:
> <first-name value="Chuck"/>
> <first-name>Chuck</first-name>
>
> The reason it's ambiguous is the designers didn't design
> it to be a data interchange language. As far as they were
> concerned, if you wanted "Chuck" to show up in the
> rendered text, you put it in as text content. If "Chuck"
> is just a property of the markup, you put it in as an
> attribute.

I'm not sure it was ever even 'designed'. Wasn't it developed as a (more-formalized) successor to HTML with a lot of SGML baggage included?

> With a data interchange file, there is no rendered text,
> which is where the ambiguity comes from. (i.e., you're
> trying to do WHAT with my markup language?)

This is all well and good, but if I want to use JSON and the guy on the other end of the phone says "Jason WHO? Can't you just send XML?", guess what I'm going to send...
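
To make the attribute-versus-content ambiguity quoted above concrete, here is a minimal Python sketch using the standard `xml.etree.ElementTree` module. The `first_name` helper and its "attribute first, then text" rule are assumptions made for illustration; nothing in XML itself picks one form over the other, so a tolerant reader ends up handling both.

```python
import xml.etree.ElementTree as ET

def first_name(xml_text):
    """Return the name whether it was stored as an attribute or as text content."""
    elem = ET.fromstring(xml_text)
    # Hypothetical convention: prefer the 'value' attribute, fall back to text.
    value = elem.get("value")
    if value is not None:
        return value
    return (elem.text or "").strip()

# The two encodings from the quoted post.
as_attribute = '<first-name value="Chuck"/>'
as_content = '<first-name>Chuck</first-name>'

print(first_name(as_attribute))
print(first_name(as_content))
```

Both calls yield the same string, which is the point: the two documents carry the same data, and only a convention agreed between sender and receiver says which one is "right".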

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 21, 2007 11:08 AM
> > > Ummm... this is a wild guess, but I'd say that XML is
> > > really good at marking up text documents. Hence the
> > > term, Markup Language.
> >
> > And a 'head-shop' is a place where heads are sold.
>
> Michael's comment sounded a bit simplistic but was exactly
> my thinking. XML syntax is good for what it was originally
> created for. XML comes from SGML which comes from GML and its
> purpose was:

I was just trying to get a little more discussion out of him and throw in some humor (I know that no one thinks I'm funny BTW.)

> XML is for documents. Why everybody wants to stuff all
> data into text documents nowadays is beyond my
> comprehension.

See my other reply to Michael. It's not a choice. It's a decision. In other words, people aren't going out to find the best interchange format and then using it. Everything else has already been mostly eliminated for having too little support.

You could make some really good arguments about why English is not the best language to have this discussion in. Complicated and inconsistent rules. Lots of context-sensitive words. But even if you were able to convince everyone that another language was better for the discussion, it wouldn't matter, because it's highly unlikely that everyone involved in this discussion knows that other language, and everyone does know English.

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 21, 2007 2:16 PM
> XML is for documents. Why everybody wants to stuff all
> data into text documents nowadays is beyond my
> comprehension.

http://en.wikipedia.org/wiki/TJ-2

For those who really think XML is spanking new technology. And as my Pappy used to tell me: "just because everybody you know is jumping off a bridge, doesn't mean you should too".

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 21, 2007 5:47 PM
> > XML is for documents. Why everybody wants to stuff all
> > data into text documents nowadays is beyond my
> > comprehension.
>
> http://en.wikipedia.org/wiki/TJ-2
>
> For those who really think XML is spanking new technology.
> And as my Pappy used to tell me: "just because everybody
> you know is jumping off a bridge, doesn't mean you
> should too".

On one hand, this is kind of like saying don't breathe because the air is polluted.

But in terms of languages adding XML syntax, I completely agree.

Jeff Ratcliff

Posts: 242
Nickname: jr1
Registered: Feb, 2006

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 21, 2007 6:09 PM
> Honestly, have you ever considered other structured data
> formats for storing or transmitting data? ("structured"
> excludes .ini files and property files) Specifically, have
> you ever compared XML against ASN.1, JSON/YAML, s-expr,
> XDR, etc? Aside from the fact that there are 300 XML libs
> for every platform, XML never provides any technical
> benefit. The reason is that it was designed as a Markup
> language (for text), not an Interchange language (for
> data).

I think the appeal of XML relates to ASCII.

Once upon a time there were a number of binary coding schemes to code alphanumeric information. The eventual winner was ASCII and it became so widespread and the tools to read/write/process it so common that some people forgot that it was a binary protocol and started referring to it as human-readable, although clearly it was 01101110 01101111 01110100 00100001

For some odd reason there arose a kind of programming cultural quirk (particularly in the Unix community) that frowned on using any tool for data definition that uses a "non-human-readable" format (i.e. non-ASCII). It was OK to require the user to use new tools (e.g. a browser) but not the developer.

For that reason, something like ASN.1 can't compete with XML because it doesn't use the ASCII binary format.
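
As a small illustration of the "ASCII is a binary protocol too" point, the bit string in the post above can be decoded with a few lines of Python. This is only a sketch: the variable names are mine, and it assumes each space-separated group is one 8-bit ASCII code.

```python
# The bit groups quoted in the post, one byte per group.
bits = "01101110 01101111 01110100 00100001"

# Convert each base-2 group to its integer code, then decode the bytes as ASCII.
codes = [int(group, 2) for group in bits.split()]
text = bytes(codes).decode("ascii")

print(codes)  # [110, 111, 116, 33]
print(text)   # not!
```

The "human-readable" form only appears once a tool that knows the encoding has been applied, which is the same position any other binary format is in.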

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 21, 2007 6:22 PM
> > > XML is for documents. Why everybody wants to stuff all
> > > data into text documents nowadays is beyond my
> > > comprehension.
> >
> > http://en.wikipedia.org/wiki/TJ-2
> >
> > For those who really think XML is spanking new technology.
> > And as my Pappy used to tell me: "just because everybody
> > you know is jumping off a bridge, doesn't mean you
> > should too".
>
> On one hand, this is kind of like saying don't breathe
> because the air is polluted.

More along the line: a turnip makes a lousy version of a carburetor.

And thus: even if the Outside World wants some XML, that doesn't mean you ought to let it infect *your* data system.

>
> But in terms of languages adding XML syntax, I completely
> agree.

nes

Posts: 137
Nickname: nn
Registered: Jul, 2004

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 21, 2007 7:34 PM
What irks me is that the applications for which it is most useful are mostly ignored (thankfully that is changing). Most existing text documents are in proprietary binary (Word) formats or PDF or PostScript or TeX. A standardized XML format would be nicer (I wish ODF were good enough to use for print). Ditto a format for sheet music: MusicXML. Vector graphics: SVG. Technical illustrations and drawings: Visio? There are dozens of document formats on a typical hard drive and many of those could probably benefit from being saved in well-formed XML.

My take on this.

Documents:
XML

Raw text files:
configuration files, source code or programming languages, command files, etc; files that usually have to be edited raw. Use a format or language that is friendly to the eye in vi and notepad. For configuration I happen to like 'ini' files. Possible exception: configuration files multiple pages in length to be modified by a configuration tool (interface).

Structured data:
For database dumps, records, lists, etc. Use DSV (delimiter separated values).

Parameter passing:
RFC822, JSON
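
For what the last two categories might look like in practice, here is a rough Python sketch using only the standard library. The record fields, the `|` delimiter, and the `to_dsv`/`from_dsv` helper names are all made up for the example; the post does not prescribe any of them.

```python
import csv
import io
import json

# Hypothetical records, e.g. rows from a database dump.
records = [
    {"id": "1", "name": "Chuck", "city": "Austin"},
    {"id": "2", "name": "Dana", "city": "Oslo"},
]

def to_dsv(rows, delimiter="|"):
    """Write rows as delimiter-separated values with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "name", "city"],
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def from_dsv(text, delimiter="|"):
    """Read the DSV text back into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

dsv_text = to_dsv(records)
print(dsv_text)

# Parameter passing: a small JSON object instead of an XML envelope.
params = json.dumps({"action": "lookup", "ids": [1, 2]})
print(params)
```

Note that DSV carries no type information, so every value comes back as a string, while the JSON parameters keep their numbers and lists.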

Frank Sommers

Posts: 2642
Nickname: fsommers
Registered: Jan, 2002

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 21, 2007 8:04 PM
My perspective on XML changed perhaps in the opposite direction to that of others who commented before. I used to really dislike XML, thinking it was just a terribly verbose way to define data. A couple of things happened that made me change my mind, and I now think that XML is actually pretty useful for both data storage and data interchange.

The first is the emergence of excellent APIs in almost every language I know of. Recently, for instance, I was looking into ActionScript (EcmaScript, really), and it supports E4X, which is a nice way to slice and dice XML data, perhaps convert that XML into objects and back. In Java, there are so many outstanding APIs for XML that working with XML documents is not the pain it used to be.

Another reason I changed my mind is that processing XML is actually quite fast now, thanks in part to the API implementations, but also to the fact that computers are just faster in general, networks are faster, and storage is plentiful.

But the main reason I tend to think of XML in milder ways is because the alternatives for certain kinds of uses are not very attractive.

One example is data persistence. Having now worked with a variety of application domains, I have to say that lots of domains deal with truly hierarchical data, and people tend to shoehorn that sort of data into relational schemas, perhaps because whether to use a relational database for persistence is not a question many ask. O/R mapping tools have alleviated the need to think about that shoehorning too much, since they can pretty painlessly project hierarchical data in a relational form, and vice versa.

For some kinds of databases, however, it might actually be better to think of XML as a hierarchical data projection, and bypass O/R mapping (either by generating XML directly from a relational database via XQuery, or by using XML-to-object serialization, as provided, for instance, by the excellent XStream library).
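
To give a feel for XML-to-object serialization of hierarchical data, here is a minimal hand-written Python sketch. It is not XStream and does not imitate its API; the `Order` and `Item` classes, the element names, and both helper functions are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

@dataclass
class Item:
    sku: str
    quantity: int

@dataclass
class Order:
    customer: str
    items: List[Item] = field(default_factory=list)

def order_to_xml(order):
    """Project the object hierarchy directly onto nested XML elements."""
    root = ET.Element("order")
    ET.SubElement(root, "customer").text = order.customer
    items = ET.SubElement(root, "items")
    for item in order.items:
        node = ET.SubElement(items, "item")
        ET.SubElement(node, "sku").text = item.sku
        ET.SubElement(node, "quantity").text = str(item.quantity)
    return ET.tostring(root, encoding="unicode")

def order_from_xml(text):
    """Rebuild the object hierarchy from the XML produced above."""
    root = ET.fromstring(text)
    items = [Item(node.findtext("sku"), int(node.findtext("quantity")))
             for node in root.find("items")]
    return Order(root.findtext("customer"), items)

order = Order("Chuck", [Item("A-1", 2), Item("B-7", 1)])
xml_text = order_to_xml(order)
print(xml_text)
print(order_from_xml(xml_text) == order)
```

The nesting of the XML mirrors the nesting of the objects one to one, which is why no mapping layer is needed for data that really is a tree.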

And even XML file storage for persistence may be an option in some cases. I recently spoke with some folks who work with XQuery a lot, and learned that working with several-gigabyte XML documents has become feasible with some XQuery implementations, and that it's even faster in some cases than generating equivalent relational queries, because the XML can be indexed, too.

Also, for configuration, what scares me about formats that rely on indentation is that one day someone opens up such files in Notepad and thereby removes all those nice indents, effectively destroying the data. XML is a bit more durable in that regard.
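
The durability claim about indentation can be checked with a short Python sketch: strip every leading space from an XML document and compare the parsed structure. The sample configuration and the `shape` helper are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical, nicely indented configuration file.
indented = """<config>
  <server>
    <host>example.org</host>
    <port>8080</port>
  </server>
</config>"""

# Simulate an editor that throws away all leading whitespace.
flattened = "\n".join(line.lstrip() for line in indented.splitlines())

def shape(elem):
    """Reduce an element to (tag, trimmed text, child shapes), ignoring layout whitespace."""
    return (elem.tag, (elem.text or "").strip(), [shape(child) for child in elem])

same = shape(ET.fromstring(indented)) == shape(ET.fromstring(flattened))
print(same)  # True
```

The nesting survives because it is carried by the tags rather than by the indentation; in an indentation-based format the same edit would change or destroy the hierarchy.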

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 22, 2007 6:17 AM
> > > > XML is for documents. Why everybody wants to stuff all
> > > > data into text documents nowadays is beyond my
> > > > comprehension.
> > >
> > > http://en.wikipedia.org/wiki/TJ-2
> > >
> > > For those who really think XML is spanking new technology.
> > > And as my Pappy used to tell me: "just because everybody
> > > you know is jumping off a bridge, doesn't mean you
> > > should too".
> >
> > On one hand, this is kind of like saying don't breathe
> > because the air is polluted.
>
> More along the line: a turnip makes a lousy version of a
> carburetor.

That really doesn't apply to this situation. If you are running a business and want to buy and sell things electronically you are going to be using XML. It doesn't matter if it's the worst choice. If you want to do business, you need to use it.

Now if you are saying that XML is a bad choice for configuration, DSLs, etc., I agree completely.

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 22, 2007 6:18 AM
> One example is data persistence. Having now worked with a
> variety of application domains, I have to say that lots of
> domains deal with truly hierarchical data, and people tend
> to shoehorn that sort of data into relational schemas,
> perhaps because whether to use a relational database for
> persistence is not a question many ask.

Most data in the world is not, upon analysis, hierarchical. This is the root of the mass stupidity that took hold of the world around 2000.

ID/IDREF was the (first?) attempt to accept this fact. Read "If you liked SQL, you'll love XQuery" at dbazine
http://www.dbazine.com/ofinterest/oi-articles/pascal19

Dr. Codd devised the RM just because IMS (the XML of its day; and still around) was such a bad deal.

Not to make a flame war; but folks need to think about this stuff, not just jump on some bandwagon. We are supposed to be smarter than the masses, which means not regressing when the masses think we should.
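
To show what the ID/IDREF workaround amounts to, here is a small Python sketch in which a book that has two authors cannot sit under a single parent and so points at its authors by identifier. The document, the attribute names `id` and `authors`, and the helper are assumptions for the example; `xml.etree.ElementTree` treats them as plain attributes and does no DTD-based ID/IDREF checking, so the lookup table is built by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: books reference authors instead of nesting under them.
doc = ET.fromstring("""
<library>
  <author id="a1"><name>Ann</name></author>
  <author id="a2"><name>Bob</name></author>
  <book title="Joint Work" authors="a1 a2"/>
  <book title="Solo Work" authors="a1"/>
</library>
""")

# Build the id -> name index that the tree structure does not provide.
authors_by_id = {a.get("id"): a.findtext("name") for a in doc.iter("author")}

def authors_of(book):
    """Resolve the space-separated references on a book element."""
    return [authors_by_id[ref] for ref in book.get("authors").split()]

books = {b.get("title"): authors_of(b) for b in doc.iter("book")}
print(books)
```

The relationships here live in the references, not in the nesting, so the tree is doing little more than holding two flat lists.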

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 22, 2007 6:22 AM
> What irks me is that the applications for which it is most
> useful are mostly ignored (thankfully that is changing). Most
> existing text documents are in proprietary binary (Word)
> formats or PDF or PostScript or TeX. A standardized XML
> format would be nicer (I wish ODF were good enough to use
> for print). Ditto a format for sheet music: MusicXML.
> Vector graphics: SVG. Technical illustrations and
> drawings: Visio? There are dozens of document formats on
> a typical hard drive and many of those could probably
> benefit from being saved in well-formed XML.

I've been trying to get comfortable with DocBook, which is now XML-based, but I haven't found a good WYSIWYG editor yet.

Achilleas Margaritis

Posts: 674
Nickname: achilleas
Registered: Feb, 2005

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 23, 2007 5:46 AM
> > One example is data persistence. Having now worked with a
> > variety of application domains, I have to say that lots of
> > domains deal with truly hierarchical data, and people tend
> > to shoehorn that sort of data into relational schemas,
> > perhaps because whether to use a relational database for
> > persistence is not a question many ask.
>
> Most data in the world is not, upon analysis,
> hierarchical. This is the root of the mass stupidity that
> took hold of the world around 2000.
>
> ID/IDREF was the (first?) attempt to accept this fact.
> Read "If you liked SQL, you'll love XQuery" at dbazine
> http://www.dbazine.com/ofinterest/oi-articles/pascal19
>
> Dr. Codd devised the RM just because IMS (the XML of its
> day; and still around) was such a bad deal.
>
> Not to make a flame war; but folks need to think about
> this stuff, not just jump on some bandwagon. We are
> supposed to be smarter than the masses, which means not
> regressing when the masses think we should.

Actually, data are neither hierarchical, nor relational. Data are functional, i.e. relations between data are defined by a mathematical function.

Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 23, 2007 1:51 PM
/* It is good for encoding data that is complex enough to require a context-free grammar to understand, but which is not so often hand-coded that it's worth applying compiler technology such as lex and yacc.
*/

OK, so that was a few posts back. Anyhow, XML seems to be most useful in keeping people from coming up with file formats without considering trouble that may crop up down the road (for instance, basing a format on XML makes it much easier to add versioning information next release when you realize you overlooked that part).

The more XML reduces the number of hand-rolled file formats out there, the better. To the extent that it's used in applications like Ant's build files, well, I'm not impressed with those applications.
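
As a rough sketch of the versioning point, here is how a reader might add a version marker in a later release without breaking old files. The `settings` format, its `version` attribute, and the `read_settings` function are all hypothetical.

```python
import xml.etree.ElementTree as ET

def read_settings(xml_text):
    """Return (version, name), treating files without a version attribute as version 1."""
    root = ET.fromstring(xml_text)
    version = int(root.get("version", "1"))
    if version == 1:
        # Original layout: <name> directly under the root.
        name = root.findtext("name")
    elif version == 2:
        # Later layout: <name> moved under a new <profile> element.
        name = root.findtext("profile/name")
    else:
        raise ValueError(f"unsupported version: {version}")
    return version, name

# A file written before anyone thought about versioning, and one written after.
old_file = "<settings><name>demo</name></settings>"
new_file = '<settings version="2"><profile><name>demo</name></profile></settings>'

print(read_settings(old_file))
print(read_settings(new_file))
```

Because attributes are optional and unknown ones are ignored by older parsers' tree builders, the marker could be bolted on afterwards; a fixed-column or positional format would not have had room for it.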

Charles McKnight

Posts: 3
Nickname: cmcknight
Registered: Dec, 2005

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 23, 2007 2:38 PM
Given the somewhat verbose nature of XML and that any significantly large XML document is not human-readable, I'd venture that it might be good for storing data requiring semantics that will be transformed into another format for consumption. Using it as a de facto standard for transmitting relatively small amounts of data is just wasting bandwidth and imposing additional overhead to marshal/unmarshal the data, IMHO. Unless I'm releasing the data for consumption by a number of different consumers, there are more efficient ways to transmit data. One place that seems to be a good fit is for storing platform-independent models that will be transformed into platform-specific models in MDA. Unfortunately, most programmers seem to think that this sort of code generation is evil and only want to create the method signature....

Sigh.....
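
For a sense of the overhead on small payloads, the Python sketch below encodes one made-up record as XML and as compact JSON and compares the byte counts. The record and its field names are assumptions, and a single example says nothing about larger documents or compressed transport.

```python
import json
import xml.etree.ElementTree as ET

# A hypothetical small record of the kind sent between two services.
record = {"first-name": "Chuck", "last-name": "Example", "age": "42"}

# XML encoding: one child element per field.
root = ET.Element("person")
for key, value in record.items():
    ET.SubElement(root, key).text = value
xml_bytes = ET.tostring(root)

# JSON encoding of the same fields, without optional whitespace.
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

print(len(xml_bytes), len(json_bytes))
print(len(xml_bytes) > len(json_bytes))
```

Most of the difference comes from each field name appearing twice in XML, once in the opening tag and once in the closing tag.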

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Brett McLaughlin: What is XML Really Good For? Posted: Feb 24, 2007 3:14 PM
> Actually, data are neither hierarchical, nor relational.
> Data are functional, i.e. relations between data are
> defined by a mathematical function.

Data models (and XML/IMS has none; CODASYL sorta does) are independent of the data. The model, if there is one, specifies how to represent the relations.

hierarchical: one parent to a child, hardcoded, no defined model (algebra or calculus or similar)
CODASYL/network: many parents to a child, hardcoded, no defined model (algebra or calculus or similar)
RM/SQL: any number of connections, none hardcoded, built from algebra and calculus

The *MLs are built to do markup, not to represent data. That folks have seen fit to use them that way is only a reflection on them, not on us relationalists who complain.
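
The "any number of connections, none hardcoded" row of the comparison above can be sketched with Python's built-in `sqlite3` module. The three tables and their contents are invented for the example; the point is only that the same rows answer questions in either direction without a parent/child path being fixed in advance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book   (id INTEGER PRIMARY KEY, title TEXT);
-- The relationship is just another table, not a nesting decision.
CREATE TABLE wrote  (author_id INTEGER REFERENCES author(id),
                     book_id   INTEGER REFERENCES book(id));
INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO book   VALUES (10, 'Joint Work'), (11, 'Solo Work');
INSERT INTO wrote  VALUES (1, 10), (2, 10), (1, 11);
""")

# Navigate from author to books...
books_by_ann = [row[0] for row in conn.execute(
    "SELECT b.title FROM book b "
    "JOIN wrote w ON w.book_id = b.id "
    "JOIN author a ON a.id = w.author_id "
    "WHERE a.name = ? ORDER BY b.title", ("Ann",))]

# ...and from book to authors, using the same data.
authors_of_joint = [row[0] for row in conn.execute(
    "SELECT a.name FROM author a "
    "JOIN wrote w ON w.author_id = a.id "
    "JOIN book b ON b.id = w.book_id "
    "WHERE b.title = ? ORDER BY a.name", ("Joint Work",))]

print(books_by_ann)
print(authors_of_joint)
```

In a strictly hierarchical layout one of these two questions is a walk down the tree and the other requires scanning or duplicating data, which is the hardcoding the post complains about.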


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use