This post originated from an RSS feed registered with Python Buzz
by Andrew Dalke.
Original Post: Connection negotiation
Feed Title: Andrew Dalke's writings
Feed URL: http://www.dalkescientific.com/writings/diary/diary-rss.xml
Feed Description: Writings from the software side of bioinformatics and chemical informatics, with a heaping of Python thrown in for good measure.
In other words, the server implements content-negotiation
(also called "conneg" for short).
This is something I advocated in DAS/2. The summary of my experience with
conneg is in the DAS/2 mail archive.
I don't closely track what's going on in bioinformatics these days so
Eric's comment is the first real public example I've seen of conneg in
use in the life sciences. Affymetrix uses it for some internal work,
and while the DAS/2 spec talks about it I don't know of anyone
publicly supporting it.
I'm curious to know about people's experiences with conneg. It hasn't
had wide uptake in the general software world, so I haven't
learned much about the practicalities of using it in real systems.
But I don't have a comment system so, hmm, well, email me about it or send
me a link to a page describing your experiences.
Getting a specific representation
One of the longer term problems in conneg is that it only dispatches on
the requested format, and not the meaning. If you request an
"image/png" for a chemical compound do you get the 2D or 3D depiction
of that compound? But those are theoretical problems that are best
solved only after running into them in practice, I think.
A more immediate problem I had with conneg was trying to link to a
specific format for a resource. For example, if you want to link to
specifically the RDF version in HTML you have to do something like
Take a look at the <a href="wherever" type="application/rdf+xml">RDF</a>
and in email you want to say
Why are the oxygens colored yellow in the PNG version at http://example.com/image ?
(I haven't tried that to see if the 'type' attribute actually works
like this in modern browsers. I just don't have real world experience
in using conneg.)
Eric solved that by using several redirections, with a different URL
for each final representation. That is:
The first request to 'http://purl.uniprot.org/uniprot/P12345'
does a 303 "See Other" redirect to ...
'http://beta.uniprot.org/?query=purl:uniprot/P12345', which
understands the "Accept" header and does a 302 "Found" redirect
to either of:
'http://beta.uniprot.org/uniprot/P12345' for html, or
'http://beta.uniprot.org/uniprot/P12345.rdf' for RDF
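A minimal sketch of that second hop in Python might look like the following. The `REPRESENTATIONS` table and the `choose_redirect` function are hypothetical names made up for illustration, not UniProt's actual code. The sketch also assumes the simplest possible policy: take the first listed media type the server recognizes, ignore any "q" values, and fall back to HTML.

```python
# Hypothetical table mapping media types to the final URLs named above.
REPRESENTATIONS = {
    "text/html": "http://beta.uniprot.org/uniprot/P12345",
    "application/rdf+xml": "http://beta.uniprot.org/uniprot/P12345.rdf",
}
DEFAULT_TYPE = "text/html"  # assumption: HTML is the fallback


def choose_redirect(accept_header):
    """Return (status, location) for the Accept-aware redirect.

    Simplification: picks the first recognized media type in the
    header and ignores quality values entirely.
    """
    requested = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    for media_type in requested:
        if media_type in REPRESENTATIONS:
            return 302, REPRESENTATIONS[media_type]
    # Nothing recognized (including "*/*"): send the default.
    return 302, REPRESENTATIONS[DEFAULT_TYPE]
```

Because each representation ends up at its own URL, the final address can be pasted into a web page or an email without depending on the "Accept" header at all.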
I'm beginning to think that multiple final URLs (with an intermediate
redirect) is the right solution for this. Though I would like some
way for user agents to get from each final representation to the
other, or to the "main" URL. For example, I can't "Accept:
application/rdf+xml" on the HTML page and be redirected to the other.
Shouldn't there be an HTTP header for that?
The conneg spec has a section describing "alternatives", which looks like
It's meant so the server can inform the client (the "user agent") that
alternate forms are available, and let the client decide which is the
best form. It might be nice for the UniProt server to support this as
well, but that's definitely something to hold off doing until there
are clients that might actually use that data.
This is where the "chicken and egg" meets YAGNI. And I'm
on the YAGNI side of the balance. Except abstractly.
Quality
For me, the hardest thing in conneg was supporting the quality factor
in the request. A quality of "q=1.0" is best and "q=0.0" means "do
not want." If two content types are requested then the one with the
highest quality (after applying the scoring algorithm) wins. As long
as the best is not 0.
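To make the quality rule concrete, here is a rough Python sketch. It is a simplification under stated assumptions, not the full scoring algorithm from the spec: it ignores media type parameters other than "q", ignores any server-side quality, and treats a malformed "q" as 0. The function names (`parse_accept`, `specificity`, `best_match`) are made up for this example.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # the default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: malformed q means "do not want"
        entries.append((media_range, q))
    return entries


def specificity(media_range, media_type):
    """3 for an exact match, 2 for type/*, 1 for */*, 0 for no match."""
    if media_range == media_type:
        return 3
    if media_range == media_type.split("/")[0] + "/*":
        return 2
    if media_range == "*/*":
        return 1
    return 0


def best_match(available, header):
    """Pick the available type with the highest q, or None if all are 0.

    The most specific matching range decides the q for each type.
    Ties go to the type listed first in `available`.
    """
    entries = parse_accept(header)
    best_type, best_q = None, 0.0
    for media_type in available:
        spec, q = max(
            ((specificity(r, media_type), q) for r, q in entries),
            default=(0, 0.0),
        )
        if spec > 0 and q > best_q:
            best_type, best_q = media_type, q
    return best_type
```

With `available = ["text/html", "application/rdf+xml"]`, the header `text/html;q=0.5, application/rdf+xml` selects RDF even though HTML is listed first, and `text/html;q=0` on its own selects nothing.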
It looks like UniProt ignores the "q" field. I think that's reasonable for
now. It's annoying to get right and currently no one uses it.
Most likely the internal code is using the first format given rather
than the best format given. So remember boys and girls, always send
your preferred format first, even if the spec says there shouldn't be a
difference.
Hmm, and it does look like they do a substring test on the Accept
string. This should not (probably, debatably) return RDF.
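As a hypothetical illustration of how a substring test can go wrong (the header below is made up for this sketch, and neither function is UniProt's code), compare a naive check against one that actually reads the "q" value:

```python
RDF = "application/rdf+xml"


def wants_rdf_substring(accept_header):
    """Naive check: is the media type mentioned anywhere in the header?"""
    return RDF in accept_header


def wants_rdf_parsed(accept_header):
    """Check the RDF entry itself and honor its q value."""
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != RDF:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: malformed q means "do not want"
        return q > 0
    return False


# The client explicitly says it does NOT want RDF.
header = "text/html, application/rdf+xml;q=0"
print(wants_rdf_substring(header))  # True
print(wants_rdf_parsed(header))     # False
```

The substring version says yes to a client that asked for RDF with a quality of 0, which is the opposite of what the client meant.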
All of these problems I just listed? They should be low on the list
of things to worry about when doing conneg. It's a bunch of details that
aren't needed for the first pass at getting conneg working. They are
only needed once there are multiple format variants, and clients which
expect to get the different format types, and which can handle
multiple formats.
Accept: */*
Eric also wrote:
The main reason why I don't default to the machine-readable
representation (no doubt that would be useful for people writing
semweb applications) is that the large majority of resources does not
have a machine readable representation, and web pages happen to be the
greatest common denominator.
Perfectly cromulent justification. Especially since semantic web apps
can be expected to send the correct content type, while browsers
(Safari; grrr!) do things like send "Accept: */*".