This post originated from an RSS feed registered with Python Buzz
by Andrew Dalke.
Original Post: Ruminations about DAS2
Feed Title: Andrew Dalke's writings
Feed URL: http://www.dalkescientific.com/writings/diary/diary-rss.xml
Feed Description: Writings from the software side of bioinformatics and chemical informatics, with a heaping of Python thrown in for good measure.
I visited Gregg Helt at Affymetrix last week. He's in their
Emeryville office, which is only a few blocks from where he was when I
last met him at Neomorphic. Must have been late August 1999 since it
was just after the OiB conference in San Jose. I think that was also
the last time I drove out to the Bay Area instead of flying.
We chatted about a few things. The big one was the DAS2 proposal,
which got a 125 in the review. I'm told that's good. This is the
first time I've been on a grant since grad school, and I wasn't that
involved then. I enjoyed the conversation and look forward to working
with him in Lincoln in a few months if this thing really does go
through. Should hear about that soon.
Gregg mentioned playing around a bit with the content negotiation idea
I proposed during some of the DAS2 RFC discussions. His point
was that he controls both server and client and they can collude to
return data in a more appropriate form, e.g., in a form which is easier
for the client to process or which reduces the amount of data sent
over the wire. He's pretty happy with the result.
I'm glad to hear that. I haven't had time to experiment with it
myself so the benefits of conneg were mostly theoretical. I don't
think I had considered Gregg's use case -- I thought about
switching between existing formats, and not his idea of letting
a client/server combination use a new format.
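
A minimal sketch of how that kind of negotiation could look on the
server side. Nothing here comes from the DAS2 proposal: the media type
"application/x-example-compact" and the names parse_accept and
choose_format are made up for illustration, and "text/xml" stands in
for the guaranteed baseline format. Falling back to the baseline when
nothing matches is also an assumption, chosen so that a client which
knows nothing about the private format still gets usable data.

```python
# Hypothetical media types: "text/xml" stands in for the guaranteed
# baseline, and the second one for a private format that a colluding
# client/server pair has agreed on.
BASELINE = "text/xml"
AVAILABLE = {BASELINE, "application/x-example-compact"}


def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = part.strip().split(";")
        media_type = fields[0].strip()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1.0 when the client does not give one
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(prefs, key=lambda pref: -pref[1])


def choose_format(accept_header, available=AVAILABLE, default=BASELINE):
    """Pick the client's most preferred format that the server supports."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
    # Simplification: fall back to the baseline instead of answering 406
    return default
```

A client that knows the private format would send something like
"Accept: application/x-example-compact, text/xml;q=0.5", while every
other client ends up with the baseline XML.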
I think that idea will help encourage tool providers to support DAS on
the server. DAS ends up providing two things: a way to get, publish,
and ask questions about annotations, and a guaranteed minimal base of
how that data is presented, likely in XML. If a company has better
ways of exchanging the data they can still support it on top of the
existing DAS API. If we do it right, they should even be able to
include new types of queries. It also makes DAS more future-proof.
Gregg also suggested that the client and server could collude in
making the searches. The DAS/1 API returns all annotation data in a
query range, so if you change the range just slightly you end up
getting all the overlap data again. I want to change the result to
just return URLs for annotations which overlap the range. This calls
for extra trips to the server to get the actual data, but the client
can cache previous results, and persistent connections in HTTP/1.1
help quite a bit with performance of making multiple requests.
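
As a rough sketch of the client side of that scheme, assuming a search
returns a list of annotation URLs: the client keeps a dictionary keyed
by URL and only goes back to the server for URLs it has not seen. The
class name AnnotationCache and the fetch callable are hypothetical;
fetch stands in for whatever actually performs the HTTP GET.

```python
class AnnotationCache:
    """Client-side cache of annotation data, keyed by annotation URL."""

    def __init__(self, fetch):
        # fetch is any callable mapping a URL to its annotation data,
        # e.g. a thin wrapper around an HTTP GET (not shown here)
        self._fetch = fetch
        self._cache = {}

    def get_many(self, urls):
        """Return the data for each URL, fetching only the unseen ones."""
        results = []
        for url in urls:
            if url not in self._cache:
                self._cache[url] = self._fetch(url)
            results.append(self._cache[url])
        return results
```

If two searches over slightly different ranges return mostly the same
URLs, the second call to get_many only fetches the new ones.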
Gregg correctly points out the client can remember which ranges it
already asked about. When it asks the server for data in a new range,
it could include that range information. The server then omits the
data it knows the client already has.
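
One way to picture Gregg's variant, as a sketch only: the server
filters out any annotation that overlaps a range the client says it
has already asked about, since that annotation was returned by the
earlier query. The function names, the (name, start, end) tuples, and
the half-open coordinates are all assumptions made for the example,
not part of DAS/1 or the DAS2 proposal.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def search(features, query, known_ranges=()):
    """Return names of features in the query range the client lacks.

    features     -- iterable of (name, start, end) tuples
    query        -- (start, end) of the range being asked about
    known_ranges -- (start, end) ranges the client already queried
    """
    q_start, q_end = query
    hits = []
    for name, start, end in features:
        if not overlaps(start, end, q_start, q_end):
            continue  # outside the requested range
        if any(overlaps(start, end, k_start, k_end)
               for k_start, k_end in known_ranges):
            continue  # an earlier query already returned this feature
        hits.append(name)
    return hits
```

With features at 0-100, 150-250 and 300-400, a first query for 0-200
returns the first two. A second query for 50-350 that lists 0-200 as
already known returns only the third.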
I think there are still advantages to having searches return a list of
URLs instead of the raw data. My approach is easier for naive clients,
since they don't need to track all the ranges. (But we could provide
libraries or example code to help out.) My approach is also more
cache friendly, which may help if there are many clients using the
server -- stick a Squid cache in front of the web server and let it
handle repeated requests for the same annotation instead of going
through to the DAS layer, which may require starting a CGI script and
making some SQL calls to the back-end database.
I also talked with Gregg about using some of the discussions related
to RSS/Atom/Pie/Echo/whatever to help guide us when fleshing out the
DAS2 spec. Here's part of a followup email I sent to him on the
topic.
Atom is a proposed successor to RSS. Quoting from
http://bitworking.org/rfc/draft-gregorio-07.html
> AtomAPI is an application level protocol for publishing, and
> editing web resources. AtomAPI unifies many disparate
> publishing mechanisms into a single, simple, extensible protocol.
> The protocol at its core is the HTTP transport of an XML payload.
It has several other names, including Echo and Pie, or maybe
Atom is a specification of the Pie API. It's confusing.
http://webservices.xml.com/pub/a/ws/2003/08/05/salz.html
even says that it might not be Atom, because of trademark concerns.
I think Atom and the concepts discussed on the wiki at
http://www.intertwingly.net/wiki/pie/RestEchoApiDiscuss
have some bearing on the discussions we had for DAS2. Many
of the comments looked familiar :)
Salz's article also gives an interesting critique of Atom vs.
WebDAV and REST vs. SOAP. He's one of the Python SOAP developers
so he is somewhat biased in that last comparison.
On a related note, Mark Pilgrim, one of the Atom authors,
rewrote MS's SOAP interface into a REST approach, at
http://diveintomark.org/archives/2003/09/08/msweb-rest
and has some comments comparing the two approaches.