After a long absence, I'm back. With some explanations about why I've been gone so long...
It has been way too long since my last entry. As those of you who have
been watching Sun know, I've been living
in interesting times. But churn at the corporate level is neither new nor,
I find, all that interesting. The real reason that I haven't been writing
is that the work I've been doing has turned out to be pretty interesting,
so rather than write I've been thinking about it.
As some of you know, my group and I have been working on a distributed
system that could be used to collect and analyze medical sensor
data. Assume for a moment that sensor technology gets good enough that
we are all wearing some number of medical sensors-- keeping track of our
heart rate and blood oxygen, maybe EKG, temperature, and the
like. Systems like this are being developed, and some simple ones are
already being sold for runners and other athletes.
Now suppose everyone is wearing these-- and I mean everyone. Say 300
million people in the United States. Something like this could lower the
cost of medical care, because your doctor wouldn't have to see you to
get basic triage information (and might even call you to tell you that
it was time to come in). Public health could be done in real time. And the
research data might turn medicine into a science.
But how could you store and analyze all of this information? That would
require a distributed system (to scale) that would need to be federated
(because that is the way doctors and medical systems work) that would
need to be highly secure, ensure the right kinds of privacy, and last
for a long time. And that is the kind of system we have been thinking
about.
The problems that need to be solved are hairy, to say the least. The
system, for example, would need to ensure the privacy of the information
gathered, but also allow medical professionals to have access in an
emergency. Researchers and public health officials also need access, but
in some way that anonymizes the information sufficiently to preserve
privacy but not so much as to cloak important trends.
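To make that tension concrete, here is a toy sketch in Python. It is an illustration only, not part of the system described here; the function name, the ten-year age buckets, and the minimum group size are all made up for the example. The idea is to report only per-group averages, and to drop any group with so few people that a single individual could be picked out.

```python
from collections import defaultdict


def anonymized_heart_rate_summary(readings, bucket_years=10, min_group_size=3):
    """Average heart rate per age bucket, suppressing small groups.

    readings: iterable of (age_in_years, heart_rate_bpm) pairs.
    All parameter names and defaults are hypothetical.
    """
    groups = defaultdict(list)
    for age, heart_rate in readings:
        # Generalize the exact age to the start of its bucket (e.g. 47 -> 40).
        bucket_start = (age // bucket_years) * bucket_years
        groups[bucket_start].append(heart_rate)

    summary = {}
    for bucket_start, rates in sorted(groups.items()):
        # Drop buckets with too few people; they are too easy to re-identify.
        if len(rates) >= min_group_size:
            summary[bucket_start] = round(sum(rates) / len(rates), 1)
    return summary


readings = [(41, 62), (44, 70), (47, 75), (52, 80), (83, 58)]
print(anonymized_heart_rate_summary(readings))
```

With the default group size, only the forty-somethings are reported; the lone 83-year-old is hidden, but so is anything the data might say about the elderly. Setting min_group_size to 1 brings that trend back and gives up the privacy, which is exactly the trade-off.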
One of the most interesting of the problems is making the system last as
long as it needs to last. We decided that the overall system needs to
last as long as the patients whose data is being stored in the system;
this means a minimum of 70 to 80 years (and this doesn't count the
desire to keep the information around for research purposes). But this
means that the system needs to be able to change all of the pieces that
make up the system, and all of the information and objects need to be
able to move around (at least to a new machine) without breaking the
system. It also means that we need to design with the assumption that
everything (machines, operating systems, even programming languages)
will change. The versioning problem can't be ignored in this one.
We've made some progress already. Tim Blackman's deployment utilities came out of
this work (we are basing the current system on Jini, of all things). I'm
writing this on a plane taking me to JavaOne, where I'll be
talking about some of the early results. We hope to be sharing more in
the near future; we are working on everything from making classloading
in Java more coherent in the distributed case to environments for
Jini-style services to some interesting services themselves. And, of
course, on security, privacy, and auditing. Our hope is to publish a
lot, and release our code to the Jini community in open source form.
And I will also be talking about what we are doing (among other things)
here. I find the reactions of this crowd a good reality check for what I
am doing, so I look forward to hearing your opinions, feedback, and
Very interesting project you are working on. It reminds me of both the 'long now' project (http://www.longnow.org/) and Neuromancer :)
Anyway, you say:
"But how could you store and analyze all of this information?"
I was thinking, why not leave the data where it is generated (e.g. store it in the sensor) and only collect it when you need it? I know this seems like the world upside down, but besides introducing new and interesting problems it would solve some of the hairy problems you and your team are trying to deal with right now.
Besides, it would be closer to Gelernter's Mirrorworld vision: a real-time virtual representation of reality. Heck, it would even allow you to look into past versions of the Mirrorworld (please let me know when I'm drifting).
It won't work in the foreseeable future, but you're thinking about a loooong running project (from a human perspective), which changes the possibilities. (It may even be the case that simple sensors are full Jini services! Imagine that! But that's a different discussion...:)
You should blog more. You seem to approach things in a sideways manner - with a fresh perspective. You explain complex topics with amazing clarity. Blogging is work, but the rewards of your generosity can be large.
Jim - The case you described seems like it's just a domain-specific instance of 'event stream processing' with some good collaborative filters hooked to a traditional data warehouse (with historical archives) and a data mining engine.
It feels like this use case can be accomplished with off-the-shelf technologies. What am I missing?