I completely agree with Tim O'Reilly's assessment of the privacy concerns around Google GMail. Mark Nottingham has made similar observations. It's a red herring that can be dispelled by explaining how email works. What I wanted to pick up on was an almost casual observation Tim made on data migration:
"The big question to me isn't privacy, or control over software APIs, it's who will own the data. What's critical is that gmail makes a commitment to data migration capabilities, so the service isn't a one way door to the future." - Tim O'Reilly
The free flow of data across applications just isn't happening today. Given the way we work and live, the tools are way behind our lifestyles, stuck in the office-bound, paper-pushing metaphor. It is essential, and I think inevitable, that the way we manage our information changes. The real subtext of APIs vs. Services, Objects vs. Documents and the current GMail debate is accessibility of information.
Outside the consumer space of GMail, I work in integrating systems using web technology, XML and SOA. "Systems Integration" is not really systems integration; it's information integration. When you move, reroute and repurpose information, a business can benefit from either a new service or an existing service at lowered cost with heightened quality. That's what it's all about. It's expensive, but becoming less so, in part due to the commoditization of software through open source and standards. The expenses are twofold: first in getting data flowing from one system into another, and second in keeping that flow moving. Most business systems are not designed with data accessibility in mind. Yes, we've had XML for years, but XML's real impact so far has been in the protocol/plumbing space. Its true value for data has not been realized - all that XML lying around is, as yet, untapped potential.
I'm telling you this because I think what's happened in business systems over the last 5 years can inform what's going to happen in the consumer space. To appreciate the consumer application space, replace "systems" with "applications" and "business" with "consumer" in the above paragraph and we might see that things are going to change. The same commoditization of data formats will have consequences for current consumer software business models, just as the commoditization of plumbing and networking has had for business systems.
Tim again:
"The ability to search through my email with the effectiveness that has made Google the benchmark for search. How many times have people asked, "When can I have Google to search my hard disk?" That's a hard problem, as long as it's just your disk, on your isolated machine. But it's solvable once Google has lots and lots of structured data to work with, and can build algorithms to determine patterns in that data. Gmail is Google's brilliant solution to that problem: don't search the desktop, move the desktop application to a larger, searchable space where the metadata can be collected and made explicit, as it is on the web."
I don't know - this is exactly what stops it looking like innovation to me. I agree with Jeremy Zawodny that this is incremental improvement. Google are ultimately loading your email into a database, as they've done with the Web. It's a centralized model, not an edge model. To me that's highly dissonant with the network OS vision, and as such it looks perhaps precarious for Google given the cost-center implications of their supercomputer cluster - on what basis do they compete with peered search? And it's just email - personally, email is a fraction of the information I need coherency for. We need search across the whole protocol and application information space. I believe anyone who can distribute search to the edges instead of centralizing it will win.
Where Google is showing huge innovation is in technology management - by the sounds of things the ratio of admins to servers over there is impressive. You really, really want these guys running your data centres. But this is much like pointing out that where Amazon innovated was not technology but logistics management combined with a new sales channel. A geek like myself can only get so excited about optimizing the data centre ;)
Tim mentions Chandler with regard to the desktop and sounds almost disappointed. But open source will be a key driver in making data liquid and application independent, because open source is where the momentum for change will be maintained. What needs to happen here as much as anything is to get the open source community, especially the Java community given the portable nature of Java, off its collective server-side networking backside and looking at users and their needs. We don't need any more web frameworks. Projects like Chandler, and less directly Eclipse, are the start of that.
Like using open source for middleware, it will only be advantageous for some prime movers - if everyone does it then the economics change drastically, and companies whose model is traditional product and not services will start to feel the pain of lower and lower margins until they rethink what it is they do. There's no incentive to do this in the commercial world, except to gain a temporary advantage over the competition or make some soothing marketing noises to consumers. To really do it, to really make the strategic decision that the application franchise is secondary to the users' information needs and execute on that vision, will require a ground-up rethink, new technology, new business models, new partners, the lot. The industry just is not going to do this of its own accord. Data lock-in is a cash cow.
If Microsoft ever moves against Google it may be in part because moving your data from the desktop to a Google cluster has implications for the Windows and Office franchise. But all we've really done is switch from one centralized model to another. Further innovation, I believe, will come from open source. Think of it - just when the middleware industry starts to get its head around the impact of open source, along comes a new wave of demand for commoditization in the consumer space that goes straight to the bottom line. The main difference is that consumers don't quite realize how bad a deal they have - but they'll figure it out, just like business did. When it happens, consumers aren't going to care about your profit model, and might even delight in it given how much they've been charged all those years for buggy, expensive software.
As for all that space: a GB of disk space costs roughly a dollar (and who knows how much less at the bulk volumes Google will buy at). And 1GB - when you have it, it isn't enough. But again, this makes sense given Google's management innovations - they could probably have gone to 10GB.
And then there's the structure of the data itself. If you want your data to travel into the future with you, insist on RDF or Topic Map compatibility. Those formats are independent of the highly transient application formats and the less transient protocol formats. Yes, they're associated with all that Semantic Web handwringing about ontologies and the like, but in that event a transformation or a script is as good as, if not probably better than, agonizing about the perfect information model...
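To make "a transformation or a script" a little more concrete, here is a minimal sketch in Python of what such a script might look like: it reads the headers of one email message and writes them out as RDF statements in N-Triples syntax. The vocabulary under http://example.org/mail# and the function names are invented for illustration and are not a published schema. Using the Message-ID as a mid: URI is likewise just one plausible choice, and the sketch does no percent-encoding of unusual characters in it.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical vocabulary, invented for this sketch; not a published schema.
VOCAB = "http://example.org/mail#"


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def message_to_triples(raw):
    """Turn the headers of one raw email message into N-Triples lines."""
    msg = message_from_string(raw)
    msg_id = (msg.get("Message-ID") or "").strip().strip("<>")
    if not msg_id:
        raise ValueError("message has no Message-ID to use as an identifier")
    # Assumption: identify the message by a mid: URI built from its Message-ID.
    subject = f"<mid:{msg_id}>"
    triples = []
    # Addresses become mailto: resources so they can link to other data.
    for header, prop in (("From", "from"), ("To", "to")):
        _, addr = parseaddr(msg.get(header, ""))
        if addr:
            triples.append(f"{subject} <{VOCAB}{prop}> <mailto:{addr}> .")
    # Free-text headers become plain literals.
    for header, prop in (("Subject", "subject"), ("Date", "date")):
        value = msg.get(header)
        if value:
            triples.append(
                f'{subject} <{VOCAB}{prop}> "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    sample = (
        "Message-ID: <1234@example.org>\n"
        "From: Alice <alice@example.org>\n"
        "To: bob@example.com\n"
        "Subject: Data migration\n"
        "\n"
        "Body text.\n"
    )
    print("\n".join(message_to_triples(sample)))
```

The point is less the particular vocabulary than that the output is independent of any mail client: once the statements exist, mapping them to a better model later is another small script rather than a migration project.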