Dare Obasanjo on attention.xml and collaborative filtering: "Once one knows how to calculate the relative importance of various information sources to a reader, it does make sense that the next step would be to leverage this information collaboratively. The only cloud I see on the horizon is that if anyone figures out how to do this right, it is unlikely that it will be made available as an open pool of data. The 'attention.xml' for each user would be demographic data that would be worth its weight in gold to advertisers."

Collaborative filtering allegedly only works once you have a critical mass of items of interest and users to cross-reference. I heard once that this needed to be in the low thousands to ensure reasonable precision. That was back in 2000, by which time people had figured out how to process large in-memory datacubes in close to real time (i.e. updates occurring between user sessions). That's on the server. What we're not doing is considering how filtering might work on the client. When more specific information about the user is available, it's possible to optimize these algorithms to work with much smaller data sets, and more generally to think about different algorithms or hybrid approaches. And it's probable the results can have higher relevance for the user.

Commercially, collaborative filtering has worked best at targeting mass goods at individuals, which is why it works well for Amazon. But the choice of algorithm varies with the nature of the data (a lot of this stuff tends to be fantastically sensitive to the data and to how the data is represented). Think how useless a Bayesian spam filter would be when aggregated across a 100,000-user data set on Bloglines. It could be much better to work against a couple of users you trust, plus some candidate data of your own to seed the algorithms.

"By the way, why does every interesting wide spanning web service idea eventually end up sounding like Hailstorm?"
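To make the small-data point concrete, here's a minimal sketch of user-based collaborative filtering run over a handful of trusted users rather than a server-side aggregate. All the names and ratings are invented for illustration; it scores unseen items by similarity-weighted ratings from the users you trust, which is the kind of thing a client could do locally.

```python
from math import sqrt

# Hypothetical data: a few trusted users, each mapping feed items to scores.
ratings = {
    "alice": {"item1": 5, "item2": 3, "item3": 4},
    "bob":   {"item1": 4, "item2": 2, "item4": 5},
    "me":    {"item1": 5, "item2": 3},
}

def cosine(a, b):
    """Cosine similarity over the items two users have both rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    na = sqrt(sum(a[i] ** 2 for i in shared))
    nb = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (na * nb)

def recommend(user, ratings):
    """Score items the user hasn't seen, weighted by how similar each rater is."""
    mine = ratings[user]
    scores, weights = {}, {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        if sim <= 0:
            continue
        for item, score in theirs.items():
            if item in mine:
                continue
            scores[item] = scores.get(item, 0.0) + sim * score
            weights[item] = weights.get(item, 0.0) + sim
    # Normalise by total similarity and rank best-first.
    return sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)

print(recommend("me", ratings))
```

Nothing here needs a million users or a datacube; with two or three trusted raters it already produces a ranking, which is the trade the paragraph above is gesturing at: less data, but data you chose.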
Probably the reason they all start to sound like Hailstorm is that they all work on the basis that the computation has to be done on the server against large aggregate data sets. One place, one owner. Cue the consequent privacy concerns. A few years ago, when asked how the trust problem could be solved, a senior executive from Egg bank had an immediate answer: "Branding". The extent to which people will trust your organisation with their information is largely based on their current perception of your organisation. That's not quite the same thing as branding, but you get the idea.

What do you do with all that information you're generating 24x7? How do you convert it into value? Today's answer is to sell it to the people who have something to sell or messages to tell. The money's not in whatever it is you're offering users to gather up the data in the first place (like search) - the money's in the side effects. And while figuring out how to turn the data into value for you, or for those who want to sell something to the users, the users must not think they're being sold out. Or they're gone. Something of a highwire act - and you only get to fall once.

It could be much more interesting to sell this technology directly to users for 5 dollars and let them run it on their phones against the data of their choice. To do that requires a certain amount of letting go of established ways of doing things, right through from client-server technology to business models based on TV and print media. The current situation is hopelessly dependent on those systems of buying and selling. The social networking phenomenon is interesting insofar as it attempts to join users to users, rather than users to services to advertisers. The next step is to get those lumbering servers out of the way and let people interact directly. That will require more imaginative and disruptive business models....