Jack Moffit:"Sorry, Twitter. Until we see some answers, you don’t have data, just a big mouth."
I think Jack Moffitt, always excellent, is being hard on Alex Payne and the Twitter gang. He is criticising Twitter for restricting access to the firehose - the XMPP stream of events, "tweets" in Twitter parlance. Jack alludes to a strategic reason for this, as in Twitter 'own' the data and therefore should own the derivative value obtained from analysing or reorganising it: "I don’t know the exact time that they started pruning the list of consumers of the firehose, but to me it seemed like this starting happening after Summize was acquired or around that time. The logical conclusion from this is that Twitter does not want more interesting things being built on top of its data."
Scale? It's received wisdom that heavy HTTP polling is stupid and wrong, whereas push is more efficient and scales better. The problem is there isn't much science or shared field experience on what it means to run a public XMPP data and notification endpoint with a lot of subscribers. When I say a lot, I mean 250,000 to 1M clients holding open connections to your server(s). Issues I've seen are that load balancing becomes a problem, db access costs dominate login times for clients, and XMPP server clustering isn't as far along as I'd thought it was. Scaling XMPP does not appear to be a commodity problem the way HTTP scaling is - you are back down to looking at whether the servers are using epoll/NIO; whether load balancing should be done by clients (remember the load balancers actually get in the way); how long it takes to log a user in, set up presence, rosters and so on; and what the cluster topology's graph connectivity measure is (S2S doesn't seem to be the answer). It's like being back in 2000 and wistfully reading Dan Kegel's c10k page.
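To make the epoll/NIO point concrete, here's a minimal sketch - purely illustrative Python, nothing to do with any actual XMPP server's code - of readiness-based I/O: one process holding lots of mostly-idle connections and only doing work when a socket becomes readable, rather than burning a thread per connection. Port 5222 is the usual XMPP client port; everything else is made up.

```
# Minimal readiness-based connection holder (illustrative sketch only).
# Python's selectors module uses epoll on Linux, kqueue on BSD/macOS.
import selectors
import socket

sel = selectors.DefaultSelector()

def accept(server_sock):
    conn, _addr = server_sock.accept()
    conn.setblocking(False)
    # Register the client for read-readiness; while idle, the connection
    # costs only a file descriptor and some buffer memory.
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    data = conn.recv(4096)
    if not data:                 # client went away
        sel.unregister(conn)
        conn.close()
        return
    # Placeholder: echo instead of XMPP stanzas. A real server would buffer
    # writes and handle partial sends instead of calling sendall() here.
    conn.sendall(data)

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 5222))   # 5222: standard XMPP client port
server.listen(1024)
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

while True:
    for key, _mask in sel.select():
        key.data(key.fileobj)    # dispatch to accept() or handle()
```

The echo loop isn't the point; the point is that holding the connections open is the cheap part, and everything that actually hurts - login, presence, rosters, clustering - sits above this layer.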
My suspicion is that services pushing out notifications to a number of subscribers (Sn) where that number is large are not yet a panacea for web polling's scaling issues, because there is a latent asymmetry in the cost of pushing events out as Sn grows, even though push is more performant and lower-latency for smaller values of Sn. And service providers will need to look carefully at graph theory, flooding and gossip/propagation models to get pub/sub notifications to meet web scale delivery - and at that point we'll be half-way to either a peer-to-peer model, or Usenet - take your pick ;)
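A rough back-of-envelope sketch of that asymmetry, with my own toy numbers rather than measurements: direct fan-out means the origin sends Sn copies of every event, while a flooding/gossip relay where each node forwards to a handful of peers caps the per-node send cost but pays for it in extra hops (and thus latency).

```
# Toy comparison of direct fan-out vs a gossip/flooding relay (sketch only).
import math

def direct_fanout(sn: int) -> dict:
    # Origin pushes one copy per subscriber: fine at Sn=1,000, painful at 1M.
    return {"sends_at_origin": sn, "hops": 1}

def gossip_relay(sn: int, fanout: int = 8) -> dict:
    # Each node (origin included) forwards to at most `fanout` peers,
    # so delivery takes roughly log_fanout(Sn) hops instead of one.
    hops = math.ceil(math.log(sn, fanout)) if sn > 1 else 1
    return {"sends_at_origin": fanout, "hops": hops}

for sn in (1_000, 250_000, 1_000_000):
    print(sn, direct_fanout(sn), gossip_relay(sn))
```

At Sn = 1M the origin goes from a million sends per event to eight, at the price of around seven hops - which is exactly the point where the architecture starts to smell like peer-to-peer or Usenet.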