One of the things that an aggregator allows you to do is keep up
with a lot more information flow. As I said earlier today, I
subscribe to 315 different feeds (44 of those are search feeds). I
figured it might be interesting to see how much new content there
is in a day from the non-media, non-search (i.e., mostly bloggers)
feeds that I track. So, I opened up a workspace in
BottomFeeder and started hacking out a script:
"Wildcard patterns for the media and search feed URLs to leave out"
rejects := #('*feedster*' '*blogpulse*' '*google*' '*yahoo*' '*amazon*' '*icerocket*'
    '*rocketnews*' '*pubsub*' '*blogniscient*' '*digg*' '*sans*' '*infoworld*'
    '*computerworld*' '*linux*' '*slashdot*' '*wired*' '*rss.com*' '*internetnews*'
    '*comics*' '*file://*' '*technorati*' '*techrepublic*' '*meetup*' '*memeorand*'
    '*espn*' '*cnn*' '*extreme*' '*wbal*').
today := Date today asTimestamp.
"Keep only the feeds whose URL matches none of the patterns"
basicFeeds := RSSFeedManager default getAllMyFeeds reject: [:each |
    (rejects detect: [:each1 | each1 match: each url] ifNone: [nil]) notNil].
"Record title -> count for every feed with items dated today"
counts := OrderedCollection new.
basicFeeds do: [:eachFeed | | todays |
    todays := eachFeed items select: [:each | each pubDateString >= today].
    todays notEmpty
        ifTrue: [counts add: eachFeed displayTitle -> (todays size)]].
"Busiest feeds first"
sorted := counts asSortedCollection: [:a :b | a value >= b value].
It's a pretty simple script - I grab all the feeds, filter out
the ones that are either media or search related, and then see
which ones have content today. Then I slam the results into a collection, sort by post count, and
do an inspect-it on the results. Unlike those *cough* advanced
*cough* languages in the mainstream, Smalltalk lets me do this at
runtime, in the running application. Kind of cool :) Anyway, I
wrote a quick script to slap that stuff in an HTML table:
| Feed | Posts |
| --- | --- |
| The Corner | 80 |
| MARS Activity | 35 |
| Daily Kos | 31 |
| Public Store | 28 |
| PCWorld.com - Latest News Stories | 26 |
| Bob Congdon | 25 |
| Sam Ruby's Comments | 22 |
| The Doc Searls Weblog | 22 |
| Taegan Goddard's Political Wire | 21 |
| ongoing | 20 |
| Eschaton | 20 |
| Samizdata.net | 17 |
| Lambda the Ultimate - Programming Languages Weblog | 17 |
| VodkaPundit | 16 |
| Cook Computing | 16 |
| Instapundit.com | 16 |
| Microsoft Watch from Mary Jo Foley | 15 |
| RSS News by CodingTheWeb.com | 15 |
| Radio Free Blogistan | 15 |
| Philip Greenspun Weblog | 14 |
| Dvorak | 14 |
| Web Things, by Mark Baker | 14 |
| Mark Bernstein | 13 |
| National Review Online | 11 |
| Exploration Through Example | 10 |
| lesscode.org | 10 |
| PragDave | 10 |
| Squeak People | 10 |
| MemoRanda | 10 |
| TalkLeft: The Politics of Crime | 8 |
| Little Green Footballs | 8 |
| N=1: Population of One | 8 |
| Sci Fi Wire | 8 |
| Media Blog | 8 |
| Sjoerd Visscher's weblog | 8 |
| Glenn Vanderburg: Blog | 8 |
| cst | 7 |
| Power Line | 6 |
| Scripting News | 6 |
| java.net Weblogs | 5 |
| Michelle Malkin | 5 |
| Sam Ruby | 5 |
| Science @ NASA | 5 |
| CincomSmalltalkWiki | 4 |
| Micro Persuasion | 4 |
| Traffic | 3 |
| cst comments | 3 |
| The Ornery American | 3 |
| The Indepundit | 3 |
| Don Park's Daily Habit | 3 |
| Cafe au Lait Java News and Resources | 3 |
| Larkware News | 2 |
| Travis Griggs - Blog | 2 |
| evhead | 2 |
| Dare Obasanjo aka Carnage4Life | 2 |
| Hugh Hewitt | 2 |
| Captain's Quarters | 2 |
| Joho the Blog | 2 |
| Corante Blog | 2 |
| Scobleizer - Microsoft Geek Blogger | 2 |
| Derek's Rantings and Musings | 2 |
| Alice Hill's Real Tech News - Independent Tech | 2 |
| Daypop Search - BottomFeeder | 2 |
| Software (Management) Process Improvement | 1 |
| Joi Ito's Web | 1 |
| Mark Watson's opinions on Java, AI, semantic web, and politics | 1 |
| d2r | 1 |
| planet squeak | 1 |
| Chris Pirillo | 1 |
| The Fishbowl | 1 |
| Rob Fahrni, at the core. | 1 |
| The Blog Ride | 1 |
| Windley's Enterprise Computing Weblog | 1 |
| Better Living Through Software | 1 |
| Steve Shu's Blog | 1 |
| Industry Analyst Reporter - Applications and Software News | 1 |
| Workbench | 1 |
| Matthew Yglesias | 1 |
| WCBS 880: Yankees on WCBS | 1 |
| cut on the bias | 1 |
| Austin Bay Blog | 1 |
| The Belmont Club | 1 |
| PVRblog | 1 |
| ScrappleFace | 1 |
| The Doctor is in | 1 |
| ARs closed Activity | 1 |
| Sam Gentile's Blog | 1 |
| Panopticon Central | 1 |
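
The table script itself isn't shown, so here is a minimal sketch of one way it could look. It assumes only what the first script produces (a collection of title -> count associations) and uses a three-row sample in place of the real `sorted` collection so it can be run in any workspace. The `html` variable name is mine, and feed titles are written out without HTML escaping, which a real version would want to handle.

```smalltalk
"Sample data standing in for the 'sorted' collection from the first script;
 each element is a feedTitle -> postCount Association."
| sorted html |
sorted := OrderedCollection new.
sorted
    add: 'The Corner' -> 80;
    add: 'MARS Activity' -> 35;
    add: 'Daily Kos' -> 31.

"Build the markup on a stream: one header row, then one row per feed."
html := WriteStream on: String new.
html nextPutAll: '<table>'; cr.
html nextPutAll: '<tr><th>Feed</th><th>Posts</th></tr>'; cr.
sorted do: [:assoc |
    html
        nextPutAll: '<tr><td>';
        nextPutAll: assoc key;
        nextPutAll: '</td><td>';
        print: assoc value;
        nextPutAll: '</td></tr>';
        cr].
html nextPutAll: '</table>'.
html contents
```

Doing a print-it or inspect-it on the last line shows the generated markup, ready to paste into a post.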
Now, I didn't get all the non-blogs out, but that's good enough for now - it's down to 89 feeds that way. The MARS one warrants some explanation - it's the feed off our internal bug tracking system, and we are approaching full code freeze for the next release, so activity is high. Other than that, the real outliers (i.e., lots of posts in a day) are group political blogs. Some of the high numbers also come from some kind of server reset of the feed rather than actual new content. That's still a problem that can fool an aggregator - especially when the feed in question doesn't have IDs for the items.
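
To make the missing-ID problem concrete, here is a rough sketch of one common workaround: use the item's ID when there is one, and fall back to a key built from title plus link when there isn't. This is an illustration under my own assumptions, not how BottomFeeder does it. Items are modeled as plain Dictionaries with `#title`, `#link` and an optional `#guid`, and `keyFor`, `seen`, `incoming` and `fresh` are made-up names.

```smalltalk
"Hypothetical stand-in: each item is a Dictionary with #title and #link,
 plus #guid when the feed supplies one. Not BottomFeeder's item class."
| seen incoming keyFor fresh |

"Prefer the feed's own ID; otherwise fall back to title|link."
keyFor := [:item |
    (item at: #guid ifAbsent: [nil])
        ifNil: [(item at: #title), '|', (item at: #link)]].

"Keys of items already stored from an earlier fetch."
seen := Set new.
seen add: 'Hello|http://example.com/1'.

"A re-fetch after a server reset: one old item, one new one, no IDs."
incoming := OrderedCollection new.
incoming add: (Dictionary new
    at: #title put: 'Hello';
    at: #link put: 'http://example.com/1';
    yourself).
incoming add: (Dictionary new
    at: #title put: 'World';
    at: #link put: 'http://example.com/2';
    yourself).

fresh := incoming reject: [:item | seen includes: (keyFor value: item)].
fresh size  "1 with this sample: only 'World' is new"
```

The fallback is only a heuristic: if the reset also changes a title or link, the item still looks new, which is exactly how the inflated counts sneak in.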
Anyway - looking at "real" results, it looks like a dozen new posts is a lot - most people are well under that. In fact, if I filter the list to those who posted 10 or fewer times so far today, I get down to 63 feeds. It turns out that my 7 posts today (8 after this one goes up) put me up near the top of that list. And 23 of the feeds only have one new item so far today.
So - if you skim the high-volume news/search feeds, the posts on single-author blogs aren't that hard to keep up with. At least not if you use an aggregator :)
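
Those follow-up numbers are just a couple of `select:` calls over the same title -> count associations. A small sketch, using a handful of rows from the table as sample data rather than the live collection (the `light` and `singles` names are made up for the example):

```smalltalk
"Sample rows standing in for the collection built by the first script."
| counts light singles |
counts := OrderedCollection new.
counts
    add: 'The Corner' -> 80;
    add: 'ongoing' -> 20;
    add: 'PragDave' -> 10;
    add: 'cst' -> 7;
    add: 'evhead' -> 2;
    add: 'Joi Ito''s Web' -> 1;
    add: 'PVRblog' -> 1.

"Feeds with 10 or fewer posts today, and feeds with exactly one."
light := counts select: [:assoc | assoc value <= 10].
singles := counts select: [:assoc | assoc value = 1].
light size.    "5 with this sample"
singles size.  "2 with this sample"
```

Run against the real collection in the workspace, a print-it on each `size` expression gives the totals for the day.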