The Artima Developer Community

.NET Buzz Forum
Aggregators that automatically download web pages

Greg Reinacker

Posts: 276
Nickname: gregr
Registered: Aug, 2003

Greg Reinacker is president of NewsGator Technologies
Aggregators that automatically download web pages Posted: Dec 12, 2003 8:23 AM

This post originated from an RSS feed registered with .NET Buzz by Greg Reinacker.
Original Post: Aggregators that automatically download web pages
Feed Title: Greg Reinacker's Weblog
Feed URL: http://sedoparking.com/search/registrar.php?domain=&registrar=sedopark
Feed Description: Greg Reinacker's Weblog

This is a pretty common request for NewsGator:

Perhaps I'm missing something, but I think that actually having a reader go out and retrieve the referenced news web page along with the summary feed is much more valuable... Reading hundreds of news headlines is less useful when you are travelling, offline, etc., as there is no way to get the actual content.

Wouldn't it be possible to add a feature that retrieves the referenced URL?

[NewsGator Forums]
Currently, NewsGator shows whatever is in the feed - nothing more, nothing less. If the feed contains full content, that's what will be shown; if the feed contains only excerpts, that's what will be shown. In essence, we show whatever the publisher intended.
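To make that concrete, here's a minimal sketch of that display decision in Python, using the feedparser library (illustrative only, not NewsGator's actual code; the feed URL is made up): prefer the full content if the publisher supplied it, otherwise fall back to the excerpt.

    import feedparser

    def item_body(entry):
        """Show whatever the publisher put in the feed: full content
        if present, otherwise the excerpt."""
        # feedparser exposes <content:encoded> (and Atom <content>)
        # as a list under entry.content
        content = entry.get("content")
        if content:
            return content[0].value
        # fall back to the <description>/summary excerpt
        return entry.get("summary", "")

    feed = feedparser.parse("http://example.com/rss.xml")  # hypothetical feed
    for entry in feed.entries:
        print(entry.title)
        print(item_body(entry))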
 
There are other tools that automatically retrieve the contents of the web page at the link specified in the RSS item at retrieval time (as opposed to viewing time), so it can be read offline - which is essentially what's being asked for above.
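As a rough sketch of that retrieval-time approach (purely illustrative - no claim that any particular tool works exactly this way; the cache directory and feed URL are made up): when the feed is polled, fetch the page behind each item's link and save it locally, so everything is on disk before the user goes offline.

    import hashlib
    import urllib.request
    from pathlib import Path

    import feedparser

    CACHE_DIR = Path("offline_cache")  # illustrative location
    CACHE_DIR.mkdir(exist_ok=True)

    def cache_linked_page(url, timeout=10):
        """Fetch the page behind an RSS item's link and store it locally."""
        name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".html"
        target = CACHE_DIR / name
        if not target.exists():  # skip pages fetched on an earlier poll
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                target.write_bytes(resp.read())
        return target

    # Runs at retrieval time (when the feed is polled), not at viewing
    # time - that's what makes offline reading possible.
    feed = feedparser.parse("http://example.com/rss.xml")  # hypothetical feed
    for entry in feed.entries:
        if "link" in entry:
            cache_linked_page(entry.link)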
 
If the feed publisher really intended you to see the complete web page inside your aggregation tool, they could put the complete content inside the feed... then we would show that. But oftentimes they don't, obviously.
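For what it's worth, "putting the complete content inside the feed" usually means the content:encoded element from the RSS content module. Here's a hypothetical sketch of a publisher emitting such an item (the post data is made up, and a real feed would also declare the content namespace on the channel):

    from xml.sax.saxutils import escape

    def full_content_item(title, link, excerpt, full_html):
        # An <item> carrying the entire post body in <content:encoded>,
        # alongside the usual short <description>. The enclosing channel
        # must declare xmlns:content="http://purl.org/rss/1.0/modules/content/".
        return (
            "<item>"
            f"<title>{escape(title)}</title>"
            f"<link>{escape(link)}</link>"
            f"<description>{escape(excerpt)}</description>"
            f"<content:encoded><![CDATA[{full_html}]]></content:encoded>"
            "</item>"
        )

    print(full_content_item(
        "Example post",                                 # made-up data
        "http://example.com/posts/1",
        "A short excerpt...",
        "<p>The entire article body, markup and all.</p>",
    ))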
 
So we're caught between doing what the publisher wants (driving a click-through) and doing what the user says they want (scraping the page). It's a tough call - we don't want to upset the publishers, as they're the ones providing the content...
 
There are also a number of downsides to a scraping mechanism. It uses a sizable amount of bandwidth to retrieve all of these pages. You may not even be interested in some of the pages, so they'd be retrieved for nothing, costing the publisher additional bandwidth. And advertising stats on the publisher side will be skewed.
 
Any comments?
