The Artima Developer Community

Ruby Buzz Forum
Making concurrency simple, and a multi-threaded downloader in 4 (long) lines

Eigen Class

Posts: 358
Nickname: eigenclass
Registered: Oct, 2005

Eigenclass is a hardcore Ruby blog.
Making concurrency simple, and a multi-threaded downloader in 4 (long) lines Posted: Apr 14, 2006 8:59 AM

This post originated from an RSS feed registered with Ruby Buzz by Eigen Class.
Original Post: Making concurrency simple, and a multi-threaded downloader in 4 (long) lines
Feed Title: Eigenclass
Feed URL: http://feeds.feedburner.com/eigenclass
Feed Description: Ruby stuff --- trying to stay away from triviality.
You might not have noticed it, but every page on eigenclass.org lists the most popular referrers. I often find interesting things in the Referer field, but unfortunately they are hard to find (especially for an occasional visitor), buried among inaccessible pages (Bloglines, Google Reader, other online RSS aggregators...) and, as of late, referrer spam.

I'm now filtering referrer URLs as I get them, but I also wanted to purge the historical data contained in the "referrer database". Unsurprisingly, I wrote a script for that.

Filtering referrers entails a fair bit of network traffic: each referring URL has to be fetched to verify that it can be accessed and seems legitimate. Performing these checks serially would take forever (establishing the connection, issuing the HTTP request, waiting for the data, timeouts...), and it would use my bandwidth very inefficiently.

The obvious solution is to perform several operations in parallel, maximizing bandwidth usage.
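For instance, checking several URLs at once is just a matter of spawning a thread per request. A minimal sketch (the URLs are made up, and the network wait is simulated with sleep so it runs anywhere) shows how the waits overlap:

```ruby
# Simulated referrer check: each one spends ~0.2s waiting, as network I/O would.
check = lambda { |url| sleep 0.2; "#{url}: ok" }

urls = (1..5).map { |i| "http://example.com/referrer/#{i}" }

t0 = Time.now
# One thread per URL; Thread#value joins the thread and returns its result.
results = urls.map { |u| Thread.new { check.call(u) } }.map(&:value)
elapsed = Time.now - t0
# The five 0.2s waits overlap, so elapsed stays close to 0.2s instead of 1s.
```

Spawning one thread per URL doesn't bound the number of simultaneous connections, though, which is exactly what the pooling executor described next takes care of.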

Pooling handlers

The idea is to create a PoolingExecutor object that assigns tasks to a bounded number of handlers and runs them in separate threads. Since the tasks are not CPU-bound, this optimizes the use of some limited resource (bandwidth in this case, but it could also be DB connections, etc.) while avoiding an overload.

The API is:

executor = PoolingExecutor.new do |handlers|
  NUM_HANDLERS.times do 
    handlers << SomeHandler.new(stuff)
  end
end

# later

# each task is run in a different thread, but the number of simultaneous
# threads is bounded
executor.run do |handler|
  # perform task with the handler
  # e.g.
  foo( handler.process(stuff) )
end
executor.run do |handler|
  # ....
end
executor.wait_for_all # ensure all the tasks scheduled with executor are
                      # finished
# ....
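The implementation itself is behind the "Read more" link; a minimal sketch of an executor matching the API above, built on Ruby's thread-safe Queue (everything here besides the public method names is my own illustrative reconstruction, not the original code), could look like:

```ruby
class PoolingExecutor
  def initialize
    @handlers = Queue.new   # idle handlers; pop blocks when none is free
    yield @handlers         # the caller fills the pool
    @threads  = []
    @mutex    = Mutex.new
  end

  # Run the task in its own thread, but only once a handler is free, so the
  # number of simultaneous tasks is bounded by the pool size.
  def run
    handler = @handlers.pop            # blocks until a handler is available
    thread  = Thread.new do
      begin
        yield handler
      ensure
        @handlers << handler           # hand it back for the next task
      end
    end
    @mutex.synchronize { @threads << thread }
  end

  def wait_for_all
    @mutex.synchronize { @threads.dup }.each(&:join)
  end
end

# Example: a pool of 3 handlers shared by 10 tasks.
executor = PoolingExecutor.new { |handlers| 3.times { |i| handlers << "handler-#{i}" } }
results = Queue.new
10.times { |n| executor.run { |handler| results << n * n } }
executor.wait_for_all
squares = []
squares << results.pop until results.empty?
```

Note that run itself blocks when every handler is busy, which is what keeps the number of in-flight tasks (and hence open connections) bounded without any explicit counting.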


Read more...



Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use