Ruby Buzz Forum - Wide Finder 2: processing 42GB of httpd logs, 300X faster than naïve Ruby

Articles |
News |
Weblogs |
Books |
Forums

Artima Forums | Articles | Weblogs | Java Answers | News

Sponsored Link •

Ruby Buzz Forum
Wide Finder 2: processing 42GB of httpd logs, 300X faster than naïve Ruby

0 replies on 1 page.

Welcome Guest
Sign In

Back to Topic List

Reply to this Topic

Search Forum

Threaded View


Previous Topic		Next Topic

Flat View: This topic has 0 replies on 1 page

Eigen Class

Posts: 358
Nickname: eigenclass
Registered: Oct, 2005

Eigenclass is a hardcore Ruby blog.

Wide Finder 2: processing 42GB of httpd logs, 300X faster than naïve Ruby

Posted: Oct 27, 2008 11:37 AM

This post originated from an RSS feed registered with Ruby Buzz by Eigen Class.
Original Post: Wide Finder 2: processing 42GB of httpd logs, 300X faster than naïve Ruby Feed Title: Eigenclass Feed URL: http://feeds.feedburner.com/eigenclass Feed Description: Ruby stuff --- trying to stay away from triviality.	Latest Ruby Buzz Posts Latest Ruby Buzz Posts by Eigen Class Latest Posts From Eigenclass

The Wide Finder 2 benchmark measures the speed at which a program can analyze 42GB worth of webserver logs and generate basic statistics (top URLs by hits, byte count and 404 errors, top clients and referrers) on a pretty beefy machine: a 8-core Sun Fire T2000 with 32 hardware threads and 32GB of RAM.

Several people have written a multitude of implementations in various languages, including C++, OCaml, C, Java, Scala, Groovy, Fan, Python, Perl, Ruby, and combinations thereof. The results are presented on this table.

Speedup

The reference implementation was written in Ruby and took over 25 hours to process the logs. The fastest versions, coded in C++ and OCaml, can do it in around 5 minutes, nearly 300 times faster.

In the case of OCaml, this speedup can be broken down into 3 factors.

Use of a more efficient language implementation: 1 order of magnitude

This does not necessarily cause an explosion in code size, as shown by the wf2_simple.ml semi-naïve port to OCaml, which is in fact shorter than the reference Ruby implementation by line count (this is far from being the case in C++, though), but runs an order of magnitude faster.

It is interesting to analyze the bang-per-buck ratio for the different languages: while two C++ implementations ultimately yielded the highest performance (a third one failed to parallelize properly, and was over three time slower than the top C++ or OCaml versions) , they do so at the cost of an explosion in the amount of code, requiring 2 to 3 times more lines than the OCaml version whose performance was within 10% of the best one, even though the C++ entries used libraries like Boost.

Parallelism: 1 order of magnitude

The Wide Finder 2 problem is fairly easy to decompose into smaller subproblems (essentially by having several workers processing different parts of the logs). The T2K is a bit particular in that each of its 32 hardware threads (which are seen as 32 CPUs when you program) is much less powerful than the x64 cores we're using most of the time, so parallel execution is a must.

I parallelized the OCaml program trivially by using a semi-standard 15-line higher-order function which accepts a function and executes it in another process.

Small-scale optimizations that apply to both the single-threaded and the parallel cases: 2-4X

The Wide Finder 2 implies lots of IO activity, which proved to be relatively hard to optimize on the T2K, because it you can easily saturate a core by doing mere IO (i.e., a single core is barely able to cope with the sustained read rate of the disk). While my first OCaml implementations used line-oriented IO, like the reference version in Ruby, the fastest one (like most other entries) uses block-oriented IO, which accounted for a large part of the linecount increase.

Conclusions

Optimization potential

Read more...

Read: Wide Finder 2: processing 42GB of httpd logs, 300X faster than naïve Ruby

Previous Topic

Next Topic


	Web Artima.com