This post originated from an RSS feed registered with Ruby Buzz
by Florian Frank.
Original Post: Benchmarking: Sorting Arrays in Descending Order
Feed Title: The Rubylution: Tag Ruby
Feed URL: http://rubylution.ping.de/xml/rss/tag/ruby/feed.xml
Feed Description: Starts… Now.
Recently i wanted to sort an array of array pairs (array.size == 2)
in descending (highest value first) order in Ruby. I wanted to avoid creating a temporary array for this, so at first I used Array#sort with a block.
Because the sorting would often be called in my algorithm, I wanted to optimize this part of the code a bit. I started to benchmark in order to verify, that my assumptions are correct.
array sizes are n, a is my array of pairs, a2 is for comparison only, and I repeat the benchmarked code m-times. (I also do a full warmup run before that.)
This was my first stab at the sorting (I don't care about the first value in the pair, BTW):
a.sort{|x,y|y<=>x}# => " 11.62s 0.011618c/s"
This took quite some time. (The first time value is the duration in seconds, the second the calls per second.) I guessed that calling the block ca. n log(n) times, was the reason for this, so I tried:
b=a.sort;b.reverse!# => " 3.29s 0.003290c/s"
Wow! Much better. This takes ca. k n log(n) + l n "machine instructions", while l << k, so the reverting doesn't matter neither in theory or praxis. This can be shown, by just benchmarking Array#sort:
a.sort# => " 3.48s 0.003484c/s"
This has essentially the same speed as the benchmark with reverse. Actually it's slower, which means, that I cannot even measure Array#reverse's speed in this benchmark.
So let's test a2 to find out, how fast comparing Fixnums is compared to the more complex pair Arrays:
a2.sort# => " 0.27s 0.000265c/s"
That's quite a speedup, so let's try to profit from that:
Hmm, this didn't work out. But why? I remembered that Ruby's sort cheats a bit for immediate values like Fixnum, that's most likely the reason. (But I am to lazy to look it up just now.) Making the sort block more complex than in the first try didn't help either.
But what can be gained from these examples is that
using a complex compare block, leads to a huge slowdown in sorting time and
comparing more primitive values can be much faster (especially if all of this happens in C).
This convinced me to try Array#sort_by, which I dismissed from the beginning as being too wasteful of memory (and maybe too slow because of this anyway):
LOL! I really should stop making predictions without benchmarking: I almost always guess wrong, if I do it. This is the fastest way to sort a in descending OR ascending order. Because it combines Ruby's cheating with Fixnums (-x.last) and only calling the { |x| -x.last }-block n-times instead of calling the more complex one ca. n * log(n) times.
On the other hand it's only a good idea to invest much time into figuring out and exploiting implementation details like this, if you need the speed, need it now and on this Ruby implementation. The used implementation could change in the future or the advantages could turn into drawbacks if you run the algorithm on JRuby for example.