This post originated from an RSS feed registered with Ruby Buzz
by Rick DeNatale.
Original Post: Algorithm Choice Trumps Programming Language Choice For Performance
Feed Title: Talk Like A Duck
Feed URL: http://talklikeaduck.denhaven2.com/articles.atom
Feed Description: Musings on Ruby, Rails, and other topics by an experienced object technologist.
My friend Brian Adkins just published an article on his blog comparing the performance of his Haskell program to find the longest palindrome in a string to a similar program in Ruby.
His Haskell program runs 25 times faster than his Ruby program. He reports that the Haskell program takes 7 times as long to process an input twice as long.
Brian's approach is to generate all of the substrings in the input string, then filter that list of substrings to those which are palindromes, and then output the longest one which passes through that filter. A cursory analysis indicates that this is O(n^2) which is in-line with his data.
I couldn't resist, so I sat down with my MacBook and my Sunday morning coffee and wrote my version of the program, in about 30-45 minutes.
My first thought was that we want to cut down on the search space, by only examining substrings that could be palindromes. By definition a palindrome starts and ends with the same character, so we only need consider such substrings. It took a few minutes to find a reasonable approach.
My basic idea was to start by looking for the initial substrings of the string ending with the first character, then do the same for subsequent characters. Then it occurred to me that I should find the LONGEST initial substring, since if it were a palindrome shorter substrings can't be longer.
In pondering the best way to do this in Ruby, and thinking about using a regular expression, I realized that rather than starting with the beginning of the string, it would be better to start at the end. I could then use Ruby 1.9's String.match(regexp, pos), to find the longest substring at the end of the string starting and ending with the same character, using the pos parameter to search for a shorter string when the last match is not a palindrome.
So in the end my algorithm examines each initial substring of the input string starting with the longest. For each of those it examines the final substrings which begin and end with the same character, again starting with the longest, and returns the longest of those which is a palindrome. The result of the overall program is the longest palindrome from any initial substring.
Here's the code:
TEXT = <<END
I'll just type in some example text here and embed a little
palindrome - A man, a plan, a canal, Panama! - I expect that will be
the longest palindrome found in this text.
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Integer volutpat lorem imperdiet ante bibendum ullamcorper. Mauris
tempor hendrerit justo at elementum. Vivamus elit magna, accumsan id
condimentum a, luctus a ipsum. Donec fermentum, lectus at posuere
ullamcorper, mauris lectus tincidunt nulla, ut placerat justo odio sed
odio. Nulla blandit lorem sit amet odio varius nec vestibulum ante
ornare. Aliquam feugiat, velit a rhoncus rutrum, turpis metus pretium
dolor, et mattis leo turpis non est. Sed aliquet, sapien quis
consequat condimentum, sem magna ornare ligula, id blandit odio nisl
vitae erat. Nam vulputate tincidunt quam, non lacinia risus tincidunt
lacinia. Aenean fermentum tristique porttitor. Nam id dolor a eros
accumsan imperdiet. Aliquam quis nibh et dui ultricies cursus. Nunc
et ante non sapien vehicula rutrum. Duis posuere dictum blandit. Nunc
vitae tempus purus.
END
def clean(str)
str.gsub(/[^A-Za-z]/,'').downcase
end
class String
def palindrome?
self == self.reverse
end
def longest_palindrome_at_end
first_possible_start = 0
end_match_regexp = /#{self[-1]}/
palindrome = nil
while (palindrome.nil? && (candidate_start = self.match(end_match_regexp, first_possible_start)))
candidate_index = candidate_start.begin(0)
candidate = self[candidate_index..-1]
if candidate.palindrome?
palindrome = candidate
else
first_possible_start = candidate_index + 1
end
end
palindrome || ""
end
def longest_palindrome
longest = ""
self.size.downto(1) do |last|
break if last < longest.size
candidate = self[0,last].longest_palindrome_at_end
longest = candidate if candidate.size > longest.size
end
longest
end
end
require 'benchmark'
double = TEXT + TEXT
puts Benchmark.measure {puts clean(TEXT).longest_palindrome}
puts Benchmark.measure {puts clean(double).longest_palindrome}
Now my MacBook Pro is a little newer than Brian's, it's a late 2009 model, with a 2.66 GHz Core 2 Duo. But my version is 36x faster (.11 vs. 4 seconds) than Brian's Haskell program, and 891x faster than his Ruby program. Hence the title of this article!
Original article writen by Rick DeNatale and published on Talk Like A Duck | direct link to this article | If you are reading this article elsewhere than Talk Like A Duck, it has been illegally reproduced and without proper authorization.