This post originated from an RSS feed registered with Python Buzz
by Ng Pheng Siong.
Original Post: Stemming
Feed Title: (render-blog Ng Pheng Siong)
Feed URL: http://sandbox.rulemaker.net/ngps/rdf10_xml
Feed Description: Just another this here thing blog.
The Porter stemming algorithm "is a process for removing the
commoner morphological and inflexional endings from words in
English. Its main use is as part of a term normalisation process that
is usually done when setting up information retrieval systems."
Martin Porter's official home
page for his algorithm lists implementations in C, Java, Perl,
Python, C#, .NET, Common Lisp, Ruby, VB, PHP, Delphi and Javascript.
That page also contains a link to Snowball, his string
processing language designed for creating stemming
algorithms. Lucene's implementation of Snowball can be found here.