This post originated from an RSS feed registered with Web Buzz
by Chee Seng Chua.
Original Post: Apache Derby to Handle Case Insensitive Search
Feed Title: Chee Seng Tech Blog
Feed URL: http://chuacheeseng.blogspot.com/atom.xml
Feed Description: When you think you are superior, you are being an idiot...
I have been using Apache Derby which can be bundled easily with the 'Lite' version of the product that I am developing in Flex/Java. The fact that by default Derby gives different result set for "where name='chua'" and "where name='CHUA'" has been an egg on my face. Using UPPER/LOWER with function index doesn't sounds good solution to me, neither to my boss. Derby folks are actually aware of the importance to support this, and several proposals have been made in the mailing list, but none have been implemented yet.
However, I am lucky enough to find a solution that I like with Collator here:-
The approach is to create a custom collator provider which returns collator with strength of SECONDARY, which gives case-insensitive comparison for EQUAL and LIKE.
Copied from the blog above:-
"
Create a class that extends java.text.spi.CollatorProvider and returns a collator that orders strings the way you want it to
Create a text file named META-INF/services/java.text.spi.CollatorProvider which contains one line with the name of your collator provider class
Put the compiled class file and the text file in a jar file which you drop into your JRE's lib/ext directory or in one of the directories specified by the java.ext.dirs property
"
And the sample custom collator provider:-
"
public class MyCollatorProvider extends CollatorProvider { public Locale[] getAvailableLocales() { return new Locale[] { new Locale("en", "US", "caseinsensitive") }; } public Collator getInstance(Locale locale) { Collator c = Collator.getInstance(Locale.US); // Ignore tertiary differences (case differences) c.setStrength(Collator.SECONDARY); return c; } } "
Using territory based collation disables query optimization when doing LIKE query, I don't know the reason behind it. In the application that I am building, the user is allowed to use every single column to search and filter data. But, not every single column is indexed. ;-)