lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Commented: (LUCENE-2102) LowerCaseFilter for Turkish language
Date Tue, 01 Dec 2009 20:41:20 GMT


Robert Muir commented on LUCENE-2102:

Hi Ahmet, this patch is looking very nice, thank you!

I have some minor suggestions:
* can we use hex notation (and maybe named constants too) for the special case?
* you can use assertTokenStreamContents here (it is in the base test case) to simplify your
test; it works like assertAnalyzesTo, but on a TokenStream
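
To illustrate the first suggestion, here is a minimal sketch of what hex-notation constants for the special case might look like. It is not the code from LUCENE-2102.patch: the class name, constant names, and the String-based lowerCase helper are all hypothetical, and it ignores details a real filter would need to handle (term buffers, combining marks, supplementary characters).

```java
// Hypothetical sketch only; names and structure are assumptions, not the patch.
final class TurkishLowerCaseSketch {
    // The special-case characters, written in hex notation as named constants.
    static final char LATIN_CAPITAL_LETTER_I = '\u0049';
    static final char LATIN_SMALL_LETTER_I = '\u0069';
    static final char LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE = '\u0130';
    static final char LATIN_SMALL_LETTER_DOTLESS_I = '\u0131';

    // Lowercases a string, mapping the two Turkish capital I forms explicitly
    // and deferring to Character.toLowerCase for everything else.
    static String lowerCase(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case LATIN_CAPITAL_LETTER_I:
                    out.append(LATIN_SMALL_LETTER_DOTLESS_I);
                    break;
                case LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE:
                    out.append(LATIN_SMALL_LETTER_I);
                    break;
                default:
                    out.append(Character.toLowerCase(c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Print code points so the result is unambiguous on any console encoding.
        for (char c : lowerCase("ISPARTA").toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

With named constants, the intent of each branch stays readable even where the source file encoding or a reviewer's font makes 'I', 'İ', 'ı' and 'i' hard to tell apart.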

I will let others comment on where this belongs (maybe contrib?)
Wherever it is, I would like to use it in snowball contrib also.

> LowerCaseFilter for Turkish language
> ------------------------------------
>                 Key: LUCENE-2102
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 3.0
>            Reporter: Ahmet Arslan
>            Priority: Minor
>         Attachments: LUCENE-2102.patch
> java.lang.Character.toLowerCase() converts 'I' to 'i'; however, in the Turkish alphabet the
> lowercase of 'I' is not 'i'. It is LATIN SMALL LETTER DOTLESS I (U+0131).
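
The behavior described in the issue can be reproduced with the JDK alone. The sketch below is only an illustration (the class and method names are hypothetical): it contrasts the locale-insensitive Character.toLowerCase with String.toLowerCase under a Turkish locale, which does apply the dotless-i mapping.

```java
// Hypothetical demo class; only the JDK calls themselves are real APIs.
final class TurkishCaseDemo {
    static final char LATIN_SMALL_LETTER_DOTLESS_I = '\u0131';

    // Locale-insensitive lowercasing, as used by a generic lowercase filter.
    static char defaultLower(char c) {
        return Character.toLowerCase(c);
    }

    // Locale-sensitive lowercasing using the Turkish language rules.
    static String turkishLower(String s) {
        return s.toLowerCase(java.util.Locale.forLanguageTag("tr"));
    }

    public static void main(String[] args) {
        // Character.toLowerCase knows nothing about Turkish: 'I' becomes 'i'.
        System.out.println(defaultLower('I') == 'i');
        // Under the Turkish locale, "I" becomes the dotless i instead.
        System.out.println(
            turkishLower("I").equals(String.valueOf(LATIN_SMALL_LETTER_DOTLESS_I)));
    }
}
```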

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
