lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2102) LowerCaseFilter for Turkish language
Date Tue, 01 Dec 2009 22:23:20 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784476#action_12784476
] 

Robert Muir commented on LUCENE-2102:
-------------------------------------

bq. One othe possibility to resolve the problem in a completely different way: You could wrap
a MappingCharFilter on top of the input reader in Analyzer and just add a replacement for
this one char:

Uwe, but this is inflexible. If we want to make this filter support turkish lowercasing in
the future for all of unicode, not just NFC composed form, we cannot do it with MappingCharFilter.
Again I don't think we should fix this now, but in the future I think we might want to. 

> LowerCaseFilter for Turkish language
> ------------------------------------
>
>                 Key: LUCENE-2102
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2102
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 3.0
>            Reporter: Ahmet Arslan
>            Assignee: Robert Muir
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-2102.patch, LUCENE-2102.patch, LUCENE-2102.patch
>
>
> java.lang.Character.toLowerCase() converts 'I' to 'i' however in Turkish alphabet lowercase
of 'I' is not 'i'. It is LATIN SMALL LETTER DOTLESS I.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message