lucene-java-user mailing list archives

From Chitra <>
Subject Re: Accent insensitive search for greek characters
Date Tue, 24 Oct 2017 12:16:58 GMT
ICUTransformFilter works as required for Greek characters in general, but it
breaks in one case: σ and ς are both lowercase forms of Σ (sigma).


I indexed the terms πελάτης (indexed as πελατης) and πελάτηΣ (also indexed as
πελατης). I get the expected search results if I search for πελάτηΣ, πελάτης,
or any combination of upper-case and lower-case Greek characters. But if I
search for πελατησ, I get no search results.

In Greek, σ and ς are both lowercase forms of Σ (sigma). And this case is solved in

Is the ICU Transliterator rule formed correctly? Kindly look at the code below:

TokenStream tok = new ICUTransformFilter(tok,
> Lower; NFD; [:Nonspacing Mark:] Remove; NFC;"));
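
To make the mismatch concrete, here is a minimal sketch of the folding I am
after, written with JDK classes only (no Lucene or ICU). The class name
GreekFold and the method fold are hypothetical names for illustration. The
steps mirror the rule above (lowercase, NFD, remove nonspacing marks, NFC),
with one extra step that maps final sigma (ς, U+03C2) to regular sigma
(σ, U+03C3). Without that last step the indexed term ends in ς while the query
πελατησ ends in σ, so the two never match. This is only an assumption about
one possible fix, not something the filter does today.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper: folds case, accents and both lowercase sigma forms.
class GreekFold {
    private static final char FINAL_SIGMA = '\u03C2'; // lowercase final sigma
    private static final char SIGMA = '\u03C3';       // lowercase regular sigma

    static String fold(String input) {
        // Lowercase first, as the "Lower" step in the rule does.
        String lower = input.toLowerCase(Locale.ROOT);
        // Decompose so accents become separate combining marks.
        String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
        // Drop nonspacing marks (Unicode general category Mn).
        String stripped = decomposed.replaceAll("\\p{Mn}+", "");
        // Recompose whatever is left.
        String composed = Normalizer.normalize(stripped, Normalizer.Form.NFC);
        // Extra step: treat final sigma and regular sigma as the same letter.
        return composed.replace(FINAL_SIGMA, SIGMA);
    }

    public static void main(String[] args) {
        String[] inputs = {
            "\u03C0\u03B5\u03BB\u03AC\u03C4\u03B7\u03C2", // accented, final sigma
            "\u03C0\u03B5\u03BB\u03AC\u03C4\u03B7\u03A3", // accented, capital sigma
            "\u03C0\u03B5\u03BB\u03B1\u03C4\u03B7\u03C3"  // unaccented, regular sigma
        };
        for (String s : inputs) {
            System.out.println(fold(s));
        }
    }
}
```

With this folding, πελάτης, πελάτηΣ and πελατησ all reduce to πελατησ. The
same mapping would have to be applied at both index time and query time for
the search to match.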

Kindly help me to resolve this.

