lucene-solr-user mailing list archives

From Ahmet Arslan <>
Subject Re: Sorting non-english text
Date Thu, 25 Aug 2016 14:29:58 GMT
Hi Vasu,

There is a field type, or something like it (CollationKeyAnalyzer), for language-specific sorting.
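
A minimal sketch of how this could look in schema.xml, assuming the solr.ICUCollationField type (which needs the ICU jars from the analysis-extras contrib on the classpath). The names collated_de, title_de and title_de_sort are hypothetical, and you would define one such type per language:

```xml
<!-- Collated sort type for German; strength="primary" ignores case and accent differences -->
<fieldType name="collated_de" class="solr.ICUCollationField"
           locale="de" strength="primary"/>

<!-- Hypothetical fields: title_de is searched, title_de_sort is used only for sorting -->
<field name="title_de" type="text_de" indexed="true" stored="true"/>
<field name="title_de_sort" type="collated_de" indexed="true" stored="false"/>
<copyField source="title_de" dest="title_de_sort"/>
```

With something like this in place, sorting would be requested with sort=title_de_sort asc. If the ICU jars are not an option, solr.CollationField (configured with language="de" instead of locale) is the JDK-based alternative.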


On Thursday, August 25, 2016 12:29 PM, Vasu Y <> wrote:
I have a text field which can contain values (multiple tokens) in English;
to support sorting, I had <copyField> in schema.xml to copy this to a new
field of type "lowercase" (defined as below).
I also have text fields of type text_de, text_es, text_fr, ja, cn etc. I
intend to do <copyField> to copy them to a new field of type "lowercase" to
support sorting.

Would this "lowercase" field type work well for sorting non-English fields
that are non-tokenized (or are single-term) or do you suggest to use a
different tokenizer & filter?

     <!-- lowercases the entire field value, keeping it as a single token. -->
     <fieldType name="lowercase" class="solr.TextField">
       <analyzer>
         <tokenizer class="solr.KeywordTokenizerFactory"/>
         <filter class="solr.LowerCaseFilterFactory" />
       </analyzer>
     </fieldType>

