lucene-solr-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com.INVALID>
Subject Re: Sorting non-english text
Date Thu, 25 Aug 2016 14:29:58 GMT
Hi Vasu,

There is a field type for language-specific sorting, or something like it
(CollationKeyAnalyzer).
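
A minimal sketch of what such a collation-based sort field might look like in schema.xml, assuming Solr's solr.CollationField (JDK collation rules) is available in your version. The names "collated_de", "title_de" and "title_sort_de" are hypothetical and not taken from the original schema; one such type per language would be needed, and solr.ICUCollationField from the analysis-extras contrib is a possible alternative.

```xml
<!-- Sketch only: the type and field names here are hypothetical. -->
<!-- strength="primary" makes the sort ignore case and accent differences,
     so no separate lowercasing filter should be needed for this field. -->
<fieldType name="collated_de" class="solr.CollationField"
           language="de" country="DE" strength="primary"/>

<!-- Sort-only field: indexed so it can be sorted on, not stored. -->
<field name="title_sort_de" type="collated_de" indexed="true" stored="false"/>

<!-- Copy the German text field into the collated sort field. -->
<copyField source="title_de" dest="title_sort_de"/>
```

Sorting would then use the collated field, for example sort=title_sort_de asc.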

Ahmet



On Thursday, August 25, 2016 12:29 PM, Vasu Y <vyal2k@gmail.com> wrote:
Hi,
I have a text field which can contain values (multiple tokens) in English.
To support sorting, I have a <copyField> in schema.xml that copies it to a new
field of type "lowercase" (defined below).
I also have text fields of type text_de, text_es, text_fr, ja, cn etc. I
intend to use <copyField> to copy them to new fields of type "lowercase" as
well, to support sorting.

Would this "lowercase" field type work well for sorting non-English fields
that are non-tokenized (or are single-term), or would you suggest using a
different tokenizer and filter?

    <!-- lowercases the entire field value, keeping it as a single token. -->
    <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

Thanks,
Vasu
