lucene-solr-user mailing list archives

From Vasu Y <vya...@gmail.com>
Subject Sorting non-english text
Date Thu, 25 Aug 2016 09:29:28 GMT
Hi,
I have a text field that can contain multi-token values in English. To
support sorting, I use a <copyField> in schema.xml to copy it to a new
field of type "lowercase" (defined below).
I also have text fields of type text_de, text_es, text_fr, ja, cn, etc. I
intend to use <copyField> to copy them to new fields of type "lowercase" as
well, to support sorting.

Would this "lowercase" field type work well for sorting non-English fields
that are non-tokenized (or single-term), or would you suggest using a
different tokenizer and filter?

     <!-- lowercases the entire field value, keeping it as a single token. -->
     <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
       <analyzer>
         <tokenizer class="solr.KeywordTokenizerFactory"/>
         <filter class="solr.LowerCaseFilterFactory"/>
       </analyzer>
     </fieldType>
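
For comparison, a locale-aware alternative to the definition above would be
a collation-based sort field. This is only a sketch, not a tested
configuration: it assumes the ICU libraries from the analysis-extras contrib
are on the classpath, and the names collated_de, title_de and title_de_sort
are hypothetical. One such field type would be needed per language.

     <!-- sketch: sort key built with German collation rules;
          strength="primary" ignores case and accent differences -->
     <fieldType name="collated_de" class="solr.ICUCollationField"
                locale="de" strength="primary"/>

     <!-- hypothetical sort-only field fed from the German text field -->
     <field name="title_de_sort" type="collated_de" indexed="true" stored="false"/>
     <copyField source="title_de" dest="title_de_sort"/>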

Thanks,
Vasu
