lucene-solr-user mailing list archives

From Andrew Stromnov <strom...@gmail.com>
Subject Re: Problem with Russian stemmer in Solr 1.2
Date Mon, 09 Jul 2007 18:36:09 GMT

Hi, Daniel

The stemmer in RussianAnalyzer works as expected, but this analyzer doesn't
allow any Solr customization: all stopwords are hardcoded, and there is no
support for a custom tokenizer or for synonyms.

RussianAnalyzer is similar to this scheme:
  standard tokenizer
  standard filter factory
  word delimiter filter factory
  lowercase filter factory
  stop filter factory (with hardcoded stopwords)
  russian stem filter
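
For comparison, a customizable equivalent of that chain could be sketched in
schema.xml roughly as follows. This is only a sketch under assumptions: the
field type name, the stopwords_ru.txt and synonyms_ru.txt file names, and the
com.example.RussianStemFilterFactory class are hypothetical placeholders. The
last one stands in for a hand-written factory wrapping Lucene's
RussianStemFilter, like the factories Daniel mentions creating. The synonym
filter is not part of RussianAnalyzer and is included only to show where it
would go.

```xml
<fieldtype name="text_ru_custom" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stopwords are read from an editable file instead of being hardcoded -->
    <filter class="solr.StopFilterFactory" words="stopwords_ru.txt" ignoreCase="true"/>
    <!-- optional: synonym expansion, which the fixed analyzer cannot offer -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_ru.txt" ignoreCase="true" expand="true"/>
    <!-- hypothetical custom factory, not a class shipped with Solr -->
    <filter class="com.example.RussianStemFilterFactory"/>
  </analyzer>
</fieldtype>
```

With a chain like this, the stopword list, the tokenizer, and the synonyms
could each be changed in the schema without touching the analyzer code.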
 

Regards,
Andrew


Daniel Alheiros wrote:
> 
> Hi Andrew
> 
> In fact, I did it by creating all the factories for Solr, but I think you
> can use it directly, changing your index like this:
> 
> <fieldtype name="cpstext_russian" class="solr.TextField" positionIncrementGap="100">
>         <analyzer type="index" class="org.apache.lucene.analysis.ru.RussianAnalyzer">
>         </analyzer>
>         <analyzer type="query" class="org.apache.lucene.analysis.ru.RussianAnalyzer">
>         </analyzer>
> </fieldtype>
> 
> I’ve not tested that, but I saw something like this.
> 
> Please tell me if it works as expected and if it solves your problem (I’m
> indexing Russian content, and as you seem to be knowledgeable about the
> Russian language, your comments are very useful).
> 
> Regards,
> Daniel
> 

-- 
View this message in context: http://www.nabble.com/Problem-with-Russian-stemmer-in-Solr-1.2-tf4049948.html#a11507263
Sent from the Solr - User mailing list archive at Nabble.com.

