lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Stemmer bug?
Date Tue, 10 Jul 2007 21:26:30 GMT

: Subject: Stemmer bug?

can you elaborate on what exactly you view as a bug?

if the issue is just that one of the examples stems something in a way
that you think makes sense, but the other one does not, that really isn't a
bug so much as it is a comment on the effectiveness of the Snowball
Stemmer for Russian vs the RussianStemmer class used by the
RussianAnalyzer.  if you like the stemming that comes out of the
RussianAnalyzer you can use the RussianStemFilter yourself by creating a
simple FilterFactory around it (there are lots of examples in the Solr
code base)

Also keep in mind that the Snowball Stemmer is not designed to produce
"real" words when it stems ... it's an algorithmic stemmer designed to
produce artificial stems for common cases ... so if you think it's a bug
because it produces terms that aren't real words -- it's not, that's just
the way it works -- what matters is that it produces the same artificial
stem for related words.
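
To illustrate the point about artificial stems, here is a toy suffix-stripping
stemmer in Java. It is only a sketch: the class name ToyStemmer and its tiny
suffix list are made up for this example and have nothing to do with the actual
Snowball rules (for Russian or any other language). It just shows how related
words can collapse to one stem that is not itself a real word.

```java
import java.util.Locale;

public class ToyStemmer {
    // Hypothetical suffix list for illustration only; NOT the Snowball rules.
    // Longer suffixes are listed first so they are tried before shorter ones.
    private static final String[] SUFFIXES = {"ing", "ed", "es", "e", "s"};

    // Strip the first matching suffix, as long as at least 3 characters remain.
    public static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    public static void main(String[] args) {
        String[] words = {"compute", "computes", "computed", "computing"};
        for (String word : words) {
            System.out.println(word + " -> " + stem(word));
        }
    }
}
```

All four inputs come out as "comput", which isn't an English word, but since
the same stem is produced at index time and at query time the related forms
still match each other.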



-Hoss

