lucene-solr-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: NorwegianLightStemFilterFactory and protected words
Date Fri, 01 Mar 2013 14:49:40 GMT
Hi Remi,

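You need to use the *Factory class. schema.xml expects a filter factory (solr.KeywordMarkerFilterFactory), not the Lucene filter class itself.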

<filter class="solr.KeywordMarkerFilterFactory" protected="protectedkeywords.txt" ignoreCase="false"/>
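
For context, here is a minimal sketch of how the complete analyzer chain might look. The field type name "text_no", the tokenizer, and the lowercase filter are assumptions for illustration and are not taken from your schema; the file name is the one from your message. The important part is the order: the keyword marker has to come before the stemmer, so the protected words are already flagged when the stemmer sees them.

```xml
<!-- Sketch only: "text_no", the tokenizer and the lowercase filter are assumed, adapt to your schema -->
<fieldType name="text_no" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Flags every token listed in protectedkeywords.txt as a keyword -->
    <filter class="solr.KeywordMarkerFilterFactory" protected="protectedkeywords.txt" ignoreCase="false"/>
    <!-- The stemmer leaves tokens flagged as keywords untouched -->
    <filter class="solr.NorwegianLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```

With a lowercase filter placed before the marker, as assumed here, the entries in protectedkeywords.txt should be lowercase, one word per line.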

Ahmet

--- On Fri, 3/1/13, Remi Mikalsen <remi.mikalsen@iktsenteret.no> wrote:

> From: Remi Mikalsen <remi.mikalsen@iktsenteret.no>
> Subject: Re: NorwegianLightStemFilterFactory and protected words
> To: solr-user@lucene.apache.org
> Date: Friday, March 1, 2013, 4:38 PM
> Thanks for such a quick response!
> 
> I tried out the suggestion, but I'm struggling with
> actually making it work:
> 
> schema.xml:
>  <filter class="org.apache.lucene.analysis.KeywordMarkerFilter" protected="protectedkeywords.txt" ignoreCase="false"/>
> 
> Produces an instantiation error:
>  SEVERE: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.analysis.KeywordMarkerFilter'
>  ...
>  Caused by: java.lang.InstantiationException: org.apache.lucene.analysis.KeywordMarkerFilter
> 
> I'm running Solr 3.6.1, and went looking here for more info:
>  http://lucene.apache.org/solr/api-3_6_1/org/apache/solr/analysis/KeywordMarkerFilterFactory.html
> 
> The protectedkeywords.txt file has one line, is world-readable,
> is placed in the same dir as protwords.txt, and contains:
> lærer
> 
> Any ideas on what is wrong?
> 
> Regards,
> Remi Mikalsen
> 
> 
> ----- Original message -----
> > Hi Remi,
> > 
> > The filter does not support protwords but does support the
> > KeywordAttribute. Use the KeywordMarkerFilter to mark a list of
> > words and protect them from stemming.
> > 
> > http://lucene.apache.org/core/4_1_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/KeywordMarkerFilter.html
> > 
> > Cheers,
> > Markus
> > 
> >  
> >  
> > -----Original message-----
> > > From:Remi Mikalsen <remi.mikalsen@iktsenteret.no>
> > > Sent: Fri 01-Mar-2013 14:46
> > > To: solr-user@lucene.apache.org
> > > Subject: NorwegianLightStemFilterFactory and protected words
> > > 
> > > While the NorwegianLightStemFilterFactory generally works very
> > > well, I have come across a few words I'd very much like not to
> > > stem.
> > > 
> > > The following words:
> > >  - lærere (teachers)
> > >  - lærer (teacher)
> > >  - lære (teach)
> > > 
> > > all match :
> > >  - lær (leather)
> > > 
> > > I tried adding protected="protwords.txt" to my
> > > NorwegianLightStemFilterFactory filter, and adding the following
> > > words to my protwords.txt file:
> > >  - lærere
> > >  - lærer
> > >  - lære
> > > 
> > > It didn't work (I use the protwords.txt for other purposes and it
> > > works there). After looking around, it *seems* this particular
> > > FilterFactory doesn't support protwords the same way that, for
> > > example, SnowballPorterFilterFactory does.
> > > 
> > > I wonder if there is an alternative way to stop those words from
> > > being processed by the NorwegianLightStemFilterFactory?
> > > 
> > > 
> > > Regards,
> > > 
> > > --
> > > Remi Mikalsen
> > > Senter for IKT i utdanningen
> > > 
> > 
> 
