lucene-solr-user mailing list archives

From Remi Mikalsen <remi.mikal...@iktsenteret.no>
Subject Re: NorwegianLightStemFilterFactory and protected words
Date Fri, 01 Mar 2013 16:28:16 GMT
Worked! Thanks :)

R

----- Original message -----
> 
> Of course! My excuse is called Friday afternoon :) Will test when I'm
> in front of a computer :)
> 
> Thanks!
> 
> Remi
> 
> Sent from my HTC
> 
> ----- Reply message -----
> From: "Ahmet Arslan" <iorixxx@yahoo.com>
> To: <solr-user@lucene.apache.org>
> Subject: NorwegianLightStemFilterFactory and protected words
> Date: Fri, Mar 1, 2013 15:50
> 
> 
> 
> 
> Hi Remi,
> 
> You need to use the *Factory class in schema.xml, not the Lucene filter class itself.
> 
> <filter class="solr.KeywordMarkerFilterFactory"
> protected="protectedkeywords.txt" ignoreCase="false"/>
> 
> Ahmet
> 
> --- On Fri, 3/1/13, Remi Mikalsen <remi.mikalsen@iktsenteret.no>
> wrote:
> 
> > From: Remi Mikalsen <remi.mikalsen@iktsenteret.no>
> > Subject: Re: NorwegianLightStemFilterFactory and protected words
> > To: solr-user@lucene.apache.org
> > Date: Friday, March 1, 2013, 4:38 PM
> > Thanks for such a quick response!
> > 
> > I tried out the suggestion, but I'm struggling with
> > actually making it work:
> > 
> > schema.xml:
> >  <filter class="org.apache.lucene.analysis.KeywordMarkerFilter"
> >          protected="protectedkeywords.txt" ignoreCase="false"/>
> > 
> > Produces an instantiation error:
> >  SEVERE: org.apache.solr.common.SolrException: Error instantiating class:
> >  'org.apache.lucene.analysis.KeywordMarkerFilter'
> >  ...
> >  Caused by: java.lang.InstantiationException:
> >  org.apache.lucene.analysis.KeywordMarkerFilter
> > 
> > I'm running Solr 3.6.1, and went looking here for more
> > info:
> >  http://lucene.apache.org/solr/api-3_6_1/org/apache/solr/analysis/KeywordMarkerFilterFactory.html
> > 
> > The protectedkeywords.txt file has one line, is world-readable, is
> > placed in the same directory as protwords.txt, and contains:
> > lærer
> > 
> > Any ideas on what is wrong?
> > 
> > Regards,
> > Remi Mikalsen
> > 
> > 
> > ----- Original message -----
> > > Hi Remi,
> > > 
> > > The filter does not support protwords but does support the
> > > KeywordAttribute. Use the KeywordMarkerFilter to mark a list of
> > > words and protect them from stemming.
> > > 
> > > http://lucene.apache.org/core/4_1_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/KeywordMarkerFilter.html
> > > 
> > > Cheers,
> > > Markus
> > > 
> > >  
> > >  
> > > -----Original message-----
> > > > From:Remi Mikalsen <remi.mikalsen@iktsenteret.no>
> > > > Sent: Fri 01-Mar-2013 14:46
> > > > To: solr-user@lucene.apache.org
> > > > Subject: NorwegianLightStemFilterFactory and protected words
> > > > 
> > > > While the NorwegianLightStemFilterFactory generally works very
> > > > well, I have come across a few words I'd very much like not to
> > > > stem.
> > > > 
> > > > The following words:
> > > >  - lærere (teachers)
> > > >  - lærer (teacher)
> > > >  - lære (teach)
> > > > 
> > > > all match:
> > > >  - lær (leather)
> > > > 
> > > > I tried adding protected="protwords.txt" to my
> > > > NorwegianLightStemFilterFactory filter, and adding the following
> > > > words to my protwords.txt file:
> > > >  - lærere
> > > >  - lærer
> > > >  - lære
> > > > 
> > > > It didn't work (I use the protwords.txt for other purposes and it
> > > > works there). After looking around, it *seems* this particular
> > > > FilterFactory doesn't support protwords the same way that, for
> > > > example, SnowballPorterFilterFactory does.
> > > > 
> > > > I wonder if there is an alternative way to stop those words from
> > > > being processed by the NorwegianLightStemFilterFactory?
> > > > 
> > > > 
> > > > Regards,
> > > > 
> > > > --
> > > > Remi Mikalsen
> > > > Senter for IKT i utdanningen
> > > > 
> > > 
> >
> 
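
For readers landing on this thread, the fix can be pulled together as one analyzer chain. The following is a minimal sketch, not Remi's actual schema: the field type name text_no, the tokenizer, and the lowercase filter are assumptions made for illustration. Only the two filter classes, the protected file name, and ignoreCase="false" come from the messages above. The KeywordMarkerFilterFactory is placed before the stemmer so that the listed words are already marked as keywords by the time NorwegianLightStemFilterFactory sees them.

```xml
<!-- Hypothetical field type: the name, tokenizer, and lowercase filter are assumptions. -->
<fieldType name="text_no" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Marks every token listed in protectedkeywords.txt as a keyword. -->
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protectedkeywords.txt" ignoreCase="false"/>
    <!-- Per the thread, the stemmer leaves keyword-marked tokens untouched. -->
    <filter class="solr.NorwegianLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```

In this sketch protectedkeywords.txt would hold one word per line (for example lærere, lærer and lære). Because the assumed lowercase filter runs first and ignoreCase is "false", the entries should be lowercase. Documents indexed before the change would need to be reindexed for the new analysis to apply to them.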
