lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Preserve Original Option In Stemming (EnglishMinimalStemFilterFactory).
Date Tue, 25 Aug 2015 18:34:33 GMT
Hi,


> So the "usual" answer is either to use the KeywordRepeatFilterFactory, or
> use a copyField that doesn't stem and when exact matches are required,
> search on that field.

Or, even better, search on both fields (stemmed and unstemmed; I generally also have an
ASCII-folded one) with SHOULD clauses. An exact match gets a higher score (because it hits
both clauses, on the stemmed and the unstemmed field), while a stem-only match automatically
gets a lower score (because only one Boolean clause matches).
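
For anyone on Solr, a rough sketch of the two-field setup could look like the fragment below. The field and type names (body, body_exact, text_stemmed, text_exact) and the tokenizer and lowercase steps are placeholders I made up for illustration; adapt them to your own schema.

```xml
<!-- Hypothetical names throughout; only the stem filter comes from this thread. -->
<fieldType name="text_exact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- no stemmer: "methods" is indexed as "methods" -->
  </analyzer>
</fieldType>

<fieldType name="text_stemmed" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stemmed: "methods" is indexed as "method" -->
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
  </analyzer>
</fieldType>

<field name="body" type="text_stemmed" indexed="true" stored="true"/>
<field name="body_exact" type="text_exact" indexed="true" stored="false"/>

<!-- copyField passes the raw input, so body_exact gets its own analysis -->
<copyField source="body" dest="body_exact"/>
```

Assuming the default operator is OR, a query such as body:methods body_exact:methods then gives two optional (SHOULD) clauses: a document containing "methods" matches both, while one containing only "method" matches just the stemmed clause.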

Best,
Uwe
 
> Best,
> Erick
> 
> On Tue, Aug 25, 2015 at 5:05 AM, Modassar Ather
> <modather1981@gmail.com> wrote:
> >> Can anyone tell me why this option is not provided for Stemming.
> >
> > I am not sure about it, but the original token can also be preserved by
> > using <filter class="solr.KeywordRepeatFilterFactory"/>.
> > To avoid duplicate tokens in the document, <filter
> > class="solr.RemoveDuplicatesTokenFilterFactory"/> can be used at the
> > end of the analysis chain.
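> >
> > A minimal sketch of such a chain, for illustration only (the field type
> > name and the tokenizer and lowercase steps are assumptions, not taken
> > from this thread):
> >
> > ```xml
> > <!-- Hypothetical field type name; adapt to your schema. -->
> > <fieldType name="text_stem_keep_original" class="solr.TextField">
> >   <analyzer>
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <!-- emits each token twice: once marked as keyword, once not -->
> >     <filter class="solr.KeywordRepeatFilterFactory"/>
> >     <!-- stems only the copy that is not marked as keyword -->
> >     <filter class="solr.EnglishMinimalStemFilterFactory"/>
> >     <!-- drops the second copy when stem and original are identical -->
> >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> > ```
> >
> > The order matters here: KeywordRepeatFilterFactory has to come before
> > the stemmer, and RemoveDuplicatesTokenFilterFactory after it.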
> >
> > Hope this helps.
> >
> > Regards,
> > Modassar
> >
> > On Tue, Aug 25, 2015 at 2:12 PM, Vishnu Mishra <vdilipm@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> I was working with Lucene 5.2 and trying to index some documents. I am
> >> using EnglishMinimalStemFilterFactory and I found that there is no
> >> option for keeping the original text as well as the analyzed term in the
> >> Lucene index. WordDelimiterFilterFactory provides a preserveOriginal
> >> option to do this. Can anyone tell me why this option is not provided
> >> for stemming? For example, if I want to store both *Methods* and
> >> *Method* in my index, then I think there is no option available in
> >> Lucene to do this. I also noticed that if we place
> >> EnglishMinimalStemFilterFactory after WordDelimiterFilterFactory with
> >> the option preserveOriginal="1", then it stores both *Methods* and
> >> *Method*.
> >>
> >>
> >>
> >>
> >>
> >> --
> >> View this message in context:
> >> http://lucene.472066.n3.nabble.com/Preserve-Original-Option-In-Stemming-EnglishMinimalStemFilterFactory-tp4225116.html
> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> 