lucene-java-user mailing list archives

From Modassar Ather <modather1...@gmail.com>
Subject Re: Preserve Original Option In Stemming (EnglishMinimalStemFilterFactory).
Date Tue, 25 Aug 2015 12:05:13 GMT
> Can anyone tell me why this option is not provided for stemming?

I am not sure about it, but the original token can also be preserved by placing
<filter class="solr.KeywordRepeatFilterFactory"/> before the stemming filter.
To avoid duplicate tokens in the document, <filter
class="solr.RemoveDuplicatesTokenFilterFactory"/> can be added at the end of
the analysis chain.
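
As a rough sketch, the chain described above could look like the following in
a Solr schema. The fieldType name, the tokenizer, and the lowercase filter are
assumptions for illustration and are not taken from the original setup; only
the order of the last three filters matters here.

```xml
<!-- Sketch only: the fieldType name and the tokenizer/lowercase choices are assumptions. -->
<fieldType name="text_stem_keep_original" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits each token twice: one copy flagged as a keyword, one copy not. -->
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <!-- Stems only the copy that is not flagged as a keyword. -->
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <!-- Drops the second copy when stemming left the token unchanged. -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```

With a chain like this, *Methods* should be indexed as both "methods" and
"method" at the same position, while a token the stemmer leaves unchanged
should be indexed only once.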

Hope this helps.

Regards,
Modassar

On Tue, Aug 25, 2015 at 2:12 PM, Vishnu Mishra <vdilipm@gmail.com> wrote:

> Hi,
>
> I was working with Lucene 5.2 and trying to index some documents. I am using
> EnglishMinimalStemFilterFactory and found that it has no option for keeping
> the original text as well as the analyzed term in the Lucene index.
> WordDelimiterFilterFactory provides a preserveOriginal option to do this. Can
> anyone tell me why this option is not provided for stemming? For example, if I
> want to store both *Methods* and *Method* in my index, I think there is no
> option available in Lucene to do this. I also noticed that if we place
> EnglishMinimalStemFilterFactory after WordDelimiterFilterFactory with the
> option preserveOriginal="1", then it stores both *Methods* and *Method*.
