lucene-solr-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: Doing Shingle but also keep special single word
Date Sun, 22 Aug 2010 15:24:40 GMT
> Won't setting outputUnigrams="true" make the index
> about twice as large as when it's set to false?

Sure, the index will be bigger. I didn't know that was a problem for you. But if you have a list
of special single words that you want to keep, KeepWordFilter can eliminate the other tokens,
so the index size will be okay.
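
As a rough sketch of the chain I mean (the field type name text_shingle, the
whitespace tokenizer, and maxShingleSize="2" are assumptions for illustration,
not taken from your schema):

```xml
<!-- Hypothetical field type; adjust the name and tokenizer to your schema.xml -->
<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Emit 2-word shingles and, with outputUnigrams="true", the single words too -->
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true"/>
    <!-- Keep only tokens listed in keepwords.txt (one entry per line, e.g. IBM, then Microsoft) -->
    <filter class="solr.KeepWordFilterFactory" words="keepwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

Keep in mind that KeepWordFilter discards every token that is not listed in
keepwords.txt, shingles included, so check on the analysis page of the Solr
admin UI which tokens actually survive before reindexing.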

> 
> Scott
> 
> ----- Original Message ----- From: "Ahmet Arslan" <iorixxx@yahoo.com>
> To: <solr-user@lucene.apache.org>
> Sent: Saturday, August 21, 2010 1:15 AM
> Subject: Re: Doing Shingle but also keep special single word
> 
> 
> >> I am building an index with the Shingle filter. We know its
> >> minimum is 2-grams, but I also want to keep some special
> >> single words, e.g. IBM, Microsoft, etc. That is, I want a
> >> minimum of 2-grams but also want these single words in my
> >> index. Is that possible?
> > 
> > Doesn't the outputUnigrams="true" parameter work for you?
> > 
> > After that you can add <filter class="solr.KeepWordFilterFactory"
> > words="keepwords.txt" ignoreCase="true"/> with keepwords.txt
> > listing IBM and Microsoft.