lucene-solr-user mailing list archives

From Steinar Asbjørnsen <steinar...@gmail.com>
Subject Re: Problem with words thats amost similar
Date Fri, 18 Dec 2009 11:29:07 GMT
On 17 Dec 2009, at 13:48, Shalin Shekhar Mangar wrote:

> 2009/12/17 Steinar Asbjørnsen <steinarasb@gmail.com>
> 
>> On 17 Dec 2009, at 12:42, Shalin Shekhar Mangar wrote:
>> 
>>>> 
>>>> 
>>> For specific cases like this, you can add the word to a file and specify it
>>> in schema, for example:
>>> 
>>> <filter class="solr.SnowballPorterFilterFactory" language="English"
>>> protected="protwords.txt"/>
>> 
>> Thank you, Shalin.
>> 
>> This is my schema.xml file:
>> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>>     <analyzer type="index">
>>       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>       <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt" enablePositionIncrements="true"/>
>>       <filter class="solr.WordDelimiterFilterFactory"
>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>>       <filter class="solr.LowerCaseFilterFactory"/>
>>       <filter class="solr.EnglishPorterFilterFactory"
>> protected="protwords.txt"/>
>>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>     </analyzer>
>>     <analyzer type="query">
>>       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>       <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> ignoreCase="true" expand="true"/>
>>       <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt"/>
>>       <filter class="solr.WordDelimiterFilterFactory"
>> generateWordParts="1" generateNumberParts="1" catenateWords="0"
>> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>>       <filter class="solr.LowerCaseFilterFactory"/>
>>       <filter class="solr.EnglishPorterFilterFactory"
>> protected="protwords.txt"/>
>>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>     </analyzer>
>>   </fieldType>
>> 
>> I added restaurant and restaurering to protwords.txt, restarted Tomcat, but
>> no dice.
>> Do I need to use the SnowballPorterFilterFactory?
>> And do I need to reindex the documents?
>> 
>> 
> Actually EnglishPorterFilterFactory is the same as
> SnowballPorterFilterFactory with language="English". Both will work. You
> will need to re-index the documents.

What I've done so far is add both restaurant and restaurering to protwords.txt.
I've also re-fed a single document (containing the keyword "restaurering") to check that it no longer
appears in the search results for "restaurant".
Do I have to re-feed every document in the index?
Or do I need to restart so that Solr re-reads protwords.txt? (This is a different installation (prod)
than the one I restarted earlier (dev).)

Steinar
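
The point quoted above, that the documents need to be re-indexed, can be illustrated with a minimal sketch. This is a toy model, not Solr code: `toy_stem`, `analyze`, `add_document` and `search` are hypothetical helpers, and the "stemmer" simply truncates each token to seven characters so that "restaurant" and "restaurering" collide on "restaur". It assumes the same protected-word set is applied at index time and at query time, as in the schema.xml shown earlier.

```python
# Toy model of an inverted index (an assumption for illustration, not Solr's implementation).

def toy_stem(token, protected):
    """Return protected tokens unchanged; otherwise apply a crude stand-in stem."""
    if token in protected:
        return token
    return token[:7]  # hypothetical rule: "restaurant"/"restaurering" -> "restaur"

def analyze(text, protected):
    """Lowercase, split on whitespace, and stem each token."""
    return {toy_stem(t, protected) for t in text.lower().split()}

def add_document(index, doc_id, text, protected):
    """(Re-)feed a document: drop its old terms, then store freshly analyzed ones."""
    for postings in index.values():
        postings.discard(doc_id)
    for term in analyze(text, protected):
        index.setdefault(term, set()).add(doc_id)

def search(index, query, protected):
    """Analyze the query the same way and collect matching document ids."""
    hits = set()
    for term in analyze(query, protected):
        hits |= index.get(term, set())
    return hits

index = {}
# Both documents are indexed before the protected-word list is edited.
add_document(index, 1, "restaurering av hus", set())
add_document(index, 2, "restaurering av kirke", set())

# The list is edited, and only document 1 is re-fed.
protected = {"restaurant", "restaurering"}
add_document(index, 1, "restaurering av hus", protected)

print(sorted(search(index, "restaurering", protected)))  # [1]

# Document 2 matches again only after it has been re-fed as well.
add_document(index, 2, "restaurering av kirke", protected)
print(sorted(search(index, "restaurering", protected)))  # [1, 2]
```

In this model, document 2 is still stored under the old term "restaur" until it is re-fed, so a query analyzed with the new protected list no longer finds it. The terms written at index time do not change when the word list changes.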