lucene-solr-user mailing list archives

From Wei <weiwan...@gmail.com>
Subject Re: Questions for SynonymGraphFilter and WordDelimiterGraphFilter
Date Tue, 08 Jan 2019 21:56:31 GMT
bump..

On Mon, Jan 7, 2019 at 11:53 AM Wei <weiwang19@gmail.com> wrote:

> Thanks Thomas. You mentioned "Also there is no need for the
> FlattenGraphFilter", that's quite interesting because the Solr
> documentation says it's mandatory for indexing:
> https://lucene.apache.org/solr/guide/7_6/filter-descriptions.html. Is
> there any more explanation for this?
>
> Best regards,
> Wei
>
>
> On Mon, Jan 7, 2019 at 7:56 AM Thomas Aglassinger <
> t.aglassinger@netconomy.net> wrote:
>
>> Hi Wei,
>>
>> here's a fairly simple field type we currently use in a project that
>> seems to do the job with graph synonyms. Maybe this helps as a starting
>> point for you:
>>
>>         <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
>>             <analyzer>
>>                 <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>                 <filter class="solr.ManagedSynonymGraphFilterFactory" managed="de" />
>>                 <filter class="solr.ManagedStopFilterFactory" managed="de" />
>>                 <filter class="solr.WordDelimiterGraphFilterFactory"
>>                         preserveOriginal="1" generateWordParts="1" generateNumberParts="1"
>>                         catenateWords="1" catenateNumbers="1" catenateAll="0"
>>                         splitOnCaseChange="1" />
>>                 <filter class="solr.LowerCaseFilterFactory" />
>>                 <filter class="solr.ASCIIFoldingFilterFactory" />
>>                 <filter class="solr.GermanStemFilterFactory" />
>>                 <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>             </analyzer>
>>         </fieldType>
>>
>> As you can see we use the same filters for both indexing and query, so
>> this might have some impact on positional queries but so far it seems
>> negligible for the short synonyms we use in practice. Also there is no need
>> for the FlattenGraphFilter.
>>
>> The WhitespaceTokenizerFactory ensures that you can define synonyms with
>> hyphens like mac-book -> macbook.
>>
>> Best regards, Thomas.
>>
>>
>> On 05.01.19, 02:11, "Wei" <weiwang19@gmail.com> wrote:
>>
>>     Hello,
>>
>>     We are upgrading to Solr 7.6.0 and noticed that SynonymFilter and
>>     WordDelimiterFilter have been deprecated. The Solr documentation
>>     recommends using SynonymGraphFilter and WordDelimiterGraphFilter instead.
>>     I guess the StopFilter messes up the SynonymGraphFilter output? Not sure
>>     if it's a Solr defect or whether there is a guideline that StopFilter
>>     should not be put after graph filters.
>>
>>     Thanks in advance for your input.
>>
>>
>>     Thanks,
>>
>>     Wei
>>
>>
>>
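
For comparison with the single-analyzer field type quoted above, here is a rough sketch of the arrangement the Solr guide page linked in the thread describes, where the index-time analyzer flattens the token graph and the query-time analyzer does not. The field type name "text_de_split" is made up for illustration, the filters and attributes are copied from Thomas's example, and placing FlattenGraphFilterFactory directly after the last graph filter is an assumption, not something confirmed in this thread:

```xml
<!-- Sketch only: "text_de_split" is a hypothetical name; filters are copied from the text_de example. -->
<fieldType name="text_de_split" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.ManagedSynonymGraphFilterFactory" managed="de" />
    <filter class="solr.ManagedStopFilterFactory" managed="de" />
    <filter class="solr.WordDelimiterGraphFilterFactory"
            preserveOriginal="1" generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1" />
    <!-- Index time only: flatten the token graph after the graph filters (assumed position). -->
    <filter class="solr.FlattenGraphFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.ASCIIFoldingFilterFactory" />
    <filter class="solr.GermanStemFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <!-- Query time: same chain without FlattenGraphFilterFactory, so the graph stays intact. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.ManagedSynonymGraphFilterFactory" managed="de" />
    <filter class="solr.ManagedStopFilterFactory" managed="de" />
    <filter class="solr.WordDelimiterGraphFilterFactory"
            preserveOriginal="1" generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.ASCIIFoldingFilterFactory" />
    <filter class="solr.GermanStemFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
</fieldType>
```

Whether this behaves better than the shared analyzer for a given synonym list would need to be checked in the Solr admin Analysis screen; the thread itself does not settle that.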
