lucene-solr-user mailing list archives

From Thomas Aglassinger <>
Subject Re: Questions for SynonymGraphFilter and WordDelimiterGraphFilter
Date Mon, 07 Jan 2019 15:56:18 GMT
Hi Wei,

here's a fairly simple field type we currently use in a project that seems to do the job with
graph synonyms. Maybe this helps as a starting point for you:

        <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
            <!-- single analyzer, used at both index and query time -->
            <analyzer>
                <tokenizer class="solr.WhitespaceTokenizerFactory" />
                <filter class="solr.ManagedSynonymGraphFilterFactory" managed="de" />
                <filter class="solr.ManagedStopFilterFactory" managed="de" />
                <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="1"
                        generateWordParts="1" generateNumberParts="1" catenateWords="1"
                        catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
                <filter class="solr.LowerCaseFilterFactory" />
                <filter class="solr.ASCIIFoldingFilterFactory" />
                <filter class="solr.GermanStemFilterFactory" />
                <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
            </analyzer>
        </fieldType>

As you can see, we use the same filters for both indexing and querying. This might have some
impact on positional queries, but so far it seems negligible for the short synonyms we use
in practice. Also, there has been no need for the FlattenGraphFilter in this setup.
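
For comparison, the Solr Reference Guide says that graph filters used at index time should
be followed by a FlattenGraphFilter, because the indexer cannot consume a token graph
directly. A rough, untested sketch of that variant with separate index and query analyzers
might look like this. The field type name text_de_flat is made up for illustration, and the
filters are simply the ones from the field type above:

```xml
<!-- Sketch only, not a tested configuration. "text_de_flat" is a hypothetical name. -->
<fieldType name="text_de_flat" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.ManagedSynonymGraphFilterFactory" managed="de" />
        <filter class="solr.ManagedStopFilterFactory" managed="de" />
        <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="1"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
        <!-- index side only: flatten the token graph after the graph filters -->
        <filter class="solr.FlattenGraphFilterFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.ASCIIFoldingFilterFactory" />
        <filter class="solr.GermanStemFilterFactory" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
    </analyzer>
    <analyzer type="query">
        <!-- query side: same chain, but without FlattenGraphFilterFactory -->
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.ManagedSynonymGraphFilterFactory" managed="de" />
        <filter class="solr.ManagedStopFilterFactory" managed="de" />
        <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="1"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.ASCIIFoldingFilterFactory" />
        <filter class="solr.GermanStemFilterFactory" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
    </analyzer>
</fieldType>
```

Whether the extra analyzer is worth it probably depends on how much you rely on phrase
queries over multi-word synonyms.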

Because it splits only on whitespace, the WhitespaceTokenizerFactory keeps hyphenated terms
intact, so you can define synonyms with hyphens like mac-book -> macbook.

Best regards, Thomas.

On 05.01.19, 02:11, "Wei" <> wrote:

    We are upgrading to Solr 7.6.0 and noticed that SynonymFilter and
    WordDelimiterFilter have been deprecated. The Solr docs recommend using
    SynonymGraphFilter and WordDelimiterGraphFilter instead.
    I guess the StopFilter messes up the SynonymGraphFilter output? Not sure
    if it's a Solr defect or there is a guideline that StopFilter should
    not be put after graph filters.
    Thanks in advance for your input.
