lucene-solr-user mailing list archives

From Furkan KAMACI <furkankam...@gmail.com>
Subject Re: synonyms and term position
Date Wed, 09 Oct 2013 08:17:10 GMT
Could you send a screenshot of the admin Analysis page when analyzing
those words?


2013/10/9 Alvaro Cabrerizo <toporniz@gmail.com>

> Hi:
>
> I'm in the process of upgrading Solr from 1.4 to 4.4 and I'm having a
> problem using SynonymFilterFactory within the filter chain
> SynonymFilterFactory, StopFilterFactory.
>
> I have configured synonyms.txt to expand the word AIO to all-in-one. When
> using Solr 1.4 I get the following result (term positions) when
> analysing the string "one aio two".
>
> Solr 1.4 after synonym:
>
> term position | 1   | 2   | 3  | 4   | 5
> term text     | one | all | in | one | two
>
> Solr 1.4 after stopfilter (the "in" term is deleted and the terms "all" and
> "one" are consecutive):
>
> term position | 1   | 2   | 4   | 5
> term text     | one | all | one | two
>
>
>
> But when using Solr 4.4 I get:
>
> Solr 4.4 after synonym:
>
> term position | 1   | 2   | 3  | 4   | 3
> term text     | one | all | in | one | two
>
> Solr 4.4 after stopfilter ("in" is deleted and the term "two" is now next to
> "all"):
>
> term position | 1   | 2   | 4   | 3
> term text     | one | all | one | two
>
>
>
> The problem is that the word "two" is at position 3 in Solr 4.4, so
> when I search for aio I get results in Solr 1.4 but find nothing using
> Solr 4.4. Is there any option to configure Solr 4 so that it imitates the
> Solr 1.4 behavior?
>
>
> Regards.
>
>
>
>
> Please find the fieldType configuration attached.
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100"
> autoGeneratePhraseQueries="true">
> <analyzer type="index">
> <tokenizer class="solr.WhitespaceTokenizerFactory" />
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true" />
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true" />
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
> <filter class="solr.LowerCaseFilterFactory" />
> <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt" />
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.WhitespaceTokenizerFactory" />
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true" />
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true" />
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" />
> <filter class="solr.LowerCaseFilterFactory" />
> <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt" />
> </analyzer>
> </fieldType>
>
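
To make the reported positions easier to compare, here is a minimal sketch
that replays the two "after synonym" tables above and drops the stop word
while keeping each surviving term's original position, which is the effect
the tables show for the stop filter. It is not Solr or Lucene code: the
names drop_stopwords and terms_at are hypothetical, and the stop list
containing only "in" is an assumption taken from the example.

```python
# Illustrative only: models the (position, term) tables from the message,
# not the actual Solr/Lucene filter implementations.

def drop_stopwords(tokens, stopwords):
    """Remove stop words; surviving tokens keep their original positions."""
    return [(pos, term) for pos, term in tokens if term.lower() not in stopwords]

def terms_at(tokens, position):
    """Return all terms reported at a given position."""
    return [term for pos, term in tokens if pos == position]

# "After synonym" output as reported for each version.
solr14 = [(1, "one"), (2, "all"), (3, "in"), (4, "one"), (5, "two")]
solr44 = [(1, "one"), (2, "all"), (3, "in"), (4, "one"), (3, "two")]

stopwords = {"in"}  # assumption: only "in" is relevant for this example

after14 = drop_stopwords(solr14, stopwords)
after44 = drop_stopwords(solr44, stopwords)

print(after14)
print(after44)
print(terms_at(after14, 3))  # position left empty by the removed stop word
print(terms_at(after44, 3))  # position now occupied by "two"
```

With the Solr 1.4 positions, position 3 stays empty after "in" is removed;
with the Solr 4.4 positions, "two" sits at position 3, directly after "all".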
