lucene-solr-user mailing list archives

From Furkan KAMACI <furkankam...@gmail.com>
Subject Re: synonyms and term position
Date Wed, 09 Oct 2013 08:48:57 GMT
Does "two" have "in" and "one" as synonyms?


2013/10/9 Furkan KAMACI <furkankamaci@gmail.com>

> Does "two" have "in" and "one" as synonyms?
>
>
> 2013/10/9 Alvaro Cabrerizo <toporniz@gmail.com>
>
>> Sure,
>>
>> Find attached the screenshots with almost all of the analysis (don't worry
>> about the lowercase filter and the Porter stemmer).
>>
>> Regards.
>>
>>
>>
>>
>> On Wed, Oct 9, 2013 at 10:17 AM, Furkan KAMACI <furkankamaci@gmail.com> wrote:
>>
>>> Could you send a screenshot of the admin Analysis page when you analyze
>>> those words?
>>>
>>>
>>> 2013/10/9 Alvaro Cabrerizo <toporniz@gmail.com>
>>>
>>> > Hi:
>>> >
>>> > I'm in the process of upgrading Solr from 1.4 to 4.4 and I'm having a
>>> > problem using SynonymFilterFactory in the analysis chain
>>> > SynonymFilterFactory, StopFilterFactory.
>>> >
>>> > I have configured synonyms.txt to expand the word AIO as: all-in-one.
>>> > With Solr 1.4 I get the following result (term positions) when
>>> > analysing the string "one aio two".
>>> >
>>> > Solr 1.4 after synonym:
>>> >
>>> > term position | 1   | 2   | 3  | 4   | 5
>>> > term text     | one | all | in | one | two
>>> >
>>> > Solr 1.4 after the stop filter (the term "in" is deleted, and the terms
>>> > "all" and "one" are now consecutive tokens):
>>> >
>>> > term position | 1   | 2   | 4   | 5
>>> > term text     | one | all | one | two
>>> >
>>> >
>>> >
>>> > But when using solr4.4 I get:
>>> >
>>> > Solr 4.4 after synonym:
>>> >
>>> > term position | 1   | 2   | 3  | 4   | 3
>>> > term text     | one | all | in | one | two
>>> >
>>> > Solr 4.4 after the stop filter ("in" is deleted and the term "two" is
>>> > now next to "all"):
>>> >
>>> > term position | 1   | 2   | 4   | 3
>>> > term text     | one | all | one | two
>>> >
>>> >
>>> >
>>> > The problem is that the word "two" is at position 3 in Solr 4.4, so
>>> > when I search for aio I get results in Solr 1.4 but find nothing
>>> > in Solr 4.4. Is there any option to configure Solr 4 so that it
>>> > imitates the Solr 1.4 behavior?
>>> >
>>> >
>>> > Regards.
>>> >
>>> >
>>> >
>>> >
>>> > Please find the fieldType configuration attached.
>>> >
>>> > <fieldType name="text" class="solr.TextField"
>>> > positionIncrementGap="100"
>>> > autoGeneratePhraseQueries="true">
>>> > <analyzer type="index">
>>> > <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>> > <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>> > ignoreCase="true" expand="true" />
>>> > <filter class="solr.StopFilterFactory" ignoreCase="true"
>>> > words="stopwords.txt" enablePositionIncrements="true" />
>>> > <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>>> > generateNumberParts="1" catenateWords="1"
>>> > catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
>>> > <filter class="solr.LowerCaseFilterFactory" />
>>> > <filter class="solr.SnowballPorterFilterFactory" language="English"
>>> > protected="protwords.txt" />
>>> > </analyzer>
>>> > <analyzer type="query">
>>> > <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>> > <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>> > ignoreCase="true" expand="true" />
>>> > <filter class="solr.StopFilterFactory" ignoreCase="true"
>>> > words="stopwords.txt" enablePositionIncrements="true" />
>>> > <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>>> > generateNumberParts="1" catenateWords="0"
>>> > catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" />
>>> > <filter class="solr.LowerCaseFilterFactory" />
>>> > <filter class="solr.SnowballPorterFilterFactory" language="English"
>>> > protected="protwords.txt" />
>>> > </analyzer>
>>> > </fieldType>
>>> >
>>>
>>
>>
>
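
The position tables quoted above can be modelled with a few lines of Python. This is only a sketch, not Solr or Lucene code: the token positions are copied from the tables in the quoted message, the helper names (drop_stopwords, positions_of, adjacent) are made up for illustration, and it assumes that "in" is the only stop word involved.

```python
# Token positions after the synonym filter, copied from the quoted tables.
SOLR_14 = [("one", 1), ("all", 2), ("in", 3), ("one", 4), ("two", 5)]
SOLR_44 = [("one", 1), ("all", 2), ("in", 3), ("one", 4), ("two", 3)]

# Assumption: "in" is listed in stopwords.txt, as the tables imply.
STOPWORDS = {"in"}


def drop_stopwords(tokens, stopwords):
    """Remove stop words but leave every surviving token at its position."""
    return [(term, pos) for term, pos in tokens if term not in stopwords]


def positions_of(tokens, term):
    """Return the sorted positions at which a term occurs."""
    return sorted(pos for t, pos in tokens if t == term)


def adjacent(tokens, first, second):
    """True if some occurrence of `second` sits one position after `first`."""
    second_positions = set(positions_of(tokens, second))
    return any(p + 1 in second_positions for p in positions_of(tokens, first))


after_stop_14 = drop_stopwords(SOLR_14, STOPWORDS)
after_stop_44 = drop_stopwords(SOLR_44, STOPWORDS)

print(after_stop_14)  # [('one', 1), ('all', 2), ('one', 4), ('two', 5)]
print(after_stop_44)  # [('one', 1), ('all', 2), ('one', 4), ('two', 3)]

# Is "two" directly after "all"?
print(adjacent(after_stop_14, "all", "two"))  # False
print(adjacent(after_stop_44, "all", "two"))  # True
```

With the reported 4.4 positions, "two" lands one position after "all", which is the difference the original poster describes. The sketch does not explain why the query stops matching; it only makes the reported positions easier to compare.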
