lucene-solr-user mailing list archives

From Alvaro Cabrerizo <topor...@gmail.com>
Subject Re: synonyms and term position
Date Wed, 09 Oct 2013 09:05:07 GMT
No, it has no synonyms.


On Wed, Oct 9, 2013 at 10:48 AM, Furkan KAMACI <furkankamaci@gmail.com> wrote:

> Does "two" have "in" and "one" as synonyms?
>
>
> 2013/10/9 Furkan KAMACI <furkankamaci@gmail.com>
>
>> Does "two" have "in" and "one" as synonyms?
>>
>>
>> 2013/10/9 Alvaro Cabrerizo <toporniz@gmail.com>
>>
>>> Sure,
>>>
>>> Please find attached the screenshots with almost all of the analysis
>>> (don't worry about the lowercase filter and the Porter stemmer).
>>>
>>> Regards.
>>>
>>>
>>>
>>>
>>> On Wed, Oct 9, 2013 at 10:17 AM, Furkan KAMACI <furkankamaci@gmail.com> wrote:
>>>
>>>> Could you send a screenshot of the admin Analysis page when analyzing
>>>> those words?
>>>>
>>>>
>>>> 2013/10/9 Alvaro Cabrerizo <toporniz@gmail.com>
>>>>
>>>> > Hi:
>>>> >
>>>> > I'm in the process of upgrading Solr from 1.4 to 4.4, and I'm having a
>>>> > problem using SynonymFilterFactory in the filter chain
>>>> > SynonymFilterFactory, StopFilterFactory.
>>>> >
>>>> > I have configured synonyms.txt to expand the word AIO as: all-in-one.
>>>> > When using Solr 1.4, I get the following result (term positions) when
>>>> > analysing the string "one aio two".
>>>> >
>>>> > Solr 1.4 after synonym:
>>>> >
>>>> > term position | 1   | 2   | 3  | 4   | 5
>>>> > term text     | one | all | in | one | two
>>>> >
>>>> > Solr 1.4 after the stop filter (the term "in" is deleted, and the terms
>>>> > "all" and "one" are consecutive):
>>>> >
>>>> > term position | 1   | 2   | 4   | 5
>>>> > term text     | one | all | one | two
>>>> >
>>>> >
>>>> >
>>>> > But when using Solr 4.4 I get:
>>>> >
>>>> > Solr 4.4 after synonym:
>>>> >
>>>> > term position | 1   | 2   | 3  | 4   | 3
>>>> > term text     | one | all | in | one | two
>>>> >
>>>> > Solr 4.4 after the stop filter ("in" is deleted, and the term "two" is
>>>> > now close to "all"):
>>>> >
>>>> > term position | 1   | 2   | 4   | 3
>>>> > term text     | one | all | one | two
>>>> >
>>>> >
>>>> >
>>>> > The problem is that the word "two" ends up in position 3 in Solr 4.4,
>>>> > so when I search for "aio" I get results in Solr 1.4 but find nothing
>>>> > in Solr 4.4. Is there any option to configure Solr 4 so that it
>>>> > imitates the Solr 1.4 behavior?
>>>> >
>>>> >
>>>> > Regards.
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > Please find attached the fieldType configuration.
>>>> >
>>>> > <fieldType name="text" class="solr.TextField"
>>>> >            positionIncrementGap="100"
>>>> >            autoGeneratePhraseQueries="true">
>>>> >   <analyzer type="index">
>>>> >     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>>> >             ignoreCase="true" expand="true" />
>>>> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>> >             words="stopwords.txt" enablePositionIncrements="true" />
>>>> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>>>> >             generateNumberParts="1" catenateWords="1"
>>>> >             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
>>>> >     <filter class="solr.LowerCaseFilterFactory" />
>>>> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
>>>> >             protected="protwords.txt" />
>>>> >   </analyzer>
>>>> >   <analyzer type="query">
>>>> >     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>>> >             ignoreCase="true" expand="true" />
>>>> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>> >             words="stopwords.txt" enablePositionIncrements="true" />
>>>> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>>>> >             generateNumberParts="1" catenateWords="0"
>>>> >             catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" />
>>>> >     <filter class="solr.LowerCaseFilterFactory" />
>>>> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
>>>> >             protected="protwords.txt" />
>>>> >   </analyzer>
>>>> > </fieldType>
>>>> >
>>>>
>>>
>>>
>>
>
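
The position arithmetic described in the quoted message can be sketched in Python. This is a minimal illustration, not Solr code: the helper names `absolute_positions` and `stop_filter` are hypothetical, the token streams are reconstructed from the tables in the thread, and the emission order assumed for Solr 4.4 (with "two" arriving at increment 0 right after "in") is an assumption made to reproduce the reported positions.

```python
# Sketch of how position increments become absolute positions, and how a
# position-preserving stop filter shifts them. The token streams below are
# assumptions reconstructed from the tables in the thread, not Solr output.

def absolute_positions(tokens):
    """Turn (text, position_increment) pairs into (text, position) pairs."""
    position = 0
    result = []
    for text, increment in tokens:
        position += increment
        result.append((text, position))
    return result


def stop_filter(tokens, stopwords):
    """Drop stopwords, adding each dropped increment to the next kept token."""
    pending = 0
    kept = []
    for text, increment in tokens:
        if text in stopwords:
            pending += increment
        else:
            kept.append((text, increment + pending))
            pending = 0
    return kept


STOPWORDS = {"in"}

# Solr 1.4 style: the expansion is laid out linearly, one position per term.
solr14 = [("one", 1), ("all", 1), ("in", 1), ("one", 1), ("two", 1)]

# Solr 4.4 style (assumed emission order): "two" follows the original
# single-token "aio", so it shares position 3 with "in".
solr44 = [("one", 1), ("all", 1), ("in", 1), ("two", 0), ("one", 1)]

after14 = absolute_positions(stop_filter(solr14, STOPWORDS))
after44 = absolute_positions(stop_filter(solr44, STOPWORDS))
print(after14)
print(after44)
```

Under these assumptions the Solr 1.4 stream ends with "two" at position 5, while the Solr 4.4 stream leaves "two" at position 3, ahead of the second "one" at position 4, which matches the tables reported above.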
