lucene-solr-user mailing list archives

From Alvaro Cabrerizo <topor...@gmail.com>
Subject Re: synonyms and term position
Date Wed, 09 Oct 2013 09:12:29 GMT
The synonyms.txt file defines the following associations:

AIO=>All in one
aio=>all-in-one
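
To make the positions quoted further down in this thread easier to follow, here is a minimal Python sketch (not Solr or Lucene code) of how position increments become the positions shown on the Analysis page, and how a stop filter that preserves position increments leaves a gap. The helper names absolute_positions and drop_stopwords are hypothetical, the stopword set is assumed to contain only "in", and the Solr 4.4 token order (with "two" stacked at the same position as "in") is an assumption chosen only because it reproduces the reported table.

```python
from typing import Iterable, List, Set, Tuple

Token = Tuple[str, int]  # (term text, position increment)


def absolute_positions(tokens: Iterable[Token]) -> List[Tuple[str, int]]:
    """Accumulate position increments into absolute term positions."""
    out = []
    pos = 0
    for term, inc in tokens:
        pos += inc
        out.append((term, pos))
    return out


def drop_stopwords(tokens: Iterable[Token], stopwords: Set[str]) -> List[Token]:
    """Remove stopwords and carry each removed token's increment over to the
    next surviving token, so a gap remains where the stopword was."""
    out = []
    pending = 0
    for term, inc in tokens:
        if term in stopwords:
            pending += inc
        else:
            out.append((term, inc + pending))
            pending = 0
    return out


STOPWORDS = {"in"}  # assumption: the only relevant entry in stopwords.txt

# Solr 1.4 as reported: the expansion pushes "two" to position 5.
solr14 = [("one", 1), ("all", 1), ("in", 1), ("one", 1), ("two", 1)]

# Solr 4.4, ASSUMED stream order: "two" shares position 3 with "in".
solr44 = [("one", 1), ("all", 1), ("in", 1), ("two", 0), ("one", 1)]

if __name__ == "__main__":
    for name, stream in (("Solr 1.4", solr14), ("Solr 4.4", solr44)):
        print(name, "after synonym:", absolute_positions(stream))
        print(name, "after stop:   ",
              absolute_positions(drop_stopwords(stream, STOPWORDS)))
```

Under these assumptions the sketch gives "two" position 5 for the 1.4 stream and position 3 for the 4.4 stream, which is the difference described in the original message.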

Regards.


On Wed, Oct 9, 2013 at 11:05 AM, Alvaro Cabrerizo <toporniz@gmail.com> wrote:

> No, it has no synonyms.
>
>
> On Wed, Oct 9, 2013 at 10:48 AM, Furkan KAMACI <furkankamaci@gmail.com> wrote:
>
>> Does "two" have "in" and "one" as synonyms?
>>
>>
>> 2013/10/9 Furkan KAMACI <furkankamaci@gmail.com>
>>
>>> Does "two" have "in" and "one" as synonyms?
>>>
>>>
>>> 2013/10/9 Alvaro Cabrerizo <toporniz@gmail.com>
>>>
>>>> Sure,
>>>>
>>>> Find attached the screenshots with almost all of the analysis (don't
>>>> worry about the lowercase filter and the Porter stemmer).
>>>>
>>>> Regards.
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Oct 9, 2013 at 10:17 AM, Furkan KAMACI <furkankamaci@gmail.com> wrote:
>>>>
>>>>> Could you send a screenshot of the admin Analysis page when analyzing
>>>>> those words?
>>>>>
>>>>>
>>>>> 2013/10/9 Alvaro Cabrerizo <toporniz@gmail.com>
>>>>>
>>>>> > Hi:
>>>>> >
>>>>> > I'm in the process of upgrading Solr from 1.4 to 4.4 and I'm having a
>>>>> > problem using SynonymFilterFactory within the filter chain
>>>>> > SynonymFilterFactory, StopFilterFactory.
>>>>> >
>>>>> > I have configured synonyms.txt to expand the word AIO as
>>>>> > all-in-one. When using Solr 1.4, I get the following result (term
>>>>> > positions) when analysing the string "one aio two".
>>>>> >
>>>>> > Solr 1.4 after synonym:
>>>>> >
>>>>> > term position | 1   | 2   | 3  | 4   | 5
>>>>> > term text     | one | all | in | one | two
>>>>> >
>>>>> > Solr 1.4 after the stop filter (the "in" term is deleted and the terms
>>>>> > "all" and "one" are consecutive):
>>>>> >
>>>>> > term position | 1   | 2   | 4   | 5
>>>>> > term text     | one | all | one | two
>>>>> >
>>>>> >
>>>>> >
>>>>> > But when using Solr 4.4 I get:
>>>>> >
>>>>> > Solr 4.4 after synonym:
>>>>> >
>>>>> > term position | 1   | 2   | 3  | 4   | 3
>>>>> > term text     | one | all | in | one | two
>>>>> >
>>>>> > Solr 4.4 after the stop filter ("in" is deleted and the term "two" is
>>>>> > now close to "all"):
>>>>> >
>>>>> > term position | 1   | 2   | 4   | 3
>>>>> > term text     | one | all | one | two
>>>>> >
>>>>> >
>>>>> >
>>>>> > The problem is that the word "two" is in position 3 in Solr 4.4, so
>>>>> > when I search for aio I get results in Solr 1.4 but find nothing in
>>>>> > Solr 4. Is there any option to configure Solr 4 so that it imitates the
>>>>> > Solr 1.4 behavior?
>>>>> >
>>>>> >
>>>>> > Regards.
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > Please find the fieldType configuration below.
>>>>> >
>>>>> > <fieldType name="text" class="solr.TextField" positionIncrementGap="100"
>>>>> >            autoGeneratePhraseQueries="true">
>>>>> >   <analyzer type="index">
>>>>> >     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>>>> >             ignoreCase="true" expand="true" />
>>>>> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>> >             words="stopwords.txt" enablePositionIncrements="true" />
>>>>> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>>>>> >             generateNumberParts="1" catenateWords="1"
>>>>> >             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
>>>>> >     <filter class="solr.LowerCaseFilterFactory" />
>>>>> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
>>>>> >             protected="protwords.txt" />
>>>>> >   </analyzer>
>>>>> >   <analyzer type="query">
>>>>> >     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>>>> >             ignoreCase="true" expand="true" />
>>>>> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>> >             words="stopwords.txt" enablePositionIncrements="true" />
>>>>> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>>>>> >             generateNumberParts="1" catenateWords="0"
>>>>> >             catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" />
>>>>> >     <filter class="solr.LowerCaseFilterFactory" />
>>>>> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
>>>>> >             protected="protwords.txt" />
>>>>> >   </analyzer>
>>>>> > </fieldType>
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>
