lucene-solr-user mailing list archives

From darniz <rnizamud...@edmunds.com>
Subject Re: Question regarding synonym
Date Mon, 05 Oct 2009 16:46:15 GMT

Yes, that is what we decided: to expand these terms while indexing.
If we have the one-way mapping
bayrische motoren werke => bmw

and I have a document which has bmw in it, searching for text:bayrische does
not give me results. I have to search for
text:"bayrische motoren werke"; only then does it apply the synonym and return
the document.

Now suppose I change the synonym mapping to
bayrische motoren werke, bmw
with the expand parameter set to true, and also use this file at indexing.

Then when I index this document, along with "bmw" I also index the
words "bayrische", "motoren" and "werke".

Any text query like text:motoren or text:bayrische will then give me results.

Please correct me if my assumption is wrong.

Thanks
darniz
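
To make the difference concrete, here is a minimal Python sketch of the idea.
It is not Solr code: the function names and the sample document are made up
for illustration, and token positions are ignored, so it only models
single-term lookups such as text:motoren.

```python
# Simplified model of index-time synonym expansion (expand=true).
# Hypothetical helper names; positions are ignored, unlike the real filter.

def contains_phrase(tokens, phrase):
    """True if the words of `phrase` occur consecutively in `tokens`."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def indexed_terms(text, groups):
    """Return the set of terms indexed for `text`.

    `groups` is a list of synonym groups, each a list of equivalent
    phrases, e.g. [["bayrische motoren werke", "bmw"]].
    """
    tokens = text.lower().split()
    terms = set(tokens)
    for group in groups:
        phrases = [p.split() for p in group]
        # If any phrase of the group occurs, index the words of all of them.
        if any(contains_phrase(tokens, p) for p in phrases):
            for p in phrases:
                terms.update(p)
    return terms


doc = "new bmw coupe"  # made-up sample document

# No index-time expansion: only the literal words are indexed.
plain = indexed_terms(doc, [])

# Index-time expansion with the group "bayrische motoren werke, bmw".
expanded = indexed_terms(doc, [["bayrische motoren werke", "bmw"]])

print("bayrische" in plain)     # the single-term query finds nothing
print("bayrische" in expanded)  # the single-term query now matches
```

With the expanded group, each individual word of the long form ends up in the
index next to "bmw", which is why a single-term query can match.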


Christian Zambrano wrote:
> 
> 
> 
> On 10/02/2009 06:02 PM, darniz wrote:
>> Thanks
>> As I said, it also works with double quotes,
>> like carDescription:"austin martin"
>>
>> So is the conclusion that in order to map a two-word synonym I always have
>> to enclose it in double quotes, so that it does not split the words?
>>
>>
>>
>>    
> Yes, but there are things you need to keep in mind.
> 
>  From the solr wiki:
> 
> Keep in mind that while the SynonymFilter will happily work with 
> *synonyms* containing multiple words (ie: 
> "sea biscuit, sea biscit, seabiscuit") The recommended approach for 
> dealing with *synonyms* like this, is to expand the synonym when 
> indexing. This is because there are two potential issues that can arise 
> at query time:
> 
>    1.
> 
>       The Lucene QueryParser tokenizes on white space before giving any
>       text to the Analyzer, so if a person searches for the words
>       sea biscit the analyzer will be given the words "sea" and "biscit"
>       separately, and will not know that they match a synonym.
> 
>    2.
> 
>       Phrase searching (ie: "sea biscit") will cause the QueryParser to
>       pass the entire string to the analyzer, but if the SynonymFilter
>       is configured to expand the *synonyms*, then when the QueryParser
>       gets the resulting list of tokens back from the Analyzer, it will
>       construct a MultiPhraseQuery that will not have the desired
>       effect. This is because of the limited mechanism available for the
>       Analyzer to indicate that two terms occupy the same position:
>       there is no way to indicate that a "phrase" occupies the same
>       position as a term. For our example the resulting MultiPhraseQuery
>       would be "(sea | sea | seabiscuit) (biscuit | biscit)" which would
>       not match the simple case of "seabiscuit" occurring in a document
> 
> 
>>
>>
>>
>>
>>
>>
>>
>> Christian Zambrano wrote:
>>    
>>> When you use a field qualifier (fieldName:valueToLookFor) it only applies
>>> to the word right after the colon. If you look at the debug
>>> information you will notice that for the second word it is using the
>>> default field.
>>>
>>> <str name="parsedquery_toString">carDescription:austin
>>> *text*:martin</str>
>>>
>>> The following should work:
>>>
>>> carDescription:(austin martin)
>>>
>>>
>>> On 10/02/2009 05:46 PM, darniz wrote:
>>>      
>>>> This is not working when I search. I have a document which contains
>>>> the text aston martin.
>>>>
>>>> When I search carDescription:"austin martin" I get a match, but when I
>>>> don't use double quotes,
>>>>
>>>> like carDescription:austin martin
>>>> there is no match.
>>>>
>>>> In the analyser, if I give austin martin without quotes, it matches
>>>> aston martin when it passes through the synonym filter.
>>>> Maybe by default the analyser treats it as the phrase "austin martin",
>>>> but when I try to do a query by typing
>>>> carDescription:austin martin I get 0 documents. The following is the
>>>> debug node info with debugQuery=on:
>>>>
>>>> <str name="rawquerystring">carDescription:austin martin</str>
>>>> <str name="querystring">carDescription:austin martin</str>
>>>> <str name="parsedquery">carDescription:austin text:martin</str>
>>>> <str name="parsedquery_toString">carDescription:austin
>>>> text:martin</str>
>>>>
>>>> I don't know why it breaks up the words; maybe it's the desired behaviour.
>>>> When I give carDescription:"austin martin", of course, it is able to
>>>> map to the synonym and I get the desired result.
>>>>
>>>> Any opinions?
>>>>
>>>> darniz
>>>>
>>>>
>>>>
>>>> Ensdorf Ken wrote:
>>>>
>>>>        
>>>>>
>>>>>          
>>>>>> Hi
>>>>>> i have a question regarding synonymfilter
>>>>>> i have a one way mapping defined
>>>>>> austin martin, astonmartin =>   aston martin
>>>>>>
>>>>>>
>>>>>>            
>>>>> ...
>>>>>
>>>>>          
>>>>>> Can anybody please explain if my observation is correct. This is a
>>>>>> very critical aspect for my work.
>>>>>>
>>>>>>            
>>>>> That is correct - the synonym filter can recognize multi-token
>>>>> synonyms
>>>>> from consecutive tokens in a stream.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>          
>>>>
>>>>        
>>>
>>>      
>>    
> 
> 

-- 
View this message in context: http://www.nabble.com/Question-regarding-synonym-tp25720572p25754288.html
Sent from the Solr - User mailing list archive at Nabble.com.

