lucene-solr-user mailing list archives

From Patrick Jungermann <patrick.jungerm...@googlemail.com>
Subject Re: multi-word synonyms and analysis.jsp vs real field analysis (query, index)
Date Fri, 09 Oct 2009 14:09:00 GMT
Hi Koji,

The problem is that this doesn't meet all of our requirements. We have
some Solr documents that must not be matched by "foo" or "bar" alone, but
only by "foo bar" as part of the query. We also have other documents that
may be matched by "foo" and "foo bar", or by "bar" and "foo bar".
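
To make the first requirement concrete, here is a minimal Python sketch. It
is only a simplified model of index-time synonym handling, not Solr's actual
SynonymFilter, and the function name index_terms is made up for this
illustration. It compares the expanding form (foo bar, foo_bar) with the
replacing form (foo bar=>foo_bar) for a document containing "foo bar".

```python
# Simplified model of index-time synonym handling (assumption: one rule only).
# This is NOT Solr's SynonymFilter; index_terms is a hypothetical helper.

def index_terms(text, source, replacement, expand):
    """Return the set of terms indexed for `text`.

    expand=True  models "foo bar, foo_bar": keep the source words, add the synonym.
    expand=False models "foo bar=>foo_bar": replace the source words with the synonym.
    """
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        if words[i:i + len(source)] == source:
            if expand:
                out.extend(source)
            out.append(replacement)
            i += len(source)
        else:
            out.append(words[i])
            i += 1
    return set(out)


expanded = index_terms("foo bar", ["foo", "bar"], "foo_bar", expand=True)
replaced = index_terms("foo bar", ["foo", "bar"], "foo_bar", expand=False)

print(sorted(expanded))   # ['bar', 'foo', 'foo_bar']
print(sorted(replaced))   # ['foo_bar']
print("foo" in expanded)  # True: a bare "foo" query would hit this document
print("foo" in replaced)  # False
```

In this model the expanded form leaves "foo" and "bar" in the index, so the
document would also be found by "foo" alone, which is exactly what we have
to avoid for the first group of documents.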

The best way to handle this seems to be synonyms, which allow this to be
configured precisely and can be maintained by an editorial staff.

Besides, foo bar=>foo_bar works everywhere (index time, analysis.jsp)
except at query time.
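
The difference can be sketched in Python as follows. This is a simplified
model under the assumption that the query parser splits the query string on
whitespace before the analyzer sees it, so each word is analyzed on its own.
The names analyze and parse_query are hypothetical and are not Solr APIs.

```python
# Simplified model; analyze() and parse_query() are hypothetical, not Solr APIs.

SYNONYMS = {("foo", "bar"): "foo_bar"}  # models the rule: foo bar=>foo_bar


def analyze(text):
    """Whitespace-tokenize, then replace two-word synonym matches.

    Models what index time and analysis.jsp do: the analyzer sees the full text.
    """
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in SYNONYMS:
            out.append(SYNONYMS[pair])
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out


def parse_query(text):
    """Assumed query-time behaviour: split on whitespace first, then analyze
    each clause separately, so a multi-word rule never sees both words."""
    terms = []
    for clause in text.split():
        terms.extend(analyze(clause))
    return terms


print(analyze("foo bar baz"))      # ['foo_bar', 'baz']
print(parse_query("foo bar baz"))  # ['foo', 'bar', 'baz']
```

If this assumption holds, it would explain the parsedQueryString
"field:foo field:bar": the SynonymFilter only ever receives one token at a
time at query time and therefore cannot match the two-word rule.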


Patrick


Koji Sekiguchi wrote:
> Hi Patrick,
> 
> Why don't you define:
> 
> foo bar, foo_bar (and expand="true")
> 
> instead of:
> 
> foo bar=>foo_bar
> 
> on the indexing side only? Wouldn't that be an improvement?
> 
> Koji
> 
> 
> Patrick Jungermann wrote:
>> Hi Koji,
>>
>> Using phrase queries is not an option for us, because all query parts
>> have to be optional. The phrase query workaround works for the query
>> "foo bar", but only for this exact query. If the user queries for
>> "foo bar baz", it is changed to "foo_bar baz", which does not match
>> the indexed documents that contain only "foo_bar". And that is what
>> we need here.
>>
>> The cause of my problem is probably the query parsing, but I don't know
>> whether there is a solution for it. I need something that works like
>> the analysis/query parsing in the /admin/analysis.jsp view.
>>
>>
>> Patrick
>>
>>
>>
>> Koji Sekiguchi wrote:
>>  
>>> Patrick,
>>>
>>>    
>>>> parsedQueryString was something similar to "field:foo field:bar". At
>>>> index time, it works as expected.
>>>>       
>>> I guess this is because you are searching with q=foo bar, which results
>>> in an OR query. Use q="foo bar" instead.
>>>
>>> Koji
>>>
>>>
>>> Patrick Jungermann wrote:
>>>    
>>>> Hi list,
>>>>
>>>> I worked on a field type and its analysis chain, in which I want to
>>>> use the SynonymFilter with entries similar to:
>>>>
>>>> foo bar=>foo_bar
>>>>
>>>> During the analysis phase, I used the /admin/analysis.jsp view to test
>>>> the results produced by the new field type. The output shows that a
>>>> query "foo bar" is first split by the WhitespaceTokenizer into the two
>>>> tokens "foo" and "bar", and that the SynonymFilter then replaces both
>>>> tokens with "foo_bar". But when I tried this at "real" query time, with
>>>> the request handler "standard" and also with "dismax", the tokens "foo"
>>>> and "bar" were not replaced. The parsedQueryString was something similar
>>>> to "field:foo field:bar". At index time, it works as expected.
>>>>
>>>> Has anybody experienced this, or does anyone know a workaround or a
>>>> solution for it?
>>>>
>>>>
>>>> Thanks, Patrick

