lucene-solr-user mailing list archives

From Jan Høydahl <jan....@cominvent.com>
Subject Re: Weird query results with edismax and boolean operator +
Date Mon, 30 Apr 2012 23:12:42 GMT
Hi,

I see that you have already commented on SOLR-2649 "MM ignored in edismax queries with operators".
So let's continue working towards a resolution there.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 30. apr. 2012, at 14:28, Vadim Kisselmann wrote:

> I tested it.
> With the default "qf=title text" in solrconfig.xml and "mm=100%",
> I get the same result (1 match) for "nascar AND author:serg*" and
> "+nascar +author:serg*", great.
> With "nascar +author:serg*" I get 3500 matches; in this case the
> mm parameter does not seem to take effect.
> 
> Here are my debug params for "nascar AND author:serg*":
> 
> </str><str name="querystring">nascar AND author:serg*</str>
> <str name="parsedquery">(+(+DisjunctionMaxQuery((text:nascar |
> title:nascar)~0.01) +author:serg*))/no_coord</str>
> <str name="parsedquery_toString">+(+(text:nascar | title:nascar)~0.01
> +author:serg*)</str><lst name="explain"><str
> name="com.bostonherald/news/international/europe/view/20120409russia_allows_anti-putin_demonstration_in_red_square">
> 8.235954 = (MATCH) sum of:
>  8.10929 = (MATCH) max plus 0.01 times others of:
>    8.031613 = (MATCH) weight(text:nascar in 0) [DefaultSimilarity], result of:
>      8.031613 = score(doc=0,freq=2.0 = termFreq=2.0
> ), product of:
>        0.84814763 = queryWeight, product of:
>          6.6960144 = idf(docFreq=27, maxDocs=8335)
>          0.12666455 = queryNorm
>        9.469594 = fieldWeight in 0, product of:
>          1.4142135 = tf(freq=2.0), with freq of:
>            2.0 = termFreq=2.0
>          6.6960144 = idf(docFreq=27, maxDocs=8335)
>          1.0 = fieldNorm(doc=0)
>    7.7676363 = (MATCH) weight(title:nascar in 0) [DefaultSimilarity],
> result of:
>      7.7676363 = score(doc=0,freq=1.0 = termFreq=1.0
> ), product of:
>        0.9919093 = queryWeight, product of:
>          7.830994 = idf(docFreq=8, maxDocs=8335)
>          0.12666455 = queryNorm
>        7.830994 = fieldWeight in 0, product of:
>          1.0 = tf(freq=1.0), with freq of:
>            1.0 = termFreq=1.0
>          7.830994 = idf(docFreq=8, maxDocs=8335)
>          1.0 = fieldNorm(doc=0)
>  0.12666455 = (MATCH) ConstantScore(author:serg*), product of:
>    1.0 = boost
>    0.12666455 = queryNorm
> </str></lst>
> 
> 
> And here for  "nascar +author:serg*":
> <str name="querystring">nascar +author:serg*</str>
> <str name="parsedquery">(+(DisjunctionMaxQuery((text:nascar |
> title:nascar)~0.01) +author:serg*))/no_coord</str>
> <str name="parsedquery_toString">+((text:nascar | title:nascar)~0.01
> +author:serg*)</str><lst name="explain"><str
> name="com.bostonherald/news/international/europe/view/20120409russia_allows_anti-putin_demonstration_in_red_square">
> 8.235954 = (MATCH) sum of:
>  8.10929 = (MATCH) max plus 0.01 times others of:
>    8.031613 = (MATCH) weight(text:nascar in 0) [DefaultSimilarity], result of:
>      8.031613 = score(doc=0,freq=2.0 = termFreq=2.0
> ), product of:
>        0.84814763 = queryWeight, product of:
>          6.6960144 = idf(docFreq=27, maxDocs=8335)
>          0.12666455 = queryNorm
>        9.469594 = fieldWeight in 0, product of:
>          1.4142135 = tf(freq=2.0), with freq of:
>            2.0 = termFreq=2.0
>          6.6960144 = idf(docFreq=27, maxDocs=8335)
>          1.0 = fieldNorm(doc=0)
>    7.7676363 = (MATCH) weight(title:nascar in 0) [DefaultSimilarity],
> result of:
>      7.7676363 = score(doc=0,freq=1.0 = termFreq=1.0
> ), product of:
>        0.9919093 = queryWeight, product of:
>          7.830994 = idf(docFreq=8, maxDocs=8335)
>          0.12666455 = queryNorm
>        7.830994 = fieldWeight in 0, product of:
>          1.0 = tf(freq=1.0), with freq of:
>            1.0 = termFreq=1.0
>          7.830994 = idf(docFreq=8, maxDocs=8335)
>          1.0 = fieldNorm(doc=0)
>  0.12666455 = (MATCH) ConstantScore(author:serg*), product of:
>    1.0 = boost
>    0.12666455 = queryNorm
> </str>
> <str name="mx.com.elsiglodetorreon/noticia/727525.sacerdotas.html">
> 0.063332275 = (MATCH) product of:
>  0.12666455 = (MATCH) sum of:
>    0.12666455 = (MATCH) ConstantScore(author:serg*), product of:
>      1.0 = boost
>      0.12666455 = queryNorm
>  0.5 = coord(1/2)
> </str>
> 
> 
> You can see that for the first doc in "nascar +author:serg*" all
> query clauses match, but for the second doc only
> "ConstantScore(author:serg*)" matches.
> But with "mm=100%", all query clauses should match.
> http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/
> http://lucene.apache.org/solr/api/org/apache/solr/util/doc-files/min-should-match.html
> 
> Best regards
> Vadim
> 
> 
> 
> 2012/4/30 Vadim Kisselmann <v.kisselmann@googlemail.com>:
>> Hi Jan,
>> thanks for your response!
>> 
>> My "qf" parameter for edismax is "title". My
>> "defaultSearchField=text" is set in schema.xml.
>> In my app I generate a query with "qf=title,text", so I think the
>> default parameters in the config/schema should be overridden, right?
>> 
>> I eventually found two possible reasons for this behavior:
>> 1. The "mm" parameter in solrconfig.xml for edismax is 0. 0 stands for
>> "OR", but it should be "AND" => 100%.
>> 2. I suppose that my app does not override my default "qf".
>> I will test this today and report back with my parsed query and all params.
>> 
>> Best regards
>> Vadim
>> 
>> 
>> 
>> 
>> 2012/4/29 Jan Høydahl <jan.asf@cominvent.com>:
>>> Hi,
>>> 
>>> What is your "qf" parameter?
>>> Can you run the three queries with debugQuery=true&echoParams=all and attach
>>> parsed query and all params? It will probably explain what is happening.
>>> 
>>> --
>>> Jan Høydahl, search solution architect
>>> Cominvent AS - www.cominvent.com
>>> Solr Training - www.solrtraining.com
>>> 
>>> On 27. apr. 2012, at 11:21, Vadim Kisselmann wrote:
>>> 
>>>> Hi folks,
>>>> 
>>>> I use Solr 4.0 from trunk, and edismax as the standard query handler.
>>>> In my schema I defined this:  <solrQueryParser defaultOperator="AND"/>
>>>> 
>>>> I have this simple problem:
>>>> 
>>>> nascar +author:serg* (3500 matches)
>>>> 
>>>> +nascar +author:serg* (1 match)
>>>> 
>>>> nascar author:serg* (5200 matches)
>>>> 
>>>> nascar  AND author:serg* (1 match)
>>>> 
>>>> I think I understand the query syntax, but this behavior confuses me.
>>>> Why do the match counts differ?
>>>> 
>>>> By the way, every match contains at least one of my terms,
>>>> but not always both.
>>>> 
>>>> Best regards
>>>> Vadim
>>> 

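For anyone reproducing the debug request asked for in this thread, here is a minimal sketch of how the four queries from the original post could be turned into request URLs with debugQuery=true and echoParams=all. The endpoint URL and the helper name build_debug_url are assumptions for illustration, not taken from the thread; the parameter names (q, defType, qf, mm, debugQuery, echoParams) are standard Solr request parameters. Note that a literal "+" has to be percent-encoded, because an unencoded "+" in a URL query string is decoded as a space.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust host, port, and core/handler path to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

def build_debug_url(query, qf="title text", mm="100%"):
    """Build a select URL that asks Solr to echo all params and explain the query."""
    params = {
        "q": query,
        "defType": "edismax",
        "qf": qf,                 # field names separated by whitespace
        "mm": mm,
        "debugQuery": "true",
        "echoParams": "all",
    }
    # urlencode percent-encodes "+" as %2B and turns spaces into "+".
    return SOLR_SELECT_URL + "?" + urlencode(params)

# The four queries from the original post.
QUERIES = [
    "nascar +author:serg*",
    "+nascar +author:serg*",
    "nascar author:serg*",
    "nascar AND author:serg*",
]

if __name__ == "__main__":
    for q in QUERIES:
        print(build_debug_url(q))
```

The "parsedquery" and "explain" entries in each response can then be compared side by side, as done in the message above.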

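The difference between the two parsed queries above comes down to whether the "nascar" clause carries a "+" (required) or not (optional). The following is a toy model of that behavior, not Solr or Lucene code: the Clause class, boolean_match function, and the three sample documents are all hypothetical. It assumes that a minimum-should-match value only constrains optional clauses, and shows that with a value of 0 a document matching only the required author clause is still returned, which is consistent with the second hit in the "nascar +author:serg*" explain output.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Doc = Dict[str, str]

@dataclass
class Clause:
    matches: Callable[[Doc], bool]
    required: bool  # True models a "+" clause, False a bare (optional) clause

def boolean_match(doc: Doc, clauses: List[Clause], min_should_match: int = 0) -> bool:
    """Toy boolean matching: all required clauses must match, and at least
    min_should_match of the optional clauses must match."""
    required = [c for c in clauses if c.required]
    optional = [c for c in clauses if not c.required]
    if not all(c.matches(doc) for c in required):
        return False
    hits = sum(1 for c in optional if c.matches(doc))
    if not required and hits == 0:
        return False  # a purely optional query still needs one matching clause
    return hits >= min_should_match

# Hypothetical sample documents.
DOCS = {
    "both": {"title": "nascar season opener", "text": "nascar report", "author": "sergio"},
    "author_only": {"title": "red square demonstration", "text": "moscow report", "author": "sergej"},
    "term_only": {"title": "nascar results", "text": "race results", "author": "maria"},
}

def has_nascar(doc: Doc) -> bool:
    # Stands in for (text:nascar | title:nascar)
    return "nascar" in doc["title"].split() or "nascar" in doc["text"].split()

def author_serg(doc: Doc) -> bool:
    # Stands in for author:serg*
    return doc["author"].startswith("serg")

def matching_ids(clauses: List[Clause], min_should_match: int = 0) -> List[str]:
    return sorted(n for n, d in DOCS.items() if boolean_match(d, clauses, min_should_match))

plus_both = [Clause(has_nascar, True), Clause(author_serg, True)]   # +nascar +author:serg*
mixed = [Clause(has_nascar, False), Clause(author_serg, True)]      # nascar +author:serg*

if __name__ == "__main__":
    print(matching_ids(plus_both))   # both clauses required
    print(matching_ids(mixed, 0))    # optional clause not enforced
    print(matching_ids(mixed, 1))    # optional clause enforced
```

In this model, requiring one of the one optional clauses (what "mm=100%" would amount to if it were applied) gives the same result as "+nascar +author:serg*", which is the behavior expected in the thread.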