lucene-solr-user mailing list archives

From Joe Zhang <smartag...@gmail.com>
Subject Re: Question about field boost
Date Tue, 23 Jul 2013 04:52:05 GMT
Is my reading correct that the boost is applied only to "china" and not to
"snowden"? How can that be?

My query is: q=china+snowden&qf=title^10 content
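
For what it's worth, the numbers in the debug output quoted below can be reproduced by hand, which may help in reading it. Here is a rough Python sketch. It assumes the formula the output itself spells out (queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq)) and no dismax tie-breaker, as the plain "max of" lines suggest. The helper name clause_score is made up for illustration; the constants are copied from the output for doc 249.

```python
import math

QUERY_NORM = 0.01193658  # queryNorm, copied from the debug output


def clause_score(idf, term_freq, field_norm, boost=1.0, query_norm=QUERY_NORM):
    """Score of one field:term clause (hypothetical helper, not a Solr API)."""
    tf = math.sqrt(term_freq)               # tf(termFreq) in the output
    query_weight = boost * idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight


# One clause per (field, term) pair that actually matched in doc 249.
content_china = clause_score(idf=1.6649085, term_freq=24, field_norm=0.0048828125)
title_china = clause_score(idf=4.8898454, term_freq=1, field_norm=0.09375, boost=10.0)
content_snowden = clause_score(idf=2.8549502, term_freq=3, field_norm=0.0048828125)

# Per term: take the best field ("max of"). Across terms: add up ("sum of").
china = max(content_china, title_china)
snowden = max([content_snowden])  # no title:snowden clause appears in the output
total = china + snowden

print(f"china={china:.8f} snowden={snowden:.8f} total={total:.8f}")
```

Read this way, the boost belongs to the title field, not to a term. It shows up for "china" only because title:china matched in doc 249. The explanation contains no title:snowden clause, which suggests "snowden" does not occur in that document's title, so only the unboosted content clause contributes for that term.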


On Mon, Jul 22, 2013 at 9:43 PM, Joe Zhang <smartagent@gmail.com> wrote:

> Thanks for your hint, Jack. Here are the debug results, which I'm having a
> hard time deciphering (the two terms are "china" and "snowden")...
>
> 0.26839527 = (MATCH) sum of:
>   0.26839527 = (MATCH) sum of:
>     0.26757246 = (MATCH) max of:
>       7.9147343E-4 = (MATCH) weight(content:china in 249), product of:
>         0.019873314 = queryWeight(content:china), product of:
>           1.6649085 = idf(docFreq=46832, maxDocs=91058)
>           0.01193658 = queryNorm
>         0.039825942 = (MATCH) fieldWeight(content:china in 249), product of:
>           4.8989797 = tf(termFreq(content:china)=24)
>           1.6649085 = idf(docFreq=46832, maxDocs=91058)
>           0.0048828125 = fieldNorm(field=content, doc=249)
>       0.26757246 = (MATCH) weight(title:china^10.0 in 249), product of:
>         0.5836803 = queryWeight(title:china^10.0), product of:
>           10.0 = boost
>           4.8898454 = idf(docFreq=1861, maxDocs=91058)
>           0.01193658 = queryNorm
>         0.45842302 = (MATCH) fieldWeight(title:china in 249), product of:
>           1.0 = tf(termFreq(title:china)=1)
>           4.8898454 = idf(docFreq=1861, maxDocs=91058)
>           0.09375 = fieldNorm(field=title, doc=249)
>     8.2282536E-4 = (MATCH) max of:
>       8.2282536E-4 = (MATCH) weight(content:snowden in 249), product of:
>         0.03407834 = queryWeight(content:snowden), product of:
>           2.8549502 = idf(docFreq=14246, maxDocs=91058)
>           0.01193658 = queryNorm
>         0.024145111 = (MATCH) fieldWeight(content:snowden in 249), product of:
>           1.7320508 = tf(termFreq(content:snowden)=3)
>           2.8549502 = idf(docFreq=14246, maxDocs=91058)
>           0.0048828125 = fieldNorm(field=content, doc=249)
>
>
> On Mon, Jul 22, 2013 at 9:27 PM, Jack Krupansky <jack@basetechnology.com> wrote:
>
>> Maybe you're not doing anything wrong - other than having an artificial
>> expectation of what the true relevance of your data actually is. Many
>> factors go into relevance scoring. You need to look at all aspects of your
>> data.
>>
>> Maybe your terms don't occur in your titles the way you think they do.
>>
>> Maybe you need a boost of 500 or more...
>>
>> Lots of potential maybes.
>>
>> Relevancy tuning is an art and craft, hardly a science.
>>
>> Step one: Know your data, inside and out.
>>
>> Use the debugQuery=true parameter on your queries and see how much of the
>> score is dominated by your query terms in the non-title fields.
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Joe Zhang
>> Sent: Monday, July 22, 2013 11:06 PM
>> To: solr-user@lucene.apache.org
>> Subject: Question about field boost
>>
>>
>> Dear Solr experts:
>>
>> Here is my query:
>>
>> defType=dismax&q=term1+term2&qf=title^100 content
>>
>> My intention (at least as I understand it) is to boost the title field.
>> While I'm getting some non-trivial results, I'm surprised that the
>> documents with both term1 and term2 in title (I know such docs do exist in
>> my repository) were not returned (or maybe ranked very low). The situation
>> does not change even when I use much larger boost factors.
>>
>> What am I doing wrong?
>>
>
>
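
Jack's debugQuery=true suggestion above can be tried with a request along these lines. This is a minimal sketch: the host, port, and core name in the URL are placeholders (not from this thread), and build_debug_query is a hypothetical helper. defType, q, qf, and debugQuery are the parameters already mentioned in the messages above.

```python
from urllib.parse import urlencode

# Placeholder endpoint: adjust host, port, and core name to your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_debug_query(terms, qf, base_url=SOLR_SELECT_URL):
    """Build a dismax request URL with score explanations switched on."""
    params = {
        "defType": "dismax",
        "q": " ".join(terms),   # urlencode turns the space into '+'
        "qf": qf,               # e.g. "title^10 content"
        "debugQuery": "true",
    }
    return base_url + "?" + urlencode(params)


url = build_debug_query(["china", "snowden"], "title^10 content")
print(url)
```

Letting urlencode handle the escaping avoids hand-typing the "^" and the space in qf. The response should then carry a per-document score explanation like the one quoted in this thread.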
