lucene-solr-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Question on Dismax plugin
Date Mon, 11 Apr 2011 20:35:04 GMT
Hi Raj,

I'm guessing your slug field is much shorter than your story field, so length
normalization gives a match in slug more weight than a match in the much longer
story field. If you omit norms for those fields in the schema (and reindex), I
believe you will see File 4 drop to position #4.
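
As a minimal sketch of what omitting norms could look like in schema.xml (the
field type name "text" and the indexed/stored attributes are assumptions, since
the actual schema was not posted):

```xml
<!-- schema.xml (sketch). type="text", indexed and stored are assumed values;
     only omitNorms="true" is the change being suggested. -->
<field name="slug"  type="text" indexed="true" stored="true" omitNorms="true"/>
<field name="story" type="text" indexed="true" stored="true" omitNorms="true"/>
```

Note that omitNorms="true" also drops any index-time boosts on those fields, and
the change only takes effect after reindexing. Adding debugQuery=on to the
request should show the fieldNorm factor in each document's score explanation,
which is a quick way to check whether field length is what moves File 4 up.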

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: "Nemani, Raj" <Raj.Nemani@turner.com>
> To: solr-user@lucene.apache.org
> Sent: Mon, April 11, 2011 4:12:52 PM
> Subject: Question on Dismax plugin
> 
> All,
> 
> I have a question on the Dismax plugin for the search handler. I have
> two test instances of Solr. In one I am using the default search
> handler. In this case, the fields that I am working with (slug and
> story) are indexed via the all_text field and the searches are done on
> the all_text field.
> 
> For the other one I have configured a search handler using the dismax
> plugin as shown below.
> 
> <requestHandler name="mydismax" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="defType">dismax</str>
>     <str name="echoParams">explicit</str>
>     <float name="tie">0.01</float>
>     <str name="qf">
>       story^3.0 slug^0.2
>     </str>
>     <int name="ps">100</int>
>     <str name="q.alt">*:*</str>
>   </lst>
> </requestHandler>
> 
> To make testing easier, I only have 4 (same) documents in both indexes
> with the word "Obama" appearing inside as described below.
> 
> File 1:: The word Obama appears zero times in "slug" field and four
> times in "story" field
> 
> File 2:: The word Obama appears zero times in "slug" field and thrice in
> "story" field
> 
> File 3:: The word Obama appears zero times in "slug" field and two times
> in "story" field
> 
> File 4:: The word Obama appears one time in "slug" field and one time in
> "story" field
> 
> Here are the documents in order of decreasing score in the search
> results:
> 
> Dismax Search Handler (steadily decreasing scores):
> 
> * File 1:: The word Obama appears zero times in "slug" field and
>   four times in "story" field
> * File 4:: The word Obama appears one time in "slug" field and
>   one time in "story" field
> * File 2:: The word Obama appears zero times in "slug" field and
>   thrice in "story" field
> * File 3:: The word Obama appears zero times in "slug" field and
>   two times in "story" field
> 
> Standard Search handler:
> 
> * File 1:: The word Obama appears zero times in "slug" field and
>   four times in "story" field
> * File 2:: The word Obama appears zero times in "slug" field and
>   thrice in "story" field (same score as File 4 score below)
> * File 4:: The word Obama appears one time in "slug" field and
>   one time in "story" field (same score as File 2 score above)
> * File 3:: The word Obama appears zero times in "slug" field and
>   two times in "story" field
> 
> My question: why is dismax showing "File 4:: The word Obama appears one
> time in "slug" field and one time in "story" field"
> 
> ahead of
> 
> "File 2:: The word Obama appears zero times in "slug" field and thrice
> in "story" field", given that I have boosted these fields as shown below?
> 
> <str name="qf">
>   story^3.0 slug^0.2
> </str>
> 
> I would have thought that "File 4:: The word Obama appears one time
> in "slug" field and one time in "story" field" would have gone all the
> way down in the result list.
> 
> Any help is appreciated
> 
> Thanks much in advance
> 
> Raj
> 
