lucene-solr-user mailing list archives

From Erik Hatcher <erik.hatc...@gmail.com>
Subject Re: schema-based Index-time field boosting
Date Thu, 03 Dec 2009 09:15:06 GMT
Ian - now you're talking *term* boosting, which is a dynamic query-time
factor, not something specified at index time.

Here's what I suggest as a starting point for this sort of thing, in  
Solr request format:

    http://localhost:8983/solr/select?defType=dismax&q=apple&qf=name^2+manu

Here the term "apple" is queried against both the name and
manu(facturer) fields, and matches in the name field get boosted by a
factor of 2.  This uses the dismax query parser.
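
A minimal sketch of building that same request in Python, assuming a
local Solr instance at the URL above.  The helper name build_dismax_url
is hypothetical (it is not a Solr API); only the defType, q and qf
parameters come from the request shown above.

```python
from urllib.parse import urlencode

def build_dismax_url(base_url, query, field_boosts):
    """Build a dismax select URL (hypothetical helper, for illustration only).

    field_boosts is a list of (field_name, boost) pairs; a boost of 1 is
    written as the bare field name, anything else as field^boost.
    """
    qf = " ".join(
        field if boost == 1 else f"{field}^{boost}"
        for field, boost in field_boosts
    )
    params = {"defType": "dismax", "q": query, "qf": qf}
    # urlencode percent-encodes "^" and turns the space in qf into "+"
    return f"{base_url}?{urlencode(params)}"

url = build_dismax_url(
    "http://localhost:8983/solr/select",  # assumed local Solr instance
    "apple",
    [("name", 2), ("manu", 1)],
)
print(url)
```

Changing the relative weight of a field is then just a matter of
changing the numbers passed in at query time; nothing needs reindexing.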

Index-time boosts are becoming less and less favorable - there is rarely
any need for them given the more flexible dynamic control you can
have over scoring at query-time.

And I'm sure Hoss isn't arguing against field boosting, given he's one  
of the gurus behind the magic of dismax.  He's simply saying if you  
apply a constant boost to all documents, you've effectively done  
nothing.
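
To illustrate that last point, here is a deliberately simplified sketch
of how an index-time boost enters a single field's score.  It assumes
the field norm is just boost / sqrt(field length) and that the score is
sqrt(tf) * idf * norm; the real Lucene similarity has more factors and
stores the norm in a lossy encoding.  All names and numbers below are
hypothetical.

```python
import math

def field_norm(boost, num_terms):
    # Simplified norm: index-time boost times a 1/sqrt(length) normalization
    return boost / math.sqrt(num_terms)

def field_score(tf, idf, boost, num_terms):
    # Simplified single-field score; the boost only enters through the norm
    return math.sqrt(tf) * idf * field_norm(boost, num_terms)

IDF = 1.5  # arbitrary illustrative value, identical for both documents

doc_a = {"tf": 2, "num_terms": 10}
doc_b = {"tf": 1, "num_terms": 10}

def score_ratio(boost_a, boost_b):
    """Score of doc_a divided by score of doc_b for the same field."""
    score_a = field_score(doc_a["tf"], IDF, boost_a, doc_a["num_terms"])
    score_b = field_score(doc_b["tf"], IDF, boost_b, doc_b["num_terms"])
    return score_a / score_b

unboosted = score_ratio(1.0, 1.0)  # no index-time boost at all
constant = score_ratio(5.0, 5.0)   # same boost on every document
per_doc = score_ratio(5.0, 1.0)    # boost only on doc_a

# A constant boost multiplies both scores equally, so the ratio is unchanged
print(math.isclose(unboosted, constant))
# A per-document boost does shift doc_a relative to doc_b
print(per_doc > unboosted)
```

Under these assumptions the constant boost cancels out when comparing
two documents on that field, while a per-document boost does not.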

	Erik

On Dec 3, 2009, at 3:37 AM, Ian Smith wrote:

> Aaaaaaaargh!  OK, I would like a document with (eg.) a title containing
> a term to score higher than one on (eg.) a summary containing the same
> term, all other things being "equal".  You seem to be arguing against
> field boosting in general, and I don't understand why!
>
> May as well let this drop since we don't seem to be talking about the
> same thing . . . but thanks anyway,
>
> Ian.
> -----Original Message-----
> From: Chris Hostetter [mailto:hossman_lucene@fucit.org]
> Sent: 30 November 2009 23:05
> To: solr-user@lucene.apache.org
> Subject: RE: schema-based Index-time field boosting
>
>
> : I am talking about field boosting rather than document boosting, ie. I
> : would like some fields (say eg. title) to be "louder" than others,
> : across ALL documents.  I believe you are at least partially talking
> : about document boosting, which clearly applies on a per-document basis.
>
> index time boosts are all the same -- it doesn't matter if they are
> field boosts or document boosts -- a document boost is just a field
> boost for every field in the document.
>
> : If it helps, consider a schema version of the following, from
> : org.apache.solr.common.SolrInputDocument:
> :
> :   /**
> :    * Adds a field with the given name, value and boost.  If a field with
> :    * the name already exists, then it is updated to
> :    * the new value and boost.
> :    *
> :    * @param name Name of the field to add
> :    * @param value Value of the field
> :    * @param boost Boost value for the field
> :    */
> :   public void addField(String name, Object value, float boost )
>
> 	...
>
> : Where a constant boost value is applied consistently to a given field.
> : That is what I was mistakenly hoping to achieve in the schema.  I still
> : think it would be a good idea BTW.  Regards,
>
> But now we're right back to what i was trying to explain before: index
> time boost values like these are only used as a multiplier in the
> fieldNorm.  when included as part of the document data, you can specify
> a fieldBoost for fieldX of docA that's greater than the boost for fieldX
> of docB and that will make docA score higher than docB when fieldX
> contains the same number of matches and is the same length -- but if you
> apply a constant boost of B to fieldX for every doc (which is what a
> feature to hardcode boosts in schema.xml might give you) then the net
> effect would be zero when scoring docA and docB, because the fieldNorms
> for fieldX in both docs would include the exact same multiplier.
>
>
>
> -Hoss
>

