nutch-dev mailing list archives

From Stefan Groschupf
Subject Re: [Nutch-dev] Adding title and site to scoring
Date Tue, 22 Mar 2005 22:54:02 GMT

> Lucene offers you the possibility to boost particular fields
> in a document, but this functionality is not used in Nutch (as far as I
> can tell).

At first this statement confused me, but after digging through the sources I
would say you are right.
I had missed that before writing my last mail and assumed Nutch uses
field boosting as well, since this is one of the cool features of Lucene.
I simply overlooked that the basic query filter only searches in a set of
fields, because I was thinking too much about the use case of my
recently submitted plugin.

Anyway, I have two general thoughts.
Some days ago I was reading "Why Writing Your Own Search Engine Is
Hard" by Anna Patterson, and one of the repeated suggestions was
something like "precalculate ranking wherever you can".
I am not able to say whether this translates to Nutch or Lucene in
general; maybe only Doug or Erik can comment on that.
So I would be interested to know whether boosting at index time is faster
at search time than boosting during query generation.
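
To make the question concrete, here is a toy sketch of the two places a boost
can be applied. It is only an illustration under my own assumptions: the class
and method names are hypothetical, the "weights" are made-up numbers, and
nothing here uses the real Lucene or Nutch API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, not the Lucene or Nutch API.
final class BoostTimingSketch {

    // Index time: fold each field's boost into its stored weight once.
    static Map<String, Double> preBoost(Map<String, Double> rawWeights,
                                        Map<String, Double> boosts) {
        Map<String, Double> stored = new HashMap<>();
        for (Map.Entry<String, Double> e : rawWeights.entrySet()) {
            double boost = boosts.getOrDefault(e.getKey(), 1.0);
            stored.put(e.getKey(), e.getValue() * boost);
        }
        return stored;
    }

    // Search over pre-boosted weights: only a sum over the matched fields is left.
    static double scorePreBoosted(Map<String, Double> stored,
                                  List<String> matchedFields) {
        double score = 0.0;
        for (String field : matchedFields) {
            score += stored.getOrDefault(field, 0.0);
        }
        return score;
    }

    // Query time: the boost is multiplied in for every document that is scored.
    static double scoreQueryBoosted(Map<String, Double> rawWeights,
                                    Map<String, Double> boosts,
                                    List<String> matchedFields) {
        double score = 0.0;
        for (String field : matchedFields) {
            double boost = boosts.getOrDefault(field, 1.0);
            score += rawWeights.getOrDefault(field, 0.0) * boost;
        }
        return score;
    }
}
```

In this toy model both paths give the same score; the difference is where the
multiplication happens. The pre-boosted variant does less work per hit, but
changing a boost means recomputing the stored weights, while the query-time
variant can be tuned without touching the stored data. Whether that difference
is measurable in a real index is exactly the open question.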

The other thought is: now that several people are thinking about
ranking, I am asking myself what kind of abstractions we can introduce to make
ranking improvements easier to implement.
First, I would say it makes sense to make the boost values configurable,
so that people can play around with them and report their experience.
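
As a rough sketch of what "configurable" could look like, the snippet below
reads per-field boosts from a properties text and falls back to 1.0 for fields
that are not listed. The key names (boost.title, boost.site) and the class
itself are made up for illustration; they are not existing Nutch settings.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Nutch: looks up per-field boost values.
final class BoostConfigSketch {
    private final Properties props;

    private BoostConfigSketch(Properties props) {
        this.props = props;
    }

    // Parse "key=value" lines; the "boost.<field>" key scheme is an assumption.
    static BoostConfigSketch parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new BoostConfigSketch(p);
    }

    // Return the configured boost, or 1.0 (no boost) when the field is not listed.
    float boostFor(String field) {
        String value = props.getProperty("boost." + field);
        return value == null ? 1.0f : Float.parseFloat(value.trim());
    }
}
```

With the text "boost.title=1.5" and "boost.site=2.0" on separate lines,
boostFor("title") gives 1.5 and an unlisted field such as "content" gives 1.0,
so experiments only require editing the configuration.
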
Furthermore, I would be interested in a discussion of whether and where we
could create an extension point that allows custom ranking to be implemented.
This might be very interesting for the research community to
play around with.
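
To give the discussion something to point at, here is one possible shape for
such an extension point. It is purely a sketch: RankingStrategy and
RankingRegistrySketch are invented names and are not tied to the existing
Nutch plugin system.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical extension interface: combines per-field scores into one score.
interface RankingStrategy {
    double score(Map<String, Double> fieldScores);
}

// Hypothetical registry that lets implementations be selected by name.
final class RankingRegistrySketch {
    private final Map<String, RankingStrategy> strategies = new HashMap<>();

    void register(String name, RankingStrategy strategy) {
        strategies.put(name, strategy);
    }

    RankingStrategy lookup(String name) {
        RankingStrategy strategy = strategies.get(name);
        if (strategy == null) {
            throw new IllegalArgumentException("No ranking strategy named " + name);
        }
        return strategy;
    }
}
```

A researcher could then register, say, a "sum" strategy that adds the field
scores and a "max" strategy that keeps only the best field, and switch between
them by name without touching the search code.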

I'm very interested to see your patch tomorrow. :-)

