lucene-solr-user mailing list archives

From Dotan Cohen <dotanco...@gmail.com>
Subject Exponential omitNorms
Date Wed, 07 Nov 2012 09:15:44 GMT
Hi all! One area where I am applying Solr deals with variable-length
posts by users: think of things from one-word posts ("Cool!" with an
attached photo) to blog-post length (500-1000 words). Due to Field
Normalization, the short posts get the highest Solr score, while the
long, informative posts are pushed to the end of the results.
Therefore I am moving to remove Field Normalization with
omitNorms=true. However, there do exist the crafty users who stuff
hundreds of irrelevant words together to get noticed. Therefore, I
would like to try some sort of exponential (or even linear) Field
Normalization where no normalization is performed on one-word
documents but the longest documents (over a few hundred words) get
some penalty.
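
As a rough illustration of the kind of curve described above, here is a
minimal sketch in plain Python. It is not Solr code and does not use any
Solr or Lucene API. The function name, the 300-word threshold, and the
decay rates are all hypothetical values chosen only to show the shape:
no penalty up to the threshold, then an exponential or linear falloff.

```python
import math

def length_penalty(num_words, threshold=300, kind="exponential",
                   exp_rate=0.002, linear_slope=0.001):
    """Hypothetical score multiplier based on document length.

    Returns 1.0 (no penalty) for documents at or below `threshold` words,
    and a value between 0.0 and 1.0 for longer documents.
    All parameter names and defaults are illustrative assumptions.
    """
    excess = num_words - threshold
    if excess <= 0:
        # Short and medium posts are not normalized at all.
        return 1.0
    if kind == "exponential":
        # Penalty grows exponentially with the number of excess words.
        return math.exp(-exp_rate * excess)
    if kind == "linear":
        # Penalty grows linearly, clamped so it never goes negative.
        return max(0.0, 1.0 - linear_slope * excess)
    raise ValueError("kind must be 'exponential' or 'linear'")
```

A one-word post and a 300-word post would both keep their full score
under this sketch, while a 1000-word post would be scaled down more
than a 500-word one.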

Are there any facilities in Solr for performing this? I would of
course prefer query-time computation as that lets me do other things
with the data, but if only index-time computation is possible then I
can accept that.

Thank you!

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com
