lucene-dev mailing list archives

From "Lance Norskog (JIRA)" <>
Subject [jira] Commented: (LUCENE-1360) A Similarity class which has unique length norms for numTerms <= 10
Date Thu, 27 Jan 2011 21:14:44 GMT


Lance Norskog commented on LUCENE-1360:

bq. Lance, this is a bit misleading. only lengths {3,4} , {6,7}, and {8,9,10} share the same
I thought I got them all the same when I tested with Lucene 2.9, but ok.
bq. For most uses, this isn't really that big of a deal that a few numbers quantize to the same bytes.
The problem, then, is the shape of the curve: how strongly field norms affect boosting.

Sure, close this. My goal is to make Solr work smoothly in all environments.


> A Similarity class which has unique length norms for numTerms <= 10
> -------------------------------------------------------------------
>                 Key: LUCENE-1360
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Query/Scoring
>            Reporter: Sean Timm
>            Assignee: Otis Gospodnetic
>            Priority: Trivial
>         Attachments: LUCENE-1380 visualization.pdf,
> A Similarity class which extends DefaultSimilarity and simply overrides lengthNorm.
> lengthNorm is implemented as a lookup for numTerms <= 10, else as {{1/sqrt(numTerms)}}.
> This prevents term counts below 11 from sharing the same lengthNorm once the value is
> stored as a single byte in the index.
> This is useful if your search is only on short fields such as titles or product descriptions.
> See mailing list discussion:
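
The quantization discussed above can be reproduced with a small standalone sketch. It does not call Lucene: floatToByte315 and byte315ToFloat below are re-typed from memory of Lucene's SmallFloat helpers and should be treated as an assumption, as should the default norm of 1/sqrt(numTerms) with a field boost of 1. The class name NormQuantization and the encode helper are hypothetical names used only for illustration.

```java
public class NormQuantization {

    // Assumption: mirrors Lucene's SmallFloat.floatToByte315 (re-typed, not the library itself).
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);          // keep only the top bits of the float
        int fzero = (63 - 15) << 3;                 // offset of the smallest representable value
        if (smallfloat <= fzero) {
            // zero and negatives map to 0, positive underflow maps to the smallest non-zero byte
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallfloat >= fzero + 0x100) {
            return -1;                              // overflow maps to the largest byte (0xFF)
        }
        return (byte) (smallfloat - fzero);
    }

    // Assumption: mirrors Lucene's SmallFloat.byte315ToFloat.
    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Default length norm as described in the issue: 1/sqrt(numTerms), boost assumed to be 1.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Hypothetical helper: the single byte that would be stored for a field of numTerms terms.
    static byte encode(int numTerms) {
        return floatToByte315(lengthNorm(numTerms));
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            byte b = encode(n);
            System.out.printf("numTerms=%2d norm=%.4f byte=%3d decoded=%.4f%n",
                    n, lengthNorm(n), b, byte315ToFloat(b));
        }
    }
}
```

Under these assumptions, lengths 3 and 4, lengths 6 and 7, and lengths 8 through 10 land on the same byte, while 1, 2, and 5 each get a byte of their own. That agrees with the groups quoted in the comment above, and it is the collision a lookup table for numTerms <= 10 is meant to avoid.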

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
