lucene-dev mailing list archives

From "Alan Woodward (JIRA)" <j...@apache.org>
Subject [jira] [Created] (LUCENE-8633) Remove term weighting from interval scoring
Date Fri, 11 Jan 2019 10:23:00 GMT
Alan Woodward created LUCENE-8633:
-------------------------------------

             Summary: Remove term weighting from interval scoring
                 Key: LUCENE-8633
                 URL: https://issues.apache.org/jira/browse/LUCENE-8633
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Alan Woodward
            Assignee: Alan Woodward
         Attachments: LUCENE-8633.patch

IntervalScorer currently uses the same scoring mechanism as SpanScorer: it sums the IDF of
all possibly matching terms from its parent IntervalsSource and combines that with a sloppy
frequency to produce a similarity-based score.  This doesn't really make sense, however.
Terms that don't appear in a document can still contribute to its score, and scores from
interval queries look comparable with scores from term or phrase queries when they really
aren't.
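To illustrate the problem, here is a minimal sketch of term-weighted scoring. It is not the actual IntervalScorer or SpanScorer code: the class name, the BM25-style IDF formula, and the plain "weight times sloppy frequency" combination are all simplifying assumptions made for this example. It shows how adding a rare candidate term to the source raises the score of a document that never contains that term.

```java
import java.util.List;

// Hypothetical illustration only; not Lucene's implementation.
final class TermWeightedIntervalScore {

    // BM25-style inverse document frequency, assumed here for illustration.
    static double idf(long docFreq, long docCount) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Sums the IDF of every term the source *could* match,
    // regardless of whether a given document contains it.
    static double weight(List<Long> candidateDocFreqs, long docCount) {
        double sum = 0.0;
        for (long docFreq : candidateDocFreqs) {
            sum += idf(docFreq, docCount);
        }
        return sum;
    }

    // Simplified combination of the query-level weight with a per-document sloppy frequency.
    static double score(double weight, double sloppyFreq) {
        return weight * sloppyFreq;
    }

    public static void main(String[] args) {
        long docCount = 1000;
        double sloppyFreq = 0.5; // same document, same matching intervals in both cases

        // Source with one candidate term (docFreq = 100).
        double oneTerm = score(weight(List.of(100L), docCount), sloppyFreq);
        // Same source plus a rare alternative (docFreq = 5) that this document does not contain.
        double twoTerms = score(weight(List.of(100L, 5L), docCount), sloppyFreq);

        System.out.println("one candidate term:  " + oneTerm);
        System.out.println("two candidate terms: " + twoTerms);
    }
}
```

The document's matches are identical in both cases, yet the second score is higher purely because of a term that is absent from the document.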

I'd like to explore a different scoring mechanism for intervals, based purely on sloppy frequency
and ignoring term weighting.  This should make the scores easier to reason about, as well
as making them useful for things like proximity boosting on boolean queries.
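One possible shape for such a mechanism, as a rough sketch rather than the contents of the attached patch: each matching interval contributes 1 / (1 + slop), where slop is how much wider the interval is than the narrowest possible match, and the summed frequency is then squashed into the range [0, 1). The class name, the minExtent parameter, and the freq / (freq + pivot) saturation are assumptions made for this example.

```java
// Hypothetical illustration only; not the LUCENE-8633 patch.
final class SloppyIntervalScore {

    // intervals: {start, end} position pairs, both inclusive.
    // minExtent: width in positions of the tightest possible match (an assumed parameter).
    static double sloppyFreq(int[][] intervals, int minExtent) {
        double freq = 0.0;
        for (int[] interval : intervals) {
            int width = interval[1] - interval[0] + 1;
            int slop = Math.max(0, width - minExtent);
            freq += 1.0 / (1.0 + slop); // tighter intervals contribute more
        }
        return freq;
    }

    // Saturating transform (assumed): bounded in [0, 1), no term statistics involved.
    static double score(double freq, double pivot) {
        return freq / (freq + pivot);
    }

    public static void main(String[] args) {
        // Two matches for a two-term interval: one adjacent, one with two extra positions.
        int[][] intervals = { {3, 4}, {10, 13} };
        double freq = sloppyFreq(intervals, 2);
        System.out.println("sloppy freq = " + freq);
        System.out.println("score       = " + score(freq, 1.0));
    }
}
```

Because a score like this depends only on how many intervals match and how tight they are, it stays bounded and independent of index statistics, which is what would make it usable as a proximity boost added to a boolean query's score.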



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

