lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-7347) Remove queryNorm and coords
Date Mon, 20 Jun 2016 14:15:05 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339551#comment-15339551
] 

Michael McCandless commented on LUCENE-7347:
--------------------------------------------

No modern IR scoring models use coord anymore ... they simply have better term saturation
such that zillions of occurrences of one term, by design, cannot alter the score as much as
one occurrence of one of the other terms in the query.  This makes coord obsolete.
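
A minimal sketch of the saturation point above, not Lucene's actual code: it compares the sqrt(tf) term-frequency component used by the classic TF-IDF similarity with a BM25-style saturating component. The class and method names are hypothetical, IDF is assumed equal for both terms, and length normalization is ignored (as if b = 0).

```java
public class SaturationSketch {
    // BM25-style tf component without length normalization (assumption: b = 0).
    // It approaches, but never reaches, k1 + 1 as tf grows.
    public static double bm25Tf(double tf, double k1) {
        return tf * (k1 + 1) / (tf + k1);
    }

    // Classic TF-IDF tf component: grows without bound.
    public static double classicTf(double tf) {
        return Math.sqrt(tf);
    }

    // Coord factor: fraction of query terms that the document matches.
    public static double coord(int matched, int total) {
        return (double) matched / total;
    }

    public static void main(String[] args) {
        double k1 = 1.2; // BM25 default in Lucene

        // Doc A: 1000 occurrences of term 1, none of term 2.
        // Doc B: one occurrence of each of the two query terms.
        double classicA = classicTf(1000) * coord(1, 2);
        double classicB = (classicTf(1) + classicTf(1)) * coord(2, 2);
        double bm25A = bm25Tf(1000, k1);
        double bm25B = bm25Tf(1, k1) + bm25Tf(1, k1);

        System.out.println("classic with coord: A=" + classicA + " B=" + classicB);
        System.out.println("bm25 without coord: A=" + bm25A + " B=" + bm25B);
    }
}
```

With sqrt(tf), the single-term document's score keeps growing with tf even after coord halves it. With the saturating component, that document's contribution is capped below k1 + 1 however often the term repeats, so there is far less for coord to correct.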

I don't think Lucene should cling to archaic scoring models, especially when this clinging
holds back important improvements, e.g. {{BooleanQuery}} could do more aggressive rewriting
if its hands were not tied by coord.
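
One hypothetical illustration of how coord can constrain rewriting (the comment itself does not name a specific rewrite): merging duplicate clauses into a single boosted clause changes the number of clauses, and so changes the coord factor. The sketch below models clauses as plain strings and does not use Lucene's API; all names are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CoordRewriteSketch {
    // Coord factor: matching clauses divided by total clauses.
    public static double coord(int overlap, int maxOverlap) {
        return (double) overlap / maxOverlap;
    }

    // Counts the clauses whose term occurs in the document.
    public static int overlap(List<String> clauses, Set<String> docTerms) {
        int n = 0;
        for (String clause : clauses) {
            if (docTerms.contains(clause)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList("a", "a", "b");
        // Hypothetical rewrite: the two "a" clauses merged into one boosted clause.
        List<String> rewritten = Arrays.asList("a", "b");
        Set<String> doc = new HashSet<>(Arrays.asList("a"));

        double before = coord(overlap(original, doc), original.size());
        double after = coord(overlap(rewritten, doc), rewritten.size());
        System.out.println("coord before rewrite: " + before);
        System.out.println("coord after rewrite:  " + after);
    }
}
```

Because the two coord values differ, such a rewrite would change scores while coord is in play; without coord, only the summed clause scores matter.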

> Remove queryNorm and coords
> ---------------------------
>
>                 Key: LUCENE-7347
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7347
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>
> These two features are specific to TF-IDF and introduce some complexity (see e.g. handling
> of coords in BooleanWeight) and bugs/corner cases (see e.g. how taking the query norm into
> account causes scoring challenges on LUCENE-7337).
> Since we made BM25 the default in 6.0, I propose that we remove these TF-IDF-specific
> features in 7.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

