lucene-dev mailing list archives

From "Aaron Daubman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4100) Maxscore - Efficient Scoring
Date Wed, 27 Nov 2013 13:54:37 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833812#comment-13833812
] 

Aaron Daubman commented on LUCENE-4100:
---------------------------------------

Thanks for the update [~spo] - my particular use case seems tailor-made for this. I have several
fairly large Solr instances (10-30 GB indices), all of which run in read-only mode and are
rebuilt about twice a day via a snapshot process that rolls the index out to load-balanced servers.
Several of these instances routinely match 30-80% of the 2-25M docs in the index per query
(custom MLT-like queries), so efficient scoring would be a huge win here.

I already have to patch and custom-build Solr for our use (until I get around to creating the
tests required to have SOLR-2052 accepted) and am wondering if you have any thoughts/guidance
on trying out your patch?

The main use case is a custom extension of QueryComponent that overrides prepare() and
essentially builds up a custom boosted boolean query and uses rb.setQueryString and rb.setFilters...
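
As a rough illustration of the "boosted boolean query" step only, here is a minimal sketch in plain Java. It is not the code from this comment: the class name, the helper buildBoostedQuery, the field "tags" and the boost values are all hypothetical, and the sketch assumes the query is assembled as a Lucene classic-syntax string that a prepare() override could then hand to rb.setQueryString. No Solr classes are used, so it runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

public class BoostedQuerySketch {

    // Hypothetical helper: turns term -> boost pairs into a single
    // OR-joined query string in Lucene classic query syntax.
    static String buildBoostedQuery(String field, Map<String, Float> termBoosts) {
        StringJoiner clauses = new StringJoiner(" OR ");
        for (Map.Entry<String, Float> e : termBoosts.entrySet()) {
            // Quote each term as a phrase and escape backslashes and quotes.
            String term = e.getKey().replace("\\", "\\\\").replace("\"", "\\\"");
            // Locale.ROOT keeps the decimal separator a dot on every machine.
            clauses.add(String.format(Locale.ROOT, "%s:\"%s\"^%.1f",
                    field, term, e.getValue()));
        }
        return clauses.toString();
    }

    public static void main(String[] args) {
        // Assumed example data; LinkedHashMap preserves insertion order.
        Map<String, Float> boosts = new LinkedHashMap<>();
        boosts.put("jazz", 2.0f);
        boosts.put("piano trio", 1.5f);
        System.out.println(buildBoostedQuery("tags", boosts));
    }
}
```

In a real component the resulting string would presumably be passed to rb.setQueryString inside prepare(), with any filter queries set separately via rb.setFilters; that wiring is assumed here and not shown.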

> Maxscore - Efficient Scoring
> ----------------------------
>
>                 Key: LUCENE-4100
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4100
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs, core/query/scoring, core/search
>    Affects Versions: 4.0-ALPHA
>            Reporter: Stefan Pohl
>              Labels: api-change, patch, performance
>             Fix For: 4.7
>
>         Attachments: contrib_maxscore.tgz, maxscore.patch
>
>
> At Berlin Buzzwords 2012, I will be presenting 'maxscore', an efficient algorithm first
> published in the IR domain in 1995 by H. Turtle & J. Flood, that I find deserves more
> attention among Lucene users (and developers).
> I implemented a proof of concept and did some performance measurements with example queries
> and lucenebench, the package of Mike McCandless, resulting in very significant speedups.
> This ticket is to get the discussion started on including the implementation in Lucene's
> codebase. Because the technique requires awareness of it from the Lucene user/developer,
> it seems best for it to become a contrib/module package so that it can be consciously chosen
> for use.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

