lucene-dev mailing list archives

From "Mikhail Khludnev (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-5460) Allow driving a query by sparse filters
Date Mon, 10 Mar 2014 19:01:46 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13926048#comment-13926048
] 

Mikhail Khludnev edited comment on LUCENE-5460 at 3/10/14 7:00 PM:
-------------------------------------------------------------------

LUCENE-5495 
bq. Really, this is all one giant hack/workaround, because Lucene is unable to properly/generally
handle the "post filter" use case (something Solr has had for some time). I think we should
fix that; i.e., we need some way for a Filter to express that 1) it's random-access (supports
Bits), and 2) it's very costly. 

[~mikemccand] let me trouble you with the attached SampleSlowQuery, which:
* implements post-filtering via SlowQueryScorer.confirm(int)
* can be random-access; however, that's not my preferred case, since I'd like to post-filter while observing
the state of the underlying leap-frogging scorers
* also allows handling the custom ranking case.

Your feedback is much appreciated! Thanks
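
The attachment itself is not reproduced in this message, so the following is only a rough sketch of the post-filtering idea in the list above, not the attached code. It assumes a hypothetical costly check passed in as `confirm`, and uses a plain two-pointer intersection of sorted doc-ID arrays as a stand-in for leap-frogging scorers; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Illustrative only: none of these names come from Lucene or from the attachment. */
final class PostFilterSketch {

    /**
     * Intersects two sorted, duplicate-free doc-ID arrays, then asks the
     * (assumed expensive) confirm check only about docs both sides agree on.
     */
    static List<Integer> search(int[] a, int[] b, IntPredicate confirm) {
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;                      // a is behind, move it forward
            } else if (a[i] > b[j]) {
                j++;                      // b is behind, move it forward
            } else {
                // Candidate found by the cheap clauses: run the costly check last.
                if (confirm.test(a[i])) {
                    hits.add(a[i]);
                }
                i++;
                j++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7, 9};
        int[] b = {3, 4, 5, 9};
        // The lambda stands in for an expensive per-document check.
        System.out.println(search(a, b, doc -> doc != 5));
    }
}
```

The point of the ordering is that `confirm` runs only on documents that survived the cheap intersection, so its cost scales with the number of candidates rather than with the size of either input.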


was (Author: mkhludnev):
bq. LUCENE-5495 Really, this is all one giant hack/workaround, because Lucene is
unable to properly/generally handle the "post filter" use case
(something Solr has had for some time). I think we should fix that;
i.e., we need some way for a Filter to express that 1) it's random-access
(supports Bits), and 2) it's very costly. 

[~mikemccand] let me trouble you with the attached SampleSlowQuery, which:
* implements post-filtering via SlowQueryScorer.confirm(int)
* can be random-access; however, that's not my preferred case, since I'd like to post-filter while observing
the state of the underlying leap-frogging scorers
* also allows handling the custom ranking case.

Your feedback is much appreciated! Thanks

> Allow driving a query by sparse filters
> ---------------------------------------
>
>                 Key: LUCENE-5460
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5460
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Shai Erera
>         Attachments: TestSlowQuery.java
>
>
> Today, if a filter is very sparse, we execute the query in a sort of leap-frog manner between
> the query and the filter. If the query is very expensive to compute and/or matches only a few docs,
> calling scorer.advance(doc) just to discover that the doc it landed on isn't accepted
> by the filter is a waste of time. Since the Filter is always the "final ruler", I wonder whether,
> if we had something like {{boolean DISI.advanceExact(doc)}}, we could use it instead in some cases.
> There are many combinations in which I think we'd want to use or not use this API, and they
> depend on: the Filter's complexity, Filter.cost(), Scorer.cost(), query complexity (span-near,
> many clauses), etc.
> I'm opening this issue so we can discuss. DISI.advanceExact(doc) is just a preliminary proposal,
> to get an API we could experiment with. The default implementation should be fairly easy and
> straightforward, and we could override it where we can offer a more optimized implementation.
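
To make the trade-off in the description concrete, here is a toy sketch comparing a leap-frog loop with a filter-driven loop that uses an `advanceExact`-style call. It does not use Lucene classes: `CostlyScorer`, its linear-scan `advance`, and the `checks` counter are all assumptions made for illustration, and a real scorer's `advance` cost would depend on the query.

```java
import java.util.function.IntPredicate;

/** Illustrative only: a toy model, not the Lucene DocIdSetIterator API. */
final class AdvanceExactSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical expensive scorer; each match test counts as one unit of work. */
    static final class CostlyScorer {
        private final IntPredicate matches;
        private final int maxDoc;
        int checks = 0;

        CostlyScorer(IntPredicate matches, int maxDoc) {
            this.matches = matches;
            this.maxDoc = maxDoc;
        }

        /** Scans forward to the first matching doc >= target. */
        int advance(int target) {
            for (int doc = target; doc < maxDoc; doc++) {
                checks++;
                if (matches.test(doc)) {
                    return doc;
                }
            }
            return NO_MORE_DOCS;
        }

        /** Tests only the target doc; never looks past it. */
        boolean advanceExact(int target) {
            if (target >= maxDoc) {
                return false;
            }
            checks++;
            return matches.test(target);
        }
    }

    /** Leap-frog: scorer and filter take turns catching up with each other. */
    static int countLeapFrog(int[] filterDocs, CostlyScorer scorer) {
        int hits = 0;
        int i = 0;
        int scorerDoc = -1;
        while (i < filterDocs.length) {
            int filterDoc = filterDocs[i];
            if (scorerDoc < filterDoc) {
                scorerDoc = scorer.advance(filterDoc);
            }
            if (scorerDoc == filterDoc) {
                hits++;
                i++;
            } else {
                // Scorer overshot (or is exhausted): move the filter up to it.
                while (i < filterDocs.length && filterDocs[i] < scorerDoc) {
                    i++;
                }
            }
        }
        return hits;
    }

    /** Filter-driven: the sparse filter leads and the scorer only confirms. */
    static int countFilterDriven(int[] filterDocs, CostlyScorer scorer) {
        int hits = 0;
        for (int doc : filterDocs) {
            if (scorer.advanceExact(doc)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] filterDocs = {5, 100, 450, 900};   // sorted, sparse filter
        CostlyScorer a = new CostlyScorer(d -> d % 100 == 0, 1000);
        CostlyScorer b = new CostlyScorer(d -> d % 100 == 0, 1000);
        System.out.println("leap-frog hits=" + countLeapFrog(filterDocs, a) + " checks=" + a.checks);
        System.out.println("filter-driven hits=" + countFilterDriven(filterDocs, b) + " checks=" + b.checks);
    }
}
```

Both loops return the same hits; the difference is that the filter-driven loop performs one match test per filter doc, while the leap-frog loop also pays for every doc the scorer scans past. Whether that wins in practice depends on the factors listed in the description, such as the relative costs of the filter and the scorer.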



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

