lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3381) Sandbox remaining contrib queries
Date Thu, 18 Aug 2011 09:53:28 GMT


Robert Muir commented on LUCENE-3381:

To work across different scoring systems generically I expect IDF-tweakage would need to be
made a pluggable aspect of all these scoring strategies e.g. through a common interface. Messy.

I don't think we need to do that?
I added a comment to the source code:
    // TODO: generalize this query (at least it should not reuse this static sim!
    // a better way might be to convert this into multitermquery rewrite methods.
    // the rewrite method can 'average' the TermContext's term statistics (docfreq,totalTermFreq)
    // provided to TermQuery, so that the general idea is agnostic to any scoring system...

I don't think this is really that hard, nor messy? 
Then this Query just rewrites to a BooleanQuery of ordinary FuzzyQuery instances, setting
its custom rewrite method on each (it looks like we need to implement 2 of these, depending
upon configuration).

The rewrite methods would average docFreq and totalTermFreq (the only two "collection-wide"
term statistics Lucene supports), and set these in the TermContexts that they pass to TermQuery.
Then the concept works for all scoring systems.
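
A minimal sketch of just the averaging step, in plain Java with no Lucene dependency.
The class names AveragedTermStats and Stats are hypothetical and used only for
illustration; a real rewrite method would read the per-term values from the index and
hand the averaged values to TermQuery via a TermContext, which is not shown here. Integer
division is an assumption of this sketch, not something the comment above specifies.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of averaging collection-wide term statistics. */
final class AveragedTermStats {

    /** Hypothetical holder for the two statistics named above. */
    static final class Stats {
        final long docFreq;        // number of documents containing the term
        final long totalTermFreq;  // total occurrences of the term in the index

        Stats(long docFreq, long totalTermFreq) {
            this.docFreq = docFreq;
            this.totalTermFreq = totalTermFreq;
        }
    }

    /** Averages docFreq and totalTermFreq over all expanded term variants. */
    static Stats average(List<Stats> variants) {
        if (variants.isEmpty()) {
            throw new IllegalArgumentException("need at least one term variant");
        }
        long docFreqSum = 0;
        long totalTermFreqSum = 0;
        for (Stats s : variants) {
            docFreqSum += s.docFreq;
            totalTermFreqSum += s.totalTermFreq;
        }
        long n = variants.size();
        // Integer division (assumption): rounds down.
        return new Stats(docFreqSum / n, totalTermFreqSum / n);
    }

    public static void main(String[] args) {
        List<Stats> variants = Arrays.asList(
            new Stats(1000, 4000),  // a common spelling
            new Stats(10, 20),      // a rarer variant
            new Stats(2, 2));       // a likely typo
        Stats shared = average(variants);
        // Every variant would be scored with these same shared statistics.
        System.out.println(shared.docFreq + " " + shared.totalTermFreq);  // prints "337 1340"
    }
}
```

Because every variant is given the same averaged statistics, a rare misspelling is not
rewarded with an inflated rarity weight, whichever scoring system consumes the numbers.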

As a side benefit, this would also improve performance, since the term rewrite becomes
single-pass instead of doing wasted seeks per-segment * per-term.

> Sandbox remaining contrib queries
> ---------------------------------
>                 Key: LUCENE-3381
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Chris Male
>         Attachments: LUCENE-3381.patch
> In LUCENE-3271, I moved the 'good' queries from the queries contrib to new destinations
(primarily the queries module).  The remnants now need to find their home.  As suggested in
LUCENE-3271, these classes are not bad per se, just odd.  So let's create a sandbox contrib
that they and other 'odd' contrib classes can go to.  We can then decide their fate at another
