lucene-dev mailing list archives

From Doug Cutting
Subject Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/search
Date Mon, 05 Aug 2002 20:23:00 GMT
Scott Ganyo wrote:
> My thought was that the Filter.bits() method on Hits would only resolve the
> BitSet if it was asked for (and probably wouldn't even cache it), so in the
> common case Hits wouldn't suffer any ill effect.  Would that work?  (I feel
> like I'm missing something obvious...)

One could do this, but I'm not sure what the advantage would be.

In your original message on this topic, you wrote:

Scott Ganyo wrote:
 > But instead of adding a new class, why not change Hits to
 > inherit from Filter and add the bits() method to it?
 > Then one could "pipe" the output of one Query into another
 > search without modifying the Queries...

If that's the goal, then a bits() method is not a great way to do this, 
as it ignores the ranking in the first search when ranking the second. 
Since that is a material difference, I prefer to make it explicit.
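
As a rough illustration of that difference, here is a toy sketch in plain Java. It is not Lucene code: the class PipedSearchSketch, its methods, and the rule of adding the two scores per document are all assumptions made for the example. It shows that once the first search is collapsed into a bit set, only the second query's scores can influence the final order.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

/** Hypothetical toy model, not Lucene code: scores are plain arrays indexed by document id. */
class PipedSearchSketch {
    /** Collapses first-search hits into a bit set; the scores are lost at this point. */
    static BitSet bits(double[] firstScores) {
        BitSet b = new BitSet(firstScores.length);
        for (int doc = 0; doc < firstScores.length; doc++) {
            if (firstScores[doc] > 0) b.set(doc);
        }
        return b;
    }

    /** Ranks the documents allowed by the bit set, best first, using only the given scores. */
    static List<Integer> rank(BitSet allowed, double[] scores) {
        List<Integer> docs = new ArrayList<>();
        for (int doc = allowed.nextSetBit(0); doc >= 0; doc = allowed.nextSetBit(doc + 1)) {
            if (scores[doc] > 0) docs.add(doc);
        }
        docs.sort(Comparator.comparingDouble((Integer d) -> -scores[d]));
        return docs;
    }

    /** Assumed combination rule for illustration only: add the two scores per document. */
    static double[] combine(double[] a, double[] b) {
        double[] sum = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            // A document must match both queries to stay in the result.
            sum[i] = (a[i] > 0 && b[i] > 0) ? a[i] + b[i] : 0;
        }
        return sum;
    }
}
```

With first-search scores {0.9, 0.1, 0.0, 0.5} and second-search scores {0.2, 0.8, 0.7, 0.3}, ranking through the bit set alone gives documents 1, 3, 0, while ranking on the combined scores gives 0, 1, 3.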

Filters are not designed for searching within an arbitrary result set. 
For that you really should take the ranking for the first query into 
account: a new query should be formed by adding clauses to the original 
query.  Filters are instead designed to search subsets of an index 
defined by boolean criteria, criteria that do not affect ranking, like 
date, language, postal code, document type, etc.  They are particularly 
useful when the same criterion is used repeatedly, and the bit vector 
can be cached, as the construction and storage of a new bit-vector per 
query is expensive.  Thus the canonical uses of a filter should be to 
implement things like "modified in the last week", or "written in English" 
or "in Word format", not a general-purpose "search within results".
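
The caching point can be sketched in plain Java as well. Again this is a hypothetical stand-in rather than the Lucene Filter API: CriterionFilterCache, its bits() and apply() methods, and the builds counter are names invented for the example. The bit vector for a boolean criterion is built once, reused across queries, and applied to already-ranked hits without changing their order.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

/** Hypothetical stand-in, not the Lucene API: caches one bit vector per named criterion. */
class CriterionFilterCache {
    private final int maxDoc;
    private final Map<String, BitSet> cache = new HashMap<>();
    int builds = 0; // counts how often a bit vector was actually constructed

    CriterionFilterCache(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    /** Returns the cached bit vector for a criterion, building it on first use only. */
    BitSet bits(String name, IntPredicate criterion) {
        return cache.computeIfAbsent(name, k -> {
            builds++;
            BitSet b = new BitSet(maxDoc);
            for (int doc = 0; doc < maxDoc; doc++) {
                if (criterion.test(doc)) b.set(doc);
            }
            return b;
        });
    }

    /** Keeps the hits (doc ids already in rank order) whose bit is set; order is unchanged. */
    static List<Integer> apply(BitSet bits, List<Integer> rankedDocs) {
        List<Integer> kept = new ArrayList<>();
        for (int doc : rankedDocs) {
            if (bits.get(doc)) kept.add(doc);
        }
        return kept;
    }
}
```

Because the criterion only decides membership, the same cached bit vector can serve every query that needs, say, "written in English", and the ranking stays entirely the query's business.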

Does that make sense?

