lucene-java-user mailing list archives

From "Kelvin Tan" <>
Subject Re: Question on the FAQ list with filters
Date Thu, 28 Mar 2002 03:17:12 GMT
The API provides the Filter mechanism for filtering out hits before they are
returned.

The alternative is to write your own classes to filter out documents
returned after searching.
If it's not efficient to check every single document, or the results are
not obtained in batch, then this post-search approach is probably better.
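
A minimal sketch of the post-search alternative described above, written
in plain Java without any Lucene classes. The class name PostSearchFilter,
the collect() method, and the isAllowed check are hypothetical names chosen
for illustration; the ranked hit list is modelled as an array of document
numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class PostSearchFilter {
    /**
     * Walks the ranked hit list in order and keeps only the hits that the
     * (possibly expensive) check accepts, stopping once `wanted` hits have
     * been collected.
     */
    public static List<Integer> collect(int[] rankedDocIds,
                                        IntPredicate isAllowed,
                                        int wanted) {
        List<Integer> kept = new ArrayList<>();
        for (int docId : rankedDocIds) {
            if (kept.size() >= wanted) {
                break; // hits past this point are never checked
            }
            if (isAllowed.test(docId)) {
                kept.add(docId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] hits = {7, 3, 9, 1, 4};
        // Hypothetical rule: only odd-numbered documents are accessible.
        System.out.println(collect(hits, id -> id % 2 == 1, 2));
    }
}
```

The check runs only for hits that are actually examined, so when only the
first page of results is needed, most documents are never checked at all.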

I currently run a query through my database to return a list of documents
which a particular user is allowed to access. The Filter method thus makes
a good deal of sense for me, since I'm able to obtain the results in batch.
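
A minimal sketch of the batch approach described above, assuming the
database query yields the document numbers the user may access. It uses
only java.util.BitSet and no Lucene classes; BatchAccessFilter,
allowedBits() and apply() are hypothetical names for illustration. The
idea of one bit per document mirrors how a Filter subclass reports its
decision as a BitSet, but the wiring into search() is not shown here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class BatchAccessFilter {
    /** Builds one bit per document; a set bit marks a document the user may see. */
    public static BitSet allowedBits(int maxDoc, int[] allowedDocIds) {
        BitSet bits = new BitSet(maxDoc);
        for (int docId : allowedDocIds) {
            // Ignore ids outside the index rather than growing the bit set.
            if (docId >= 0 && docId < maxDoc) {
                bits.set(docId);
            }
        }
        return bits;
    }

    /** Keeps only the matching documents whose bit is set, preserving order. */
    public static List<Integer> apply(BitSet bits, int[] matchingDocIds) {
        List<Integer> kept = new ArrayList<>();
        for (int docId : matchingDocIds) {
            if (docId >= 0 && bits.get(docId)) {
                kept.add(docId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical data: 10 documents in the index, 3 accessible to the user.
        BitSet bits = allowedBits(10, new int[]{1, 4, 8});
        System.out.println(apply(bits, new int[]{8, 2, 4, 5}));
    }
}
```

Because the allowed set is computed once, each lookup during the search is
a single bit test, which is why this fits the case where the access list
can be fetched in one batch.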

From my interpretation of the FAQ, it seems that you're expected to write
your own code to perform post-search filtering and not use/subclass the
Filter class. Of course, this could be made slightly clearer...


----- Original Message -----
From: "Armbrust, Daniel C." <>
To: "'Lucene Users List'" <>
Sent: Thursday, March 28, 2002 5:52 AM
Subject: Question on the FAQ list with filters

> From the FAQ:
> ***
> 16. What is filtering and how is it performed?
> Filtering means imposing additional restriction on the hit list to eliminate
> hits that otherwise would be included in the search results. There are two
> ways to filter hits:
> * Search Query - in this approach, provide your custom filter object
> when you call the search() method. This filter will be called exactly once
> to evaluate every document that resulted in non zero score.
> * Selective Collection - in this approach you perform the regular search and,
> when you get back the hit list, collect only those that match your
> filtering criteria. In this approach, your filter is called only for hits
> that are returned by the search method, which may be only a subset of the non
> zero matches (useful when evaluating your search filter is expensive).
> ***
> I don't see why the second way is useful.  Yes, your filter is called only
> for hits that got returned by the search method, but aren't those the same
> hits that the search() method would run through the filter?  Maybe I'm
> not reading it closely enough.
> Is my assumption correct that it is faster to provide a filter to the
> search() method than to do a selective collection?
