lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: Optimizing Filters
Date Fri, 11 Oct 2013 11:33:12 GMT
Are you going to cache and reuse the filters, e.g. with
CachingWrapperFilter?  The main benefit of filters is in reuse.  Building
a filter takes time in the first place, likely roughly as long as running
the underlying query, although with variations as you describe.  Or are
you saying that querying with filters is slow?
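
To make the point about reuse concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class FilterCacheSketch, the getOrBuild method, and the string keys are all hypothetical names invented for illustration, and a java.util.BitSet stands in for the set of matching document IDs. It only shows that the expensive build step runs once when the same filter is reused across a couple of hundred queries.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

/** Toy model of filter reuse (hypothetical names; this is not the Lucene API). */
final class FilterCacheSketch {
    private final Map<String, BitSet> cache = new HashMap<>();
    private int builds = 0; // how many times a filter was built from scratch

    /**
     * Returns the doc-id set for the given key, building it on first use only.
     * The returned BitSet is shared, so callers must treat it as read-only.
     */
    BitSet getOrBuild(String key, int maxDoc, IntPredicate matches) {
        BitSet bits = cache.get(key);
        if (bits == null) {
            bits = new BitSet(maxDoc);
            // The expensive step: visit every document once.
            for (int doc = 0; doc < maxDoc; doc++) {
                if (matches.test(doc)) {
                    bits.set(doc);
                }
            }
            cache.put(key, bits);
            builds++;
        }
        return bits;
    }

    int builds() {
        return builds;
    }

    public static void main(String[] args) {
        FilterCacheSketch filters = new FilterCacheSketch();
        int maxDoc = 1000; // assumed toy index size
        long matched = 0;
        // Run 200 "queries" that all need the same filter.
        for (int q = 0; q < 200; q++) {
            BitSet evidence =
                filters.getOrBuild("evidence:true", maxDoc, doc -> doc % 2 == 0);
            matched += evidence.cardinality();
        }
        System.out.println("builds=" + filters.builds() + " matched=" + matched);
    }
}
```

If each query instead builds its own filter, the loop over all documents runs once per query, which is roughly the cost of running the underlying query again.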


--
Ian.


On Thu, Oct 10, 2013 at 7:01 PM, James Clarke <jclarke@basistech.com> wrote:
> Are there any best practices for constructing Filters to search efficiently?
> From my non-exhaustive experiments I cannot intuit how to construct my filters
> to achieve best performance.
>
> I have an index (Lucene 4.3) of about 1.8M documents which contain a field
> acting as a flag (evidence:true). Initially all the documents I am interested in
> searching have this field. Later as the index grows some documents will not have
> this field.
>
> In the simplest case I want to filter on documents with evidence:true. I ran a
> couple of hundred queries sequentially and recorded how long they took to
> complete:
>
>  * No filter: ~40s
>  * QueryWrapperFilter(TermQuery(evidence:true)): ~80s
>  * FieldValueFilter(evidence): ~43s
>  * TermsFilter(evidence:true): ~50s
>
> This suggests QueryWrapperFilter (QWF) is a bad idea.
>
> A more complex filter is:
>
>   (evidence:true AND (cid:x OR cid:y ...) AND language:eng)
>
> where 1.8M documents match evidence:true, each cid clause matches 2-4
> documents, there are 1-60 cid clauses, and 1.4M documents match language:eng.
>
> Our initial implementation uses QWF of a BooleanQuery(TQ AND BQ(OR) AND TQ)
> which takes ~210s.
>
> Adjusting this to be a BooleanFilter(TermsFilter AND TermsFilter AND
> TermsFilter) sees things slow down to ~239s!
>
> Any advice on optimizing these filters would be appreciated!
>
> James
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
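
For reference, the shape of the more complex filter quoted above, (evidence:true AND (cid:x OR cid:y ...) AND language:eng), can be sketched over plain document-ID sets. This is only an illustration in standard Java, not Lucene code and not a claim about how Lucene evaluates the filter: CompositeFilterSketch, combine, and docs are hypothetical names, and the tiny document-ID sets are made up.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Toy model of the composite filter (hypothetical names; not the Lucene API). */
final class CompositeFilterSketch {

    /** Computes evidence AND (cid_1 OR cid_2 OR ...) AND language on doc-id sets. */
    static BitSet combine(BitSet evidence, List<BitSet> cids, BitSet language) {
        BitSet result = new BitSet();
        // OR together the cid clauses; with no clauses the result stays empty.
        for (BitSet cid : cids) {
            result.or(cid);
        }
        // Intersect with the two required clauses.
        result.and(evidence);
        result.and(language);
        return result;
    }

    /** Helper that builds a doc-id set from explicit IDs. */
    static BitSet docs(int... ids) {
        BitSet bits = new BitSet();
        for (int id : ids) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet evidence = docs(0, 1, 2, 3, 4, 5); // made-up matches for evidence:true
        BitSet language = docs(0, 1, 2, 3);       // made-up matches for language:eng
        List<BitSet> cids = Arrays.asList(docs(1, 4), docs(3, 7)); // two cid clauses
        System.out.println(combine(evidence, cids, language)); // prints {1, 3}
    }
}
```

In this toy data the cid union is {1, 3, 4, 7}; intersecting with the evidence set drops 7, and intersecting with the language set drops 4.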