lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: A lot of short documents, optimal query?
Date Sun, 13 Nov 2005 01:26:32 GMT

: 2. Filters are perfect for what they do well:
: filtering. But using them to reimplement
: BooleanQuery, mirroring everything with filters, would
: introduce a lot of redundancy. BooleanQuery, for example,
: does a lot of cool optimizations, short-circuiting
: expressions... and more or less the same things would
: have to be reimplemented using filters.

More specifically: the Filter API won't let you write code that can
short-circuit.  ChainedFilter has no way of telling any of its sub-Filters
"you can skip ahead to document #n" the way BooleanQuery can with its
sub-Queries.  For arbitrary search criteria entered by users, this means
building up a complex hierarchy of Filters can be less efficient than
building up a complex hierarchy of queries -- even if you don't care about

But for things that you can cache and reuse, Filters kick ass.
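
To make the skip-ahead point concrete, here is a rough sketch in plain Java.
It is not Lucene code: Cursor, countBySkipping, toBits and countByBitSets are
hypothetical names made up for illustration, and a sorted int array stands in
for a term's posting list.  The cursor side can jump straight to "document #n";
the bit-set side has to visit every posting before any AND can happen.

```java
import java.util.BitSet;

/** Illustrative sketch only; these names are hypothetical, not Lucene classes. */
class SkipAheadSketch {
    static final int DONE = Integer.MAX_VALUE;

    /** Cursor over a sorted doc-id list; counts how many entries it inspects. */
    static final class Cursor {
        private final int[] docs;
        private int pos = -1;
        int inspected = 0;

        Cursor(int[] docs) { this.docs = docs; }

        /** Moves to the first doc id >= target that lies past the current position. */
        int skipTo(int target) {
            int lo = pos + 1, hi = docs.length;
            while (lo < hi) {                       // binary search over the remainder
                int mid = (lo + hi) >>> 1;
                inspected++;
                if (docs[mid] < target) lo = mid + 1; else hi = mid;
            }
            pos = lo;
            return pos < docs.length ? docs[pos] : DONE;
        }
    }

    /** Query-style AND: each side tells the other how far it may skip ahead. */
    static int countBySkipping(Cursor a, Cursor b) {
        int matches = 0;
        int da = a.skipTo(0), db = b.skipTo(0);
        while (da != DONE && db != DONE) {
            if (da == db) {
                matches++;
                da = a.skipTo(da + 1);
                db = b.skipTo(db + 1);
            } else if (da < db) {
                da = a.skipTo(db);                  // "you can skip ahead to document #db"
            } else {
                db = b.skipTo(da);
            }
        }
        return matches;
    }

    /** Filter-style: every posting is visited once to build the bit set. */
    static BitSet toBits(int[] docs, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc : docs) bits.set(doc);
        return bits;
    }

    /** Filter-style AND over two already-built bit sets. */
    static int countByBitSets(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);
        return both.cardinality();
    }

    public static void main(String[] args) {
        int[] frequent = new int[10000];
        for (int i = 0; i < frequent.length; i++) frequent[i] = i;
        int[] rare = {42, 7777};

        Cursor f = new Cursor(frequent), r = new Cursor(rare);
        System.out.println("matches by skipping: " + countBySkipping(f, r));
        System.out.println("frequent-list entries inspected: " + f.inspected);
        System.out.println("matches by bit sets: "
                + countByBitSets(toBits(frequent, 10000), toBits(rare, 10000)));
    }
}
```

Both routes find the same matches.  The difference is in the work done: the
skipping version inspects only a small part of the frequent list, while toBits
walks all of it.  Once a bit set has been built, though, it can be kept around
and ANDed again for later searches, which is where the caching advantage
comes from.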

: 3. ConstantScoreQuery:
: I am a bit unsure here, but this looks like a bridge
: that enables Filters to enter "regular" Query world.

Correct.  ConstantScoreQuery is just a way to leverage the advantages of a
Filter in situations where you need a Query.  ConstantScoreRangeQuery is the
prime example: you can subclass QueryParser to use it.
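
The "bridge" idea can be sketched without any Lucene dependency.  Everything
below is an assumption made for illustration: DocFilter, ScoredQuery,
constantScore and rangeFilter are hypothetical stand-ins, not the real
classes.  The point is only that a yes/no filter becomes something query-like
once every document it accepts is given the same fixed score.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; these types are hypothetical, not Lucene classes. */
class ConstantScoreSketch {

    /** Stand-in for a filter: a yes/no answer per doc id. */
    interface DocFilter {
        BitSet bits(int maxDoc);
    }

    /** Stand-in for a query: matching doc ids mapped to their scores. */
    interface ScoredQuery {
        Map<Integer, Float> search(int maxDoc);
    }

    /** Wraps a filter so that every accepted doc gets the same score. */
    static ScoredQuery constantScore(DocFilter filter, float score) {
        return maxDoc -> {
            BitSet bits = filter.bits(maxDoc);
            Map<Integer, Float> hits = new LinkedHashMap<>();
            for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
                hits.put(doc, score);
            }
            return hits;
        };
    }

    /** Range filter over one numeric value per doc: accepts lo <= value <= hi. */
    static DocFilter rangeFilter(int[] valueByDoc, int lo, int hi) {
        return maxDoc -> {
            BitSet bits = new BitSet(maxDoc);
            int limit = Math.min(maxDoc, valueByDoc.length);
            for (int doc = 0; doc < limit; doc++) {
                if (valueByDoc[doc] >= lo && valueByDoc[doc] <= hi) bits.set(doc);
            }
            return bits;
        };
    }

    public static void main(String[] args) {
        int[] values = {3, 10, 7, 25, 8};           // one value per doc id 0..4
        ScoredQuery q = constantScore(rangeFilter(values, 5, 10), 1.0f);
        System.out.println(q.search(values.length));
    }
}
```

Because the wrapped result looks like any other set of scored hits, it can be
combined with ordinary scoring clauses; the range part just contributes the
same constant to every document it lets through.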

: 1. Make a TermFilter for all unique, high frequency
: 2. wrap those TermFilters in ConstantScoreQuery,
: 3. combine this inside BooleanQuery as before (Boolean
: mix of term queries and ConstantScoreQueries)

Unless those terms are very frequently used in searches (in various
combinations), you may be better off just using regular TermQueries.

Even with the BooleanQuery being smart, and knowing when to skip docs
based on its sub-queries, what I said before about not being able to
short-circuit a Filter still applies if those Filters are inside

