lucene-dev mailing list archives

From Michael McCandless
Subject Re: Lucene's default settings & back compatibility
Date Tue, 19 May 2009 10:39:34 GMT
On Mon, May 18, 2009 at 8:51 PM, Yonik Seeley wrote:
> On Mon, May 18, 2009 at 5:06 PM, Michael McCandless wrote:
>>  * StopFilter should enable position increments by default
> Is this one an actual improvement in the general case?
> A query of "foo bar" then wouldn't match a document with "foo and
> bar", but a query of "foo the bar" would.

Well... I think I'd argue that this is an improvement, i.e. the query
"foo bar" should not in fact match a doc with "foo and bar" (unless
your PhraseQuery is using slop).  If you really want slop in your
matching, you should just use slop.

Query "foo the bar" will match document "foo and bar" in either case,
so that query doesn't distinguish the two settings.
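
To make the two cases concrete, here is a rough, self-contained sketch.
It is a toy model and not Lucene code: the class, its methods, the
two-word stop list and the keepHoles flag are all made up for
illustration, and it assumes the query and the document are analyzed
with the same setting.  keepHoles stands in for "position increments
enabled", and matching is exact phrase matching with no slop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of stop-word removal and zero-slop phrase matching (not Lucene code). */
class PositionIncrementDemo {

    // Hypothetical stop list, just large enough for the examples in this thread.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("and", "the"));

    /** A term together with its position in the token stream. */
    static final class Token {
        final String term;
        final int position;
        Token(String term, int position) { this.term = term; this.position = position; }
    }

    /** Splits on whitespace and drops stop words; with keepHoles, a dropped word still advances the position. */
    static List<Token> analyze(String text, boolean keepHoles) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        for (String word : text.trim().toLowerCase().split("\\s+")) {
            if (word.isEmpty()) continue;
            if (STOP_WORDS.contains(word)) {
                if (keepHoles) position++; // leave a gap where the stop word was
                continue;
            }
            tokens.add(new Token(word, position++));
        }
        return tokens;
    }

    static boolean contains(List<Token> doc, String term, int position) {
        for (Token t : doc) {
            if (t.position == position && t.term.equals(term)) return true;
        }
        return false;
    }

    /** Exact phrase match: every query term must occur at the same relative position in the doc. */
    static boolean matchesPhrase(List<Token> doc, List<Token> query) {
        if (query.isEmpty()) return false;
        Token first = query.get(0);
        for (Token start : doc) {
            if (!start.term.equals(first.term)) continue;
            int offset = start.position - first.position;
            boolean all = true;
            for (Token q : query) {
                if (!contains(doc, q.term, q.position + offset)) { all = false; break; }
            }
            if (all) return true;
        }
        return false;
    }

    /** Analyzes doc and query with the same setting, then phrase-matches them. */
    static boolean matches(String doc, String query, boolean keepHoles) {
        return matchesPhrase(analyze(doc, keepHoles), analyze(query, keepHoles));
    }

    public static void main(String[] args) {
        String doc = "foo and bar";
        System.out.println(matches(doc, "foo bar", false));     // true: the hole is discarded
        System.out.println(matches(doc, "foo bar", true));      // false: the hole is preserved
        System.out.println(matches(doc, "foo the bar", false)); // true
        System.out.println(matches(doc, "foo the bar", true));  // true
    }
}
```

In this model only the query "foo bar" changes behavior when the holes
are kept; "foo the bar" matches either way.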

Also, it's bothersome that by default StopFilter throws away more
information than it needs to.  I.e., it's already discarding words
(that's its purpose) but the fact that it then also discards the holes
left behind, by default, is not good, I think.

I went and re-read
Since both QueryParser and StopFilter can now preserve position
increments, I'd think we would want to change both to do so (in the
*Settings classes)?

(And, QueryParser is another great example where a *Settings class
would give us much more freedom to fix its quirks w/o breaking back

Anyway, this is a great debate, in that any defaults set in Lucene
over time should be scrutinized, through discussions like this, rather
than always being forced to stay on their back-compat values.  The
Settings class would give us this freedom.

