lucene-dev mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject Re: Lucene's default settings & back compatibility
Date Tue, 19 May 2009 08:34:27 GMT

>When you create IndexReader, IndexWriter and others, you must pass in a Settings
> instance.

I think this would also help solve the steady growth of constructor variations (18 in 2.4's
IndexWriter vs 3 in Lucene 1.9).
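
A minimal sketch of how that might look, assuming a design along the lines Mike describes below. None of these classes exist in Lucene: the names Settings, SettingsMatching24 and LatestAndGreatestSettings come from his proposal, while SketchWriter and the three accessor methods are hypothetical, picked only to mirror defaults mentioned in his mail (read-only readers, scoring when sorting by field, StopFilter position increments).

```java
// Hypothetical sketch only: these classes are NOT part of Lucene.

/** Base class holding defaults that would otherwise be spread across constructors. */
abstract class Settings {
    /** Whether readers are opened read-only by default. */
    abstract boolean readOnlyReader();

    /** Whether documents are scored when sorting by field. */
    abstract boolean scoreWhenSorting();

    /** Whether StopFilter enables position increments. */
    abstract boolean stopFilterPositionIncrements();
}

/** Pins every default to the 2.4 behavior described in the thread. */
class SettingsMatching24 extends Settings {
    boolean readOnlyReader() { return false; }
    boolean scoreWhenSorting() { return true; }
    boolean stopFilterPositionIncrements() { return false; }
}

/** Tracks the improved defaults; these may change on any release. */
class LatestAndGreatestSettings extends Settings {
    boolean readOnlyReader() { return true; }
    boolean scoreWhenSorting() { return false; }
    boolean stopFilterPositionIncrements() { return true; }
}

/** Stand-in for a writer/reader class: one constructor, settings passed explicitly. */
final class SketchWriter {
    private final String indexPath;
    private final Settings settings;

    SketchWriter(String indexPath, Settings settings) {
        if (settings == null) {
            throw new IllegalArgumentException("settings must not be null");
        }
        this.indexPath = indexPath;
        this.settings = settings;
    }

    String indexPath() { return indexPath; }

    Settings settings() { return settings; }
}
```

With something like this, a new option becomes one more method on Settings rather than one more constructor overload, and callers pick their behavior with new SketchWriter(path, new SettingsMatching24()) or new SketchWriter(path, new LatestAndGreatestSettings()).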

----- Original Message ----
From: Otis Gospodnetic <otis_gospodnetic@yahoo.com>
To: java-dev@lucene.apache.org
Sent: Tuesday, 19 May, 2009 2:43:08
Subject: Re: Lucene's default settings & back compatibility


Me like!

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Michael McCandless <lucene@mikemccandless.com>
> To: java-dev@lucene.apache.org
> Sent: Monday, May 18, 2009 5:06:39 PM
> Subject: Lucene's default settings & back compatibility
> 
> As we all know, Lucene's back-compat policy necessarily hurts the
> out-of-the-box experience for new users: because we are only allowed
> to make substantial improvements to Lucene's default settings at a major
> release, new users won't see the improvements to our settings until a
> major release (typically years apart).
> 
> Lucene has a number of default settings, eg some recent examples:
> 
>   * Read-only IndexReader gives much better performance with threads,
>     yet we must now default IndexReader.open to return a non-readOnly
>     reader
> 
>   * We can now optionally turn off scoring when sorting by field
>     (sizable speed gain), but we had to leave it on by default until
>     3.0
> 
>   * Letting IndexReader.norms return null
> 
>   * LogMergePolicy now takes deletions into account, but we had to
>     disable it by default, since it could conceivably break back
>     compat.
> 
>   * Bug fixes in StandardAnalyzer must be delayed until 3.0 since
>     there's a remote chance they'd break back compat in an app, or we
>     end up adding confusing methods like "public static void
>     setDefaultReplaceInvalidAcronym".
> 
>   * NIOFSDirectory ought to be "the default" on UNIX, but it's not
> 
>   * Constant score rewrite ought to be the default for most multi-term
>     queries
> 
>   * StopFilter should enable position increments by default
> 
> The fact that we are "forced" to delay such "out of the box" improvements
> to Lucene for so long is a frustrating cost, since it can only stunt
> Lucene's adoption and growth and my sense is that it's a minority of
> Lucene's users that need such strict back-compat (this has been
> discussed before).  It also clutters our APIs because we end up
> creating setters/getters that often exist only to preserve a bug for
> the sake of back compat.
> 
> I think we can fix this.  Ie, maintain our strong back-compat policy,
> yet still allow new users to experience the best of Lucene on every
> release (not just on major releases), by creating an explicit class
> that holds settings/defaults used by Lucene.
> 
> For example, say we create a base class named Settings.  It holds the
> defaults for settings across all of Lucene's classes. When you create
> IndexReader, IndexWriter and others, you must pass in a Settings
> instance.
> 
> A subclass, SettingsMatching24, binds all settings to "match" 2.4's
> behavior.  When we make improvements in 2.9, we'd add the back-compat
> settings to SettingsMatching24.  So if your app wants to keep exactly
> 2.4's behavior, you'd pass in SettingsMatching24().  On upgrading to
> 2.9 you'd still see 2.4's behavior.
> 
> Users who'd like to see Lucene's improvements on each minor release
> would instead instantiate LatestAndGreatestSettings() (or
> CurrentVersionSettings(), or something), understanding that when they
> upgrade there might be biggish changes to Lucene's defaults.  My guess
> is most users would use this settings class.
> 
> Doug actually suggested this exact idea a while back:
> 
>  http://www.gossamer-threads.com/lists/lucene/java-dev/54421#54421
> 
> Now that I realize we could use this to strongly decouple "users
> wanting precise back-compat" from "users wanting the latest &
> greatest", I think it's a very compelling solution.
> 
> If we do this I'd like to do it in 2.9, so that starting with 3.x we
> are free to change default settings w/o breaking back compat.
> 
> Thoughts?
> 
> Mike
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org

