lucene-dev mailing list archives

From Marvin Humphrey
Subject Re: Lucene's default settings & back compatibility
Date Fri, 22 May 2009 16:52:08 GMT
On Fri, May 22, 2009 at 11:33:33AM -0400, Michael McCandless wrote:

> when working on 3.1 if we make some great improvement, I'd like new users in
> 3.1 to see the improvement by default.  

Sounds like an argument for more frequent major releases.  But I'm not exactly
one to talk.  ;)

> On thinking about it more... automagically storing the "actsAsVersion"
> in the index, and then having IndexWriter (for example) ask the
> analyzer for a tokenStream matching that version, seems a little too
> sneaky.  

Can you elaborate?

In KinoSearch SVN trunk, satellite classes like QueryParser and Highlighter
have to be passed a Schema, which contains all the Analyzers.  Analyzers
aren't satellite classes under this model -- they are a fixed property of a
FullTextType field spec.  Think of them as baked into an SQL field definition.

You can create a Schema from scratch to pass to the QueryParser, but it's
easier to just get it from the Searcher.  Translating to Java... 

   Searcher searcher = new Searcher("/path/to/index");
   QueryParser qparser = new QueryParser(searcher.getSchema());

I don't see how that's so different from getting an analyzer actsAsVersion
number from the index.
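
To make the comparison concrete, here is a minimal, self-contained sketch of
the Schema-centric model described above. Every type in it (Schema, Analyzer,
Searcher, QueryParser, and their methods) is a hypothetical stand-in written
for illustration, not the real KinoSearch or Lucene API. It also assumes the
Searcher is handed a Schema directly instead of reading one from an index path.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these types are the real KinoSearch or Lucene classes.
class SchemaSketch {

    /** Minimal stand-in for an analyzer: text in, tokens out. */
    interface Analyzer {
        List<String> analyze(String text);
    }

    /** A schema fixes one analyzer per field, much like a column type in SQL. */
    static final class Schema {
        private final Map<String, Analyzer> analyzers = new HashMap<>();

        Schema specField(String field, Analyzer analyzer) {
            analyzers.put(field, analyzer);
            return this;
        }

        Analyzer analyzerFor(String field) {
            Analyzer analyzer = analyzers.get(field);
            if (analyzer == null) {
                throw new IllegalArgumentException("Unknown field: " + field);
            }
            return analyzer;
        }
    }

    /** Stand-in for a searcher that exposes the schema its index was built with. */
    static final class Searcher {
        private final Schema schema;

        // Assumption: a real searcher would load the schema from the index on disk.
        Searcher(Schema schema) { this.schema = schema; }

        Schema getSchema() { return schema; }
    }

    /** Satellite class: it never chooses an analyzer, it only asks the schema. */
    static final class QueryParser {
        private final Schema schema;

        QueryParser(Schema schema) { this.schema = schema; }

        List<String> parseTerms(String field, String query) {
            return schema.analyzerFor(field).analyze(query);
        }
    }

    public static void main(String[] args) {
        Analyzer lowercaseWhitespace =
            text -> Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
        Schema schema = new Schema().specField("content", lowercaseWhitespace);

        Searcher searcher = new Searcher(schema);
        QueryParser qparser = new QueryParser(searcher.getSchema());
        System.out.println(qparser.parseTerms("content", "Hello World"));
    }
}
```

The point of the design is that query-time analysis cannot drift from
index-time analysis, because both sides read the same field definition.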

Now, where stuff might start to get complicated is PerFieldAnalyzerWrapper...
is that where the sneakiness gets overwhelming?

> I prefer the up-front "you specify actsAsVersion" when you
> create the analyzer, only for analyzers that have changed across
> releases.  So things like WhitespaceAnalyzer would likely never need
> an actsAsVersion arg.

Hmm, this is kind of hard.  I'd prefer that the argument remain optional, so
that new users don't have to think about it.  But then, unlike in KS/Lucy,
there's a danger of leaving it off inadvertently and getting the wrong
behavior. :\
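
The hazard of an optional argument can be sketched as follows. This is an
illustration under stated assumptions, not Lucene's API: the Version enum, the
VersionedAnalyzer class, and the idea that a "3.1" release began lowercasing
tokens are all invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: Version, VersionedAnalyzer and the behavior change
// between releases are invented for illustration and are not Lucene's API.
class ActsAsVersionSketch {

    enum Version {
        V2_4, V3_1;

        static final Version LATEST = V3_1;
    }

    static final class VersionedAnalyzer {
        private final Version actsAsVersion;

        /** Convenient for new users, but silently tracks the installed release. */
        VersionedAnalyzer() { this(Version.LATEST); }

        /** Pins tokenization to the behavior of one specific release. */
        VersionedAnalyzer(Version actsAsVersion) { this.actsAsVersion = actsAsVersion; }

        List<String> analyze(String text) {
            List<String> tokens = new ArrayList<>();
            for (String raw : text.trim().split("\\s+")) {
                // Invented difference: pretend 3.1 started lowercasing tokens.
                tokens.add(actsAsVersion == Version.V3_1
                    ? raw.toLowerCase(Locale.ROOT)
                    : raw);
            }
            return tokens;
        }
    }

    public static void main(String[] args) {
        // The index was built with the version pinned explicitly...
        List<String> indexed = new VersionedAnalyzer(Version.V2_4).analyze("Lucene Rocks");
        // ...but the argument was left off at query time, so LATEST applies.
        List<String> queried = new VersionedAnalyzer().analyze("Lucene Rocks");

        System.out.println("indexed=" + indexed + " queried=" + queried
            + " match=" + indexed.equals(queried));
    }
}
```

Nothing fails at compile time or at run time when the argument is omitted; the
tokens produced at query time simply stop matching those in the index.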

Marvin Humphrey
