lucene-dev mailing list archives

From Marvin Humphrey
Subject Re: Lucene's default settings & back compatibility
Date Fri, 22 May 2009 18:26:04 GMT
On Fri, May 22, 2009 at 09:06:32PM +0400, Earwin Burrfoot wrote:
> > In KinoSearch SVN trunk, satellite classes like QueryParser and Highlighter
> > have to be passed a Schema, which contains all the Analyzers.  Analyzers
> > aren't satellite classes under this model -- they are a fixed property of a
> > FullTextType field spec.  Think of them as baked into an SQL field definition.
> >
> > You can create a Schema from scratch to pass to the QueryParser, but it's
> > easier to just get it from the Searcher.  Translating to Java...
> >
> >   Searcher searcher = new Searcher("/path/to/index");
> >   QueryParser qparser = new QueryParser(searcher.getSchema());
> >
> > I don't see how that's so different from getting an analyzer actsAsVersion
> > number from the index.
> >
> > Now, where stuff might start to get complicated is PerFieldAnalyzerWrapper...
> > is that where the sneakiness gets overwhelming?
> Some people can have setups more complex than that.
> Different analyzers per field.

Heh.  One of the primary rationales behind Schema was to tie individual
analyzers to specific fields.
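
To make the field-to-analyzer binding concrete, here is a minimal sketch in Java.  Every class and method name in it (Analyzer, KeywordAnalyzer, LowercaseWhitespaceAnalyzer, specField, analyzerFor) is a hypothetical stand-in invented for illustration, not the actual KinoSearch or Lucene API; only the names Schema and FullTextType come from the discussion above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical analyzer interface: turns raw text into tokens.
interface Analyzer {
    List<String> analyze(String text);
}

// Splits on runs of whitespace and lowercases every token.
class LowercaseWhitespaceAnalyzer implements Analyzer {
    public List<String> analyze(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }
}

// Treats the whole field value as a single, unmodified token.
class KeywordAnalyzer implements Analyzer {
    public List<String> analyze(String text) {
        return Collections.singletonList(text);
    }
}

// Hypothetical field spec: the analyzer is a fixed property of the field type.
class FullTextType {
    private final Analyzer analyzer;

    FullTextType(Analyzer analyzer) {
        this.analyzer = analyzer;
    }

    Analyzer getAnalyzer() {
        return analyzer;
    }
}

// Hypothetical Schema: maps each field name to its field spec.
class Schema {
    private final Map<String, FullTextType> fields = new LinkedHashMap<>();

    Schema specField(String name, FullTextType type) {
        fields.put(name, type);
        return this;
    }

    Analyzer analyzerFor(String field) {
        FullTextType type = fields.get(field);
        if (type == null) {
            throw new IllegalArgumentException("Unknown field: " + field);
        }
        return type.getAnalyzer();
    }
}
```

Anything that holds the Schema can then look up the right analyzer by field name, which is the job PerFieldAnalyzerWrapper does as a separate wrapper class.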

> Custom analyzers.

No problem.

> Several indexes using the same analyzer.

No problem.  Sharing is only necessary if the analyzer is costly or has some
esoteric need for shared state, and it's possible via subclassing Schema or
Analyzer.

> Intentionally different analyzers for indexing and searching.

No problem.  That only makes sense in the context of QueryParser, and the KS
QueryParser allows you to supply an analyzer which overrides the Schema.
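
As a rough illustration of that override, the sketch below uses a stripped-down, hypothetical Schema that maps field names to plain functions.  The two-argument QueryParser constructor and the parseTerms method are assumptions made for the example; the real KS QueryParser signature may look quite different.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Stripped-down, hypothetical Schema: field name -> analyzer function.
class Schema {
    private final Map<String, Function<String, List<String>>> analyzers = new HashMap<>();

    Schema specField(String field, Function<String, List<String>> analyzer) {
        analyzers.put(field, analyzer);
        return this;
    }

    Function<String, List<String>> analyzerFor(String field) {
        Function<String, List<String>> analyzer = analyzers.get(field);
        if (analyzer == null) {
            throw new IllegalArgumentException("Unknown field: " + field);
        }
        return analyzer;
    }
}

// Hypothetical parser: uses the Schema's analyzer unless an override is supplied.
class QueryParser {
    private final Schema schema;
    private final Function<String, List<String>> override; // null means "use the Schema"

    QueryParser(Schema schema) {
        this(schema, null);
    }

    QueryParser(Schema schema, Function<String, List<String>> override) {
        this.schema = schema;
        this.override = override;
    }

    // Returns the terms that the query text is reduced to for the given field.
    List<String> parseTerms(String field, String queryText) {
        Function<String, List<String>> analyzer =
            (override != null) ? override : schema.analyzerFor(field);
        return analyzer.apply(queryText);
    }
}
```

Indexing still goes through the Schema's analyzer; only query-time analysis is swapped out, which covers the "different analyzers for indexing and searching" case.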

> Using this analyzer without any index at all - like I do highlight on
> a separate machine to minimize GC pauses, or tag docs by running a
> heap of queries against MemoryIndex.

No problem.  Distribute a Schema subclass among several machines.
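
A sketch of what distributing a Schema subclass could look like, again with a stripped-down, hypothetical Schema.  ArticleSchema, TermCounter, and countMatches are invented names standing in for an application schema and an index-free consumer such as a highlighting or tagging worker.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Stripped-down, hypothetical Schema: field name -> analyzer function.
class Schema {
    private final Map<String, Function<String, List<String>>> analyzers = new HashMap<>();

    Schema specField(String field, Function<String, List<String>> analyzer) {
        analyzers.put(field, analyzer);
        return this;
    }

    Function<String, List<String>> analyzerFor(String field) {
        Function<String, List<String>> analyzer = analyzers.get(field);
        if (analyzer == null) {
            throw new IllegalArgumentException("Unknown field: " + field);
        }
        return analyzer;
    }
}

// Field definitions live in code, so every machine with this class on its
// classpath analyzes text identically, with or without an index present.
class ArticleSchema extends Schema {
    ArticleSchema() {
        specField("title", text -> Collections.singletonList(text));
        specField("body",
            text -> Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+")));
    }
}

// Hypothetical index-free consumer: needs only a Schema, never a Searcher.
class TermCounter {
    private final Schema schema;

    TermCounter(Schema schema) {
        this.schema = schema;
    }

    // Counts tokens in the text that match any analyzed query term.
    int countMatches(String field, String text, String queryText) {
        Function<String, List<String>> analyzer = schema.analyzerFor(field);
        List<String> queryTerms = analyzer.apply(queryText);
        int count = 0;
        for (String token : analyzer.apply(text)) {
            if (queryTerms.contains(token)) {
                count++;
            }
        }
        return count;
    }
}
```

Because the worker constructs ArticleSchema directly, it never has to open the index to learn how a field is analyzed.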

These are all solved problems under a model where field semantics are defined
per index and stored in a serialized Schema.  That's why I said it was the
"theoretical solution".

Marvin Humphrey
