lucene-dev mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: Question about FieldInfos
Date Sun, 15 Jan 2006 01:43:41 GMT

On Jan 14, 2006, at 5:45 PM, Robert Kirchgessner wrote:

> Well, I think merging segments should be possible only if
> the field definitions are consistent throughout the segments.
> Merging inconsistent segments looks to me like an error at worst
> and bad design at least. But I may just not have met an
> appropriate use case yet...

Lucene allows the user to change field definitions on the fly.   
That's like an SQL database which auto-adapts the table definition  
with each INSERT.  It's impressive that Lucene can do that, but look  
under the hood and you'll see that it ain't easy, or cheap.

Significant, probably substantial performance gains are possible if  
field definitions are frozen per-IndexWriter.  That's what KinoSearch  
does, and it's the primary reason that it's an order of magnitude  
faster than Plucene at building indexes.

The only problem I see with freezing field definitions per-index is
for document collections that are difficult to re-index from scratch
but that might require occasional field-definition changes.  I
suspect that's an edge case.  Anyone care to disabuse me of that
notion?  People doing large-scale web-spidering would probably be
affected.

My radical suggestion:

     * Require fields to be defined when the index
       is first created.
     * Store field definitions in a single per-index,
       human-readable file.
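
To make the two points above concrete, here is a minimal sketch of what a frozen, per-index field definition file could look like. Everything in it is hypothetical: the class name FrozenFieldDefs, the FieldDef flags, and the one-line-per-field text format are assumptions for illustration, not Lucene's or KinoSearch's actual API or file format. It also assumes field names contain no whitespace.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not part of Lucene or KinoSearch. */
final class FrozenFieldDefs {

    /** Per-field flags, fixed for the life of the index. */
    static final class FieldDef {
        final String name;
        final boolean indexed;
        final boolean stored;
        final boolean tokenized;

        FieldDef(String name, boolean indexed, boolean stored, boolean tokenized) {
            this.name = name;
            this.indexed = indexed;
            this.stored = stored;
            this.tokenized = tokenized;
        }
    }

    // Insertion-ordered and unmodifiable: definitions cannot change after creation.
    private final Map<String, FieldDef> defs;

    FrozenFieldDefs(List<FieldDef> fields) {
        Map<String, FieldDef> m = new LinkedHashMap<>();
        for (FieldDef f : fields) {
            if (m.put(f.name, f) != null) {
                throw new IllegalArgumentException("duplicate field: " + f.name);
            }
        }
        this.defs = Collections.unmodifiableMap(m);
    }

    /** Serializes to one human-readable line per field. */
    String toText() {
        StringBuilder sb = new StringBuilder();
        for (FieldDef f : defs.values()) {
            sb.append(f.name)
              .append(" indexed=").append(f.indexed)
              .append(" stored=").append(f.stored)
              .append(" tokenized=").append(f.tokenized)
              .append('\n');
        }
        return sb.toString();
    }

    /** Parses the format written by toText(); blank lines are skipped. */
    static FrozenFieldDefs fromText(String text) {
        List<FieldDef> fields = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 4) {
                throw new IllegalArgumentException("bad line: " + line);
            }
            fields.add(new FieldDef(parts[0],
                    flag(parts[1], "indexed"),
                    flag(parts[2], "stored"),
                    flag(parts[3], "tokenized")));
        }
        return new FrozenFieldDefs(fields);
    }

    private static boolean flag(String token, String key) {
        String prefix = key + "=";
        if (!token.startsWith(prefix)) {
            throw new IllegalArgumentException("expected " + key + ": " + token);
        }
        return Boolean.parseBoolean(token.substring(prefix.length()));
    }

    /** Rejects any document field that was not declared at index creation. */
    void check(Iterable<String> docFieldNames) {
        for (String name : docFieldNames) {
            if (!defs.containsKey(name)) {
                throw new IllegalArgumentException("undefined field: " + name);
            }
        }
    }
}
```

In this sketch a writer would load the file once with fromText() and call check() on each incoming document, so a document carrying an undeclared field fails immediately instead of silently changing the definitions.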

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

