lucene-dev mailing list archives

From Doug Cutting
Subject Re: RE : Term Collection Frequency?
Date Wed, 04 Aug 2004 21:03:33 GMT
Grant Ingersoll wrote:
> Once again, I think a generic Metadata Reader/Writer interface would be
> the ideal solution for all of these types of problems.
> See
> I am more than willing to help w/ an implementation, but do not want to
> go it alone w/o some consensus from the committers/Doug that such an
> idea would be accepted as I think the change may be fairly involved.

My concern is that truly generic metadata of this sort would be big and 
slow.  But I'd love to see a proposal that performs well!

Adding, e.g., collection frequency to indexes would not be too hard: 
you'd need to add a field to TermInfo, extend TermInfosWriter, 
DocumentWriter, and SegmentMerger to maintain it, then extend 
SegmentTermEnum, IndexReader, SegmentReader and MultiReader to access 
it.  Indexes would be a little larger and a little slower, but not 
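
To make the quantity under discussion concrete, here is a minimal sketch of what "collection frequency" means next to document frequency, and of how both would be carried through a merge. It is an illustration only and does not use Lucene's classes: TermStats, CollectionFrequencySketch, build and merge are hypothetical names, and the merge assumes the two segments hold disjoint documents with no deletions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for per-term statistics; this is NOT Lucene's TermInfo.
final class TermStats {
    int docFreq;          // number of documents that contain the term
    long collectionFreq;  // total occurrences of the term across all documents
}

// Hypothetical helper class, for illustration only.
final class CollectionFrequencySketch {

    // Builds per-term statistics for one "segment" of already-tokenized documents.
    static Map<String, TermStats> build(List<List<String>> docs) {
        Map<String, TermStats> stats = new HashMap<>();
        for (List<String> doc : docs) {
            Set<String> seenInDoc = new HashSet<>();
            for (String term : doc) {
                TermStats s = stats.computeIfAbsent(term, t -> new TermStats());
                s.collectionFreq++;          // every occurrence counts
                if (seenInDoc.add(term)) {   // first occurrence in this document
                    s.docFreq++;
                }
            }
        }
        return stats;
    }

    // Combines two segments' statistics by summing both counters.
    // Assumes the segments contain disjoint documents and no deletions.
    static Map<String, TermStats> merge(Map<String, TermStats> a,
                                        Map<String, TermStats> b) {
        Map<String, TermStats> out = new HashMap<>();
        for (Map<String, TermStats> segment : Arrays.asList(a, b)) {
            for (Map.Entry<String, TermStats> e : segment.entrySet()) {
                TermStats s = out.computeIfAbsent(e.getKey(), t -> new TermStats());
                s.docFreq += e.getValue().docFreq;
                s.collectionFreq += e.getValue().collectionFreq;
            }
        }
        return out;
    }
}
```

The extra long per term is the "little larger" cost mentioned above; the work of maintaining it at write and merge time is the "little slower" part.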

Architecting things so that this same change could be easily made 
without modifying any internals is a much bigger challenge.  And, once 
this is done, making it so that index size and performance are little 
altered is harder yet.  If you have a design that achieves this, please 
share it.

