lucene-dev mailing list archives

From Andi Vajda <>
Subject Re: DbDirectory and compound files
Date Thu, 30 Sep 2004 06:39:18 GMT

You ask if this makes sense. No, not really. I don't know the details of the 
purpose of the compound file implementation so this may be my problem.

I understand a compound file as a file that contains other files: a file 
acting as a directory, in a sense. DbDirectory is already implemented with 
two files (two dbs), one containing the file names and one containing the 
file data, so in that context compound files don't seem to add much.

But your reply seems to allude to other purposes, such as 'combining the 
files of a segment', which I don't understand because I don't know anything 
about the low-level implementation of Lucene indexes. I think I should learn 
more about that before continuing this conversation.

However, from earlier posts of yours, it seems that the Directory 
implementation classes such as OutputStream et al. are being deprecated and 
replaced by others, so it may very well be that DbDirectory needs to be 
rewritten once these changes are finalized. Until then, it is probably 
pointless to spend more time on the compound-file-with-DbDirectory issue, 
which can be worked around by not using compound files for the time being.


On Wed, 29 Sep 2004, Doug Cutting wrote:

> Andi Vajda wrote:
>> So, my question: why is the compound file storage implemented 
>> orthogonally to Directory instead of just being another Directory 
>> implementation called FSCompoundFileDirectory?
>
> To combine the files of a segment we need to know when the segment was 
> complete.  So a method would need to be added to Directory to instruct it 
> when to combine files.  And then the Directory would need to be able to 
> locate files within the combined file in order to open them.
>
> It would be a shame to re-invent this logic for each Directory 
> implementation, so the indexing code has a generic implementation layered on 
> top of Directory.  Does that make sense?
>
> Doug
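
The layering described in the quoted reply can be illustrated with a minimal sketch. It is not Lucene's actual code: SimpleDirectory, MapDirectory, CompoundFile and their methods are hypothetical names invented for this example, the file names are only illustrative, and the table of sub-file offsets is kept in memory here, whereas a real format would presumably store it inside the combined file. The point it tries to show is that the combining logic only calls the generic directory interface, so it would work unchanged over a filesystem, RAM or database backend.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical names throughout; this is not Lucene's API.
class CompoundSketch {

    /** Minimal stand-in for a storage backend (filesystem, RAM, a db, ...). */
    interface SimpleDirectory {
        void write(String name, byte[] data);
        byte[] read(String name);
        void delete(String name);
    }

    /** In-memory backend, present only so that the sketch runs. */
    static class MapDirectory implements SimpleDirectory {
        private final Map<String, byte[]> files = new TreeMap<>();

        public void write(String name, byte[] data) { files.put(name, data.clone()); }

        public byte[] read(String name) {
            byte[] data = files.get(name);
            if (data == null) throw new IllegalArgumentException("no such file: " + name);
            return data.clone();
        }

        public void delete(String name) { files.remove(name); }

        int fileCount() { return files.size(); }
    }

    /** Generic layer: relies on nothing but the SimpleDirectory interface. */
    static class CompoundFile {
        private final SimpleDirectory dir;
        private final String compoundName;
        // sub-file name -> {offset, length} inside the combined file
        // (assumption: kept in memory here for brevity)
        private final Map<String, int[]> entries = new LinkedHashMap<>();

        CompoundFile(SimpleDirectory dir, String compoundName) {
            this.dir = dir;
            this.compoundName = compoundName;
        }

        /** Called by the indexing code once it knows the segment is complete. */
        void combine(String... names) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (String name : names) {
                byte[] data = dir.read(name);
                entries.put(name, new int[] { out.size(), data.length });
                out.write(data, 0, data.length);
            }
            dir.write(compoundName, out.toByteArray());
            for (String name : names) dir.delete(name); // originals no longer needed
        }

        /** Locates a sub-file within the combined file and returns its bytes. */
        byte[] open(String name) {
            int[] entry = entries.get(name);
            if (entry == null) throw new IllegalArgumentException("no such sub-file: " + name);
            byte[] all = dir.read(compoundName);
            return Arrays.copyOfRange(all, entry[0], entry[0] + entry[1]);
        }
    }

    public static void main(String[] args) {
        MapDirectory dir = new MapDirectory();
        dir.write("_1.fnm", new byte[] { 1, 2, 3 });
        dir.write("_1.frq", new byte[] { 4, 5 });

        CompoundFile cfs = new CompoundFile(dir, "_1.cfs");
        cfs.combine("_1.fnm", "_1.frq");

        System.out.println("files in directory: " + dir.fileCount());
        System.out.println("_1.frq: " + Arrays.toString(cfs.open("_1.frq")));
    }
}
```

In this sketch the backend never learns that a segment exists or when it is complete; only the caller of combine() does, which is roughly the division of labour the reply argues for.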
