lucene-java-user mailing list archives

From "Rob Jose" <>
Subject Index Size
Date Wed, 18 Aug 2004 20:44:37 GMT
I have indexed several thousand (52 to be exact) text files and I keep running out of disk
space to store the indexes.  The size of the documents I have indexed is around 2.5 GB.  The
size of the Lucene indexes is around 287 GB.  Does this seem correct?  I am not storing the
contents of the files, just indexing and tokenizing them.  I am using Lucene 1.3 final.  Can you
guys let me know what you are experiencing?  I don't want to go into production with something
that I should be configuring better.  

I am not sure if this helps, but I have a temp index and a real index.  I index each file into
the temp index, and then merge the temp index into the real index using the addIndexes method
on the IndexWriter.  I have also called setUseCompoundFile(true) on the production writer; I
did not set this on the temp index.  The last thing I do before closing the production
writer is call its optimize method.
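For reference, the workflow described above can be sketched against the Lucene 1.3-era API. The directory paths, analyzer choice, field name, and document contents below are placeholders I made up for illustration, not details from this message; the call sequence (non-compound temp writer, compound production writer, addIndexes, optimize) is what the paragraph describes:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MergeSketch {
    public static void main(String[] args) throws Exception {
        // Temp index: defaults, no compound file format set on it.
        IndexWriter temp = new IndexWriter("/tmp/tempIndex", new StandardAnalyzer(), true);

        // Field.UnStored tokenizes and indexes the text without storing
        // the original contents, matching the setup described above.
        Document doc = new Document();
        doc.add(Field.UnStored("contents", "example body text"));  // hypothetical field/content
        temp.addDocument(doc);
        temp.close();

        // Production index: compound file format on, then merge in the
        // temp index and optimize as the last step before closing.
        IndexWriter prod = new IndexWriter("/data/realIndex", new StandardAnalyzer(), false);
        prod.setUseCompoundFile(true);
        prod.addIndexes(new Directory[] { FSDirectory.getDirectory("/tmp/tempIndex", false) });
        prod.optimize();  // merges all segments down to one
        prod.close();
    }
}
```

One thing worth checking with a setup like this: each addIndexes/optimize cycle rewrites segments, so stale segment files that are never deleted (for example, if an old IndexReader or a crashed process still holds them open) can make the on-disk index balloon far past the size of the source text.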

I would really appreciate any ideas to get the index size smaller if it is at all possible.
