lucene-dev mailing list archives

From Andrzej Bialecki
Subject Re: Optimization and Corruption Issues
Date Thu, 01 Oct 2009 15:56:41 GMT
lowfreq wrote:
> I have a Lucene index that is very large in size. 
> It was created using a pre 2.1 version of 
> The index is currently almost 20 GB, and has almost 7000 segment files. 
> The problem I am having is that I need to optimize it, and can't do this
> without the search functionality of my app being down for a week. 
> I used the Luke tool from and it worked flawlessly, optimizing
> the index in just over 2 hours. Problem is that my search cannot use it, and
> the error states Unknown Format Version errors, or just plain nothing found. 

You should be careful when using Lucene Java to modify Lucene.Net 
indexes. I know for a fact that deflated data in Lucene Java is 
incompatible with the deflater implementation in .Net, so it's easy to 
create an incompatible index even when you use a supposedly compatible 
version of Lucene Java. Perhaps versions around 2.0 still worked ok, but 
no guarantees.

> I understand that versions of Lucene that are newer than what the index was
> built and is searched with can cause problems. 
> What can I do to make this work? I have tried older versions of Luke, 0.7
> was the oldest I could lay hands on, but even it uses a newer version of
> Lucene. 

Here are links to older versions of Luke:

> My index version shows as 633103800023469045. The version the index is
> written as after optimizing with Luke 0.7 is 633103800023469057. 

This is just a timestamp, so it doesn't tell you which version of Lucene 
created the index. If you open the index with Luke, the Overview tab 
has a line that shows the index format version.
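
To illustrate both points, here is a minimal Python sketch. It rests on two assumptions that are not confirmed anywhere in this thread: that the version value was seeded from a .NET tick count (100 ns units since 0001-01-01, guessed from the magnitude of the number), and that the segments file starts with a big-endian Int32 format marker followed by an Int64 version, as in the Lucene 1.4 to 2.x file format descriptions. The names read_segments_header and ticks_to_datetime are hypothetical helpers, not part of Lucene or Luke.

```python
import struct
from datetime import datetime, timedelta


def read_segments_header(data: bytes):
    """Return (format, version) from the first 12 bytes of a segments file.

    Assumption: big-endian Int32 format marker (a negative number),
    then a big-endian Int64 version, per the Lucene 1.4-2.x layout.
    """
    fmt, version = struct.unpack(">iq", data[:12])
    return fmt, version


def ticks_to_datetime(ticks: int) -> datetime:
    """Interpret a value as .NET ticks (assumption: 100 ns since 0001-01-01)."""
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // 10)


before = 633103800023469045  # version reported before optimizing
after = 633103800023469057   # version reported after optimizing

print(ticks_to_datetime(before))  # 2007-03-25 00:40:02.346904
print(after - before)             # 12
```

Read this way, the number looks like a creation time in March 2007 (time zone unknown) that has since been incremented a few times, which is consistent with it being a change counter rather than a format identifier. The format marker is the value to compare between the original and the optimized index.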

Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
