lucene-dev mailing list archives

From DM Smith
Subject Re: detected corrupted index / performance improvement
Date Wed, 06 Feb 2008 23:15:31 GMT

On Feb 6, 2008, at 5:42 PM, Michael McCandless wrote:

> robert engels wrote:
>> Do we have any way of determining if a segment is definitely
>> OK/VALID?
> The only way I know is the CheckIndex tool, and it's rather slow (and
> it's not clear that it always catches all corruption).

Just a thought. It seems that the discussion has revolved around  
whether a crash or similar event has left the file in an inconsistent  
state. Without looking into how it is actually done, I'm going to  
guess that the writing is done from the start of the file to its end.  
That is, no "out of order" writing.

If this is the case, how about appending a marker of a known size and  
pattern to the end of the file? If it is present, it can be presumed  
that there were no errors in getting to that point.
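
A minimal sketch of the end-of-file marker idea, assuming strictly sequential writing. The class name, the method names, and the 8-byte pattern are all made up for illustration; none of this is Lucene's actual file format or API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class TrailerMarker {
    // Hypothetical fixed-size pattern; not part of any real Lucene format.
    static final byte[] MARKER = "LUCN_END".getBytes(StandardCharsets.US_ASCII);

    /** Writes the payload front to back, then the marker as the very last bytes. */
    static void writeWithMarker(Path file, byte[] payload) throws IOException {
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write(payload);
            out.write(MARKER);
        }
    }

    /** Returns true if the last MARKER.length bytes of the file match the pattern. */
    static boolean endsWithMarker(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length < MARKER.length) {
                return false; // too short to hold the marker at all
            }
            byte[] tail = new byte[MARKER.length];
            raf.seek(length - MARKER.length);
            raf.readFully(tail);
            return Arrays.equals(tail, MARKER);
        }
    }
}
```

A reader would call `TrailerMarker.endsWithMarker(path)` before trusting the file. Note that this only shows the writer reached the end; it says nothing about damage to bytes earlier in the file.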

Even with out of order writing, one could write an 'INVALID' marker at  
the beginning of the operation and then upon reaching the end of the  
writing, replace it with the valid marker.

If neither marker is found, then the index predates this capability  
and nothing can be said about its validity.
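
The two-marker variant, with its three possible outcomes, could be sketched roughly as follows. It assumes the marker sits at offset 0 of the file and that both patterns have the same length; the class, the enum, and the patterns are hypothetical names for illustration, not anything that exists in Lucene.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Arrays;

final class HeaderMarker {
    enum State { VALID, INVALID, UNKNOWN }

    // Hypothetical equal-length patterns, assumed to live at offset 0.
    static final byte[] INVALID_MARKER = "INVALID!".getBytes(StandardCharsets.US_ASCII);
    static final byte[] VALID_MARKER = "VALID_OK".getBytes(StandardCharsets.US_ASCII);

    /** Marks the file invalid, writes the payload, then flips the marker to valid. */
    static void write(Path file, byte[] payload) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(0);
            raf.write(INVALID_MARKER);
            raf.write(payload);      // a real writer could seek around freely here
            raf.getFD().sync();      // push the payload to disk before flipping
            raf.seek(0);
            raf.write(VALID_MARKER); // same length, so it overwrites in place
            raf.getFD().sync();
        }
    }

    /** Classifies the file by the bytes found at the marker position. */
    static State check(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            if (raf.length() < VALID_MARKER.length) {
                return State.UNKNOWN;
            }
            byte[] head = new byte[VALID_MARKER.length];
            raf.readFully(head);
            if (Arrays.equals(head, VALID_MARKER)) {
                return State.VALID;
            }
            if (Arrays.equals(head, INVALID_MARKER)) {
                return State.INVALID; // the writer started but never finished
            }
            return State.UNKNOWN;     // no marker: file predates the scheme
        }
    }
}
```

`State.UNKNOWN` corresponds to the "neither marker" case: the file is treated as coming from before the scheme existed rather than as corrupt.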

-- DM
