lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <>
Subject [jira] [Commented] (LUCENE-8525) throw more specific exception on data corruption
Date Fri, 11 Jan 2019 09:07:00 GMT


Simon Willnauer commented on LUCENE-8525:

I agree with [~rcmuir] here. There is not much we can do to detect this particular
problem in DataInput and friends. One improvement would certainly be the wording of the
Javadoc: we can clarify that throwing _CorruptIndexException_ on corruption is best effort.

Another idea is to checksum the entire file before we read the commit. We can either do this
on the Elasticsearch end or improve _SegmentInfos#readCommit_. Reading this file twice isn't
a big deal, I guess.
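
To make the "checksum first, then read" idea concrete, here is a minimal sketch that uses only
the JDK. It does not use Lucene's APIs or its real codec footer: the file layout (payload
followed by an 8-byte big-endian CRC32), the class ChecksumFirstReader and the exception
CorruptFileException are all hypothetical names chosen for illustration. The point is only that
verifying the whole file up front turns "data looks odd" into one specific exception before any
parsing starts.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

public class ChecksumFirstReader {

    /** Hypothetical stand-in for a corruption-specific exception. */
    public static class CorruptFileException extends IOException {
        public CorruptFileException(String message) {
            super(message);
        }
    }

    /** Assumed layout: payload bytes followed by an 8-byte big-endian CRC32. */
    static final int FOOTER_BYTES = Long.BYTES;

    /** Appends the CRC32 of the payload as a footer (writer side). */
    public static byte[] withFooter(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + FOOTER_BYTES)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    /**
     * First pass: checksum the entire file. Only if that succeeds is the
     * payload handed back for the second pass (the actual parsing).
     */
    public static byte[] verifyAndRead(byte[] file) throws CorruptFileException {
        if (file.length < FOOTER_BYTES) {
            throw new CorruptFileException(
                    "file too short for a checksum footer: " + file.length + " bytes");
        }
        int payloadLength = file.length - FOOTER_BYTES;
        CRC32 crc = new CRC32();
        crc.update(file, 0, payloadLength);
        long expected = ByteBuffer.wrap(file, payloadLength, FOOTER_BYTES).getLong();
        if (crc.getValue() != expected) {
            throw new CorruptFileException(
                    "checksum mismatch: expected=" + Long.toHexString(expected)
                            + " actual=" + Long.toHexString(crc.getValue()));
        }
        return Arrays.copyOf(file, payloadLength);
    }

    public static void main(String[] args) throws IOException {
        byte[] file = withFooter("commit data".getBytes(StandardCharsets.UTF_8));
        System.out.println("payload bytes: " + verifyAndRead(file).length);

        file[0] ^= 0x01; // flip one bit to simulate on-disk corruption
        try {
            verifyAndRead(file);
        } catch (CorruptFileException e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

The cost is the second pass over the file, which matches the trade-off mentioned above: the
data is read once for the checksum and once for parsing.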

> throw more specific exception on data corruption
> ------------------------------------------------
>                 Key: LUCENE-8525
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Vladimir Dolzhenko
>            Priority: Major
> DataInput throws generic IOException if data looks odd
> DataInput:141
> there are other examples like BufferedIndexInput:219, and maybe DocIdsWriter:81
> That leads to some difficulties - see elasticsearch #34322
> It would be better if it throws more specific exception.
> As a consequence SegmentInfos.readCommit violates its own contract
> {code:java}
> /**
>    * @throws CorruptIndexException if the index is corrupt
>    * @throws IOException if there is a low-level IO error
>    */
> {code}

This message was sent by Atlassian JIRA
