lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-5574) NRT Reader close can wipe index it doesn't own
Date Fri, 04 Apr 2014 12:04:15 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959900#comment-13959900 ]

Shai Erera commented on LUCENE-5574:
------------------------------------

I looked at MDW.close() (MockDirectoryWrapper), and it looks like it already does that -- it
opens an IW and closes it, then diffs the files before and after the close -- so that idea is
not new :). That way it protects against e.g. IFD (IndexFileDeleter) bugs, or bugs elsewhere
where you don't decRef() a file or something.
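
To make the before/after comparison concrete, here is a minimal sketch of the idea. It is not
MockDirectoryWrapper's actual code: the class and method names are hypothetical, and plain sets
of file names stand in for Directory listings taken before and after the IW open/close cycle.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of Lucene.
final class ListingDiff {
    private ListingDiff() {}

    /**
     * Returns the file names that were present before the writer was opened
     * and closed but are gone afterwards. Anything returned here was deleted
     * by that cycle, i.e. nothing was holding a reference to it.
     */
    static Set<String> removedByClose(Set<String> before, Set<String> after) {
        Set<String> removed = new TreeSet<>(before);
        removed.removeAll(after);
        return removed;
    }

    /** True when the open/close cycle left the listing untouched. */
    static boolean unchanged(Set<String> before, Set<String> after) {
        return before.equals(after);
    }
}
```

A test would fail whenever removedByClose(...) is non-empty, which is the kind of leak a missed
decRef() would produce.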

But perhaps we can make MDW lenient by default (assertNoUnreferencedFilesOnClose=false) and
turn the check on only in tests where we'd like to catch IFD bugs? Then in the majority of
tests we can code "normally", and in the few tests that need to make sure IW is bullet-proof,
we make sure to close things in order?

Another idea I had is to use the newly added checksums to make sure that the file we're about
to delete has the checksum that we think it should have. Of course, if you re-index the exact
same set of documents in the exact same order, the checksums match and this is still a false
positive, but I don't know how common that is. But then, I'm not sure if it's ok to rely on
such logic, and perhaps the simplest thing we could do is have IndexReader instances treat the
Directory as read-only.
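
A rough sketch of the checksum idea, under stated assumptions: the ChecksumGuard class and its
methods are hypothetical, file contents are passed in as byte arrays rather than read from a
Directory, and CRC32 from the JDK is used purely for illustration. It does not reflect how
Lucene stores or verifies its checksums.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical sketch for illustration only; not Lucene's IndexFileDeleter.
final class ChecksumGuard {
    // Checksum recorded when we wrote each file, keyed by file name.
    private final Map<String, Long> expected = new HashMap<>();

    static long crc32(byte[] content) {
        CRC32 crc = new CRC32();
        crc.update(content, 0, content.length);
        return crc.getValue();
    }

    /** Remember the checksum of a file at the time we wrote it. */
    void recordWritten(String name, byte[] content) {
        expected.put(name, crc32(content));
    }

    /**
     * Allow deletion only if we wrote a file with this name and its current
     * content still has the checksum we recorded. A file rewritten with
     * byte-identical content still passes, which is the false positive
     * mentioned above.
     */
    boolean safeToDelete(String name, byte[] currentContent) {
        Long recorded = expected.get(name);
        return recorded != null && recorded == crc32(currentContent);
    }
}
```

With this guard, a file that another writer recreated under the same name with different
content is left alone, while an unknown file name is never deleted.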

> NRT Reader close can wipe index it doesn't own
> ----------------------------------------------
>
>                 Key: LUCENE-5574
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5574
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.8, 5.0, 4.7.1
>            Reporter: Simon Willnauer
>            Priority: Critical
>             Fix For: 4.8, 5.0
>
>         Attachments: LUCENE-5574.patch, LUCENE-5574.patch, LUCENE-5574.patch
>
>
> Today NRT Readers try to clean up unused files via their IW reference when they are closed.
> Yet, if the index writer is already closed, another index could have been created on the same
> directory, which can create the same files as the IW before. For the NRT Reader those files
> are not referenced and it will simply wipe them away. If you use this in a replication scenario
> where directories are reused, this can simply wipe your index away or, in combination with the
> FSync issue LUCENE-5570, create 0-byte files. I have a test that reproduces this issue.




