jackrabbit-oak-issues mailing list archives

From "Alex Parvulescu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (OAK-1104) SegmentNodeStore rebase operation assumes wrong child node order
Date Mon, 21 Oct 2013 12:34:44 GMT
Alex Parvulescu created OAK-1104:

             Summary: SegmentNodeStore rebase operation assumes wrong child node order
                 Key: OAK-1104
                 URL: https://issues.apache.org/jira/browse/OAK-1104
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: core, segmentmk
    Affects Versions: 0.10
            Reporter: Alex Parvulescu

This popped up during the async merge process. The merge first does a rebase, which can fail,
making some index files look like they disappeared [0] and masking the actual root cause.

The problem is that the rebase failed and, in doing so, removed the file that is later reported
as missing. This can be seen by analyzing the ':conflict' marker info:
bq. addExistingNode {_b_Lucene41_0.doc, _b.fdx, _b.fdt, _b_4.del, }
so it points to something trying to add some index-related files twice, almost like a concurrent
commit exception.

Digging deeper, I found that the rebase operation assumes a certain order of child nodes during
the state comparison phase [1]. Based on that assumption it tries to read the mentioned nodes again,
thinking that they are new ones, when in fact they are already present in the list [2].
This causes a conflict that fails the entire async update process and also breaks any Lucene search,
as the index files are now gone and the index is in a corrupted state.
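
To illustrate the failure mode described above, here is a minimal sketch in plain Java. It is not
the MapRecord code from [1]; the class and method names (ChildOrderDiffSketch, addedAssumingSameOrder,
addedByName) are hypothetical, and the lockstep walk is only an assumed simplification of an
order-dependent comparison. It shows how a diff that expects both child lists in the same order can
report already existing names as added, while a lookup by name does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChildOrderDiffSketch {

    // Hypothetical order-dependent diff: walks "after" while keeping a cursor
    // into "before", and treats every name that does not match the cursor
    // position as a newly added child.
    public static List<String> addedAssumingSameOrder(List<String> before, List<String> after) {
        List<String> added = new ArrayList<>();
        int cursor = 0;
        for (String name : after) {
            if (cursor < before.size() && before.get(cursor).equals(name)) {
                cursor++; // names line up, child is considered unchanged
            } else {
                added.add(name); // mismatch is interpreted as an added child
            }
        }
        return added;
    }

    // Order-independent alternative: a child is added only if its name
    // is absent from the "before" set.
    public static List<String> addedByName(List<String> before, List<String> after) {
        Set<String> known = new HashSet<>(before);
        List<String> added = new ArrayList<>();
        for (String name : after) {
            if (!known.contains(name)) {
                added.add(name);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        // Shortened, reordered excerpts of the child lists from this report.
        List<String> before = Arrays.asList(
                "_b_Lucene41_0.doc", "_b.fdx", "_b.fdt", "_b_4.del");
        List<String> after = Arrays.asList(
                "_3k.cfs", "_b.fdx", "_b_Lucene41_0.doc", "_b.fdt", "_b_4.del");

        System.out.println("order-dependent: " + addedAssumingSameOrder(before, after));
        System.out.println("by name:         " + addedByName(before, after));
    }
}
```

With the shortened lists in main, the order-dependent variant reports existing files such as _b.fdx
and _b.fdt as added, which is the kind of false "addExistingNode" conflict seen in the marker, while
the name-based variant reports only _3k.cfs.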

[0]
*WARN* [pool-5-thread-2] org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate Index update async failed
org.apache.jackrabbit.oak.api.CommitFailedException: OakLucene0004: Failed to close the Lucene index
	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:122)
	at org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:64)
	at org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:64)
	at org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:129)
	at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:56)
	at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:100)
	at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
	at org.quartz.core.JobRunShell.run(JobRunShell.java:207)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.FileNotFoundException: _b_Lucene41_0.doc
	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory.openInput(OakDirectory.java:145)

[1] http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/MapRecord.java?view=markup#l329

[2] before child list
[_b_Lucene41_0.doc, _b.fdx, _b.fdt, segments_34, _b_4.del, _b_Lucene41_0.pos, _b.nvm, _b.nvd,
_b.fnm, _3n.si, _b_Lucene41_0.tip, _b_Lucene41_0.tim, _3n.cfe, segments.gen, _3n.cfs, _b.si]

after list
_b_Lucene41_0.pos, _3k.cfs, _3j_1.del, _b.nvm, _b.nvd, _3d.cfe, _3d.cfs, _b.fnm, _3j.si, _3h.si,
_3i.cfe, _3i.cfs, _3e_2.del, _3f.si, _b_Lucene41_0.tip, _b_Lucene41_0.tim, segments.gen, _3e.cfe,
_3e.cfs, _b.si, _3g.si, _3l.si, _3i_1.del, _3d_3.del, _3e.si, _3d.si, _b_Lucene41_0.doc, _3h_2.del,
_3i.si, _3k_1.del, _3j.cfe, _3j.cfs, _b.fdx, _b.fdt, _3g_1.del, _3k.si, _3l.cfe, _3l.cfs,
segments_33, _3f_1.del, _3h.cfe, _3h.cfs, _b_4.del, _3f.cfe, _3f.cfs, _3g.cfe, _3g.cfs

This message was sent by Atlassian JIRA
