jackrabbit-oak-issues mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] [Commented] (OAK-7852) Blocked background flush can cause severe data loss
Date Fri, 19 Oct 2018 09:35:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-7852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656521#comment-16656521 ]

Michael Dürig commented on OAK-7852:

The problem is with the thread affinity in {{SegmentBufferWriterPool}}: if each update to
the store happens on a new thread, each thread receives its own {{SegmentBufferWriter}}.
Without background flushes, those {{SegmentBufferWriter}} instances are never flushed, and
each loses its latest segment on an unclean shutdown.
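
The effect of the thread affinity can be illustrated with a toy model. This is a minimal sketch, not Oak code: {{ToyWriter}}, {{ToyWriterPool}} and {{ThreadAffinityDemo}} are hypothetical stand-ins that only assume a pool handing out one buffering writer per calling thread, with a {{flushAll()}} that plays the role of the background flush.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: hypothetical stand-ins, not Oak's SegmentBufferWriter
// or SegmentBufferWriterPool.
class ToyWriter {
    private final List<String> buffer = new ArrayList<>(); // pending, not yet persisted
    private final List<String> persisted;

    ToyWriter(List<String> persisted) { this.persisted = persisted; }

    synchronized void write(String record) { buffer.add(record); }

    synchronized void flush() {
        persisted.addAll(buffer);
        buffer.clear();
    }

    synchronized int pending() { return buffer.size(); }
}

class ToyWriterPool {
    // One writer per thread: the "thread affinity" being modelled.
    private final Map<Thread, ToyWriter> writers = new ConcurrentHashMap<>();
    private final List<String> persisted = Collections.synchronizedList(new ArrayList<>());

    void write(String record) {
        writers.computeIfAbsent(Thread.currentThread(), t -> new ToyWriter(persisted))
               .write(record);
    }

    // Stands in for the periodic background flush.
    void flushAll() { writers.values().forEach(ToyWriter::flush); }

    int writerCount() { return writers.size(); }

    int unflushedRecords() {
        return writers.values().stream().mapToInt(ToyWriter::pending).sum();
    }

    int persistedRecords() { return persisted.size(); }
}

class ThreadAffinityDemo {
    // Performs `updates` writes, each on a fresh thread, and never flushes.
    static ToyWriterPool runWithoutFlush(int updates) {
        ToyWriterPool pool = new ToyWriterPool();
        for (int i = 0; i < updates; i++) {
            String record = "update-" + i;
            Thread t = new Thread(() -> pool.write(record));
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return pool;
    }

    public static void main(String[] args) {
        ToyWriterPool pool = runWithoutFlush(5);
        System.out.println("writers:   " + pool.writerCount());
        System.out.println("unflushed: " + pool.unflushedRecords());
        System.out.println("persisted: " + pool.persistedRecords());
    }
}
```

In this model every update ends up in a different writer, so as long as {{flushAll()}} never runs, nothing reaches the persisted list and a hard kill would drop every buffered record.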

See [https://github.com/mduerig/jackrabbit-oak/commit/877dc25f8e7cb18db57f74f0685b2af40b585050]
for an initial test case simulating this situation. To be able to simulate a blocked background
thread I had to hack open access to {{FileStore#fileStoreScheduler}}. Not sure whether we
really want to do this or whether there are better ways to handle this. 
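
One way to picture a blocked background thread without touching {{FileStore}} internals is a plain {{java.util.concurrent}} scheduler whose only worker is held up by a latch. The sketch below is not the linked test case and makes no claim about how {{FileStore#fileStoreScheduler}} is implemented; {{BlockedSchedulerDemo}} is a hypothetical name and the single-threaded scheduler is an assumption made for the illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not Oak code.
class BlockedSchedulerDemo {

    // Returns { flushes counted while the worker was blocked,
    //           flushes counted after the worker was released }.
    static int[] run() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch blockerStarted = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger flushes = new AtomicInteger();
        try {
            // Occupy the only worker thread, as a deadlocked task would.
            scheduler.execute(() -> {
                blockerStarted.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            blockerStarted.await();

            // The periodic "flush" is queued but cannot run while the worker is blocked.
            scheduler.scheduleAtFixedRate(flushes::incrementAndGet, 0, 10, TimeUnit.MILLISECONDS);
            Thread.sleep(200);
            int whileBlocked = flushes.get();

            // Unblock the worker and wait (bounded) for the first flush to run.
            release.countDown();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (flushes.get() == 0 && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            return new int[] { whileBlocked, flushes.get() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("flushes while blocked: " + result[0]);
        System.out.println("flushes after release: " + result[1]);
    }
}
```

While the latch is held, the flush counter stays at zero no matter how long the caller waits, which is the condition a test for this issue needs to reproduce before killing the store.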


> Blocked background flush can cause severe data loss
> ----------------------------------------------------
>                 Key: OAK-7852
>                 URL: https://issues.apache.org/jira/browse/OAK-7852
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>            Priority: Major
>             Fix For: 1.10
> When the {{FileStore background task}} fails (e.g. because of a deadlock) and the {{FileStore}}
> is subsequently shut down in an unclean way ({{kill -9}}), there is a risk of severe data
> loss. Although a journal could be reconstructed from the segments, there is a chance that
> most if not all of the revisions written since the failure of the background task are inconsistent
> with a {{SNFE}}.
> The expectation for such a case should be that a journal could be reconstructed from
> the segments and that all but the last few revisions are consistent.

This message was sent by Atlassian JIRA
