jackrabbit-oak-issues mailing list archives

From "Davide Giannella (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (OAK-6659) Cold standby should fail loudly when a big blob can't be timely transferred
Date Fri, 29 Sep 2017 10:10:10 GMT

     [ https://issues.apache.org/jira/browse/OAK-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davide Giannella closed OAK-6659.

Bulk close for 1.7.8

> Cold standby should fail loudly when a big blob can't be timely transferred
> ---------------------------------------------------------------------------
>                 Key: OAK-6659
>                 URL: https://issues.apache.org/jira/browse/OAK-6659
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar, tarmk-standby
>    Affects Versions: 1.7.6
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Critical
>              Labels: cold-standby
>             Fix For: 1.7.8
>         Attachments: OAK-6659.patch
> Due to changes made in OAK-4969, {{StandbyDiff#childNodeChanged}} currently triggers two
'sync blob' cycles. The test scenario is the same as the one in {{DataStoreTestBase#testSyncBigBlob}}:
a new big blob (1GB) is added to the primary file store, and a standby sync is then triggered
to sync this content to the secondary file store.
> The first 'sync blob' cycle happens as a result of {{#process}} being called in {{StandbyDiff#childNodeChanged}}.
As a result, a new 'get blob' request is created on the client and the server starts sending
chunks of the big blob. If the time needed to transfer the entire blob from server
to client exceeds {{readTimeoutMs}}, an {{IllegalStateException}} is correctly thrown
by {{StandbyDiff#readBlob}}, but it is swallowed by the catch clause of {{StandbyDiff#childNodeChanged}}.
A second 'sync blob' cycle is then triggered. It may succeed with the same {{readTimeoutMs}}
that failed before: if {{readTimeoutMs * 2}} is enough, the blob will be synced on the standby.
This happens because the server continues sending the remaining chunks after the
{{IllegalStateException}} was thrown (first 'sync blob' cycle).
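The swallowed timeout described above can be illustrated with a minimal, self-contained sketch. This is not the code from {{StandbyDiff}}; the names {{SwallowedTimeoutSketch}}, {{FlakyBlobReader}}, {{syncQuietly}} and {{syncLoudly}} are hypothetical and exist only for illustration. It assumes a reader that times out a fixed number of times before succeeding, and contrasts a caller that catches the {{IllegalStateException}} and silently starts a second cycle with one that lets it propagate.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names exist in Oak.
class SwallowedTimeoutSketch {

    /** Stand-in for a blob read that times out on its first attempts. */
    static final class FlakyBlobReader {
        private final int failuresBeforeSuccess;
        final AtomicInteger attempts = new AtomicInteger();

        FlakyBlobReader(int failuresBeforeSuccess) {
            this.failuresBeforeSuccess = failuresBeforeSuccess;
        }

        void readBlob(String blobId) {
            if (attempts.incrementAndGet() <= failuresBeforeSuccess) {
                throw new IllegalStateException(
                        "Blob " + blobId + " not received within the read timeout");
            }
        }
    }

    /** Swallows the timeout and starts a second cycle; returns the attempts used. */
    static int syncQuietly(FlakyBlobReader reader, String blobId) {
        try {
            reader.readBlob(blobId);
        } catch (IllegalStateException e) {
            // Swallowed: nothing is reported, and a second 'sync blob' cycle runs.
            reader.readBlob(blobId);
        }
        return reader.attempts.get();
    }

    /** Fails loudly: the timeout reaches the caller after a single attempt. */
    static int syncLoudly(FlakyBlobReader reader, String blobId) {
        reader.readBlob(blobId);
        return reader.attempts.get();
    }

    public static void main(String[] args) {
        System.out.println("quiet attempts: "
                + syncQuietly(new FlakyBlobReader(1), "blob-1"));
        try {
            syncLoudly(new FlakyBlobReader(1), "blob-1");
        } catch (IllegalStateException e) {
            System.out.println("loud failure: " + e.getMessage());
        }
    }
}
```

In the quiet variant the caller only sees a sync that took two attempts; in the loud variant the first timeout is visible and the sync stops there.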
> The consequence of these two 'sync blob' cycles is that deleting the temporary file to which
chunks are spooled on the client sometimes fails (for example on Windows; see OAK-6641
specifically). When that happens, the previous incomplete transfer is not deleted and new chunks
from the second 'sync blob' cycle are appended to it. The blob persisted in the blob store on the
client then won't have the same size and id as the initial blob sent by the server.
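
The size mismatch can be reproduced in miniature. The sketch below is a simplification under stated assumptions, not Oak code: {{StaleSpoolFileSketch}}, {{spoolChunks}} and {{transfer}} are hypothetical names, chunks are 4 bytes each, and a failed delete is modelled by simply skipping the delete. The first cycle spools a few chunks before timing out, and the second cycle then spools the whole blob into the same file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; none of these names exist in Oak.
class StaleSpoolFileSketch {
    static final int CHUNK_SIZE = 4; // assumed chunk size, for illustration only

    /** Appends the given number of fixed-size chunks to the spool file. */
    static void spoolChunks(Path spoolFile, int chunks) throws IOException {
        byte[] chunk = new byte[CHUNK_SIZE];
        for (int i = 0; i < chunks; i++) {
            Files.write(spoolFile, chunk,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }

    /**
     * Spools 'partial' chunks (first cycle, cut short), optionally deletes the
     * spool file, then spools all 'total' chunks (second cycle).
     * Returns the final size of the spool file in bytes.
     */
    static long transfer(int total, int partial, boolean deleteSucceeds) {
        try {
            Path spool = Files.createTempFile("blob-spool", ".tmp");
            try {
                spoolChunks(spool, partial);
                if (deleteSucceeds) {
                    Files.delete(spool); // cleanup of the incomplete transfer
                }
                spoolChunks(spool, total);
                return Files.size(spool);
            } finally {
                Files.deleteIfExists(spool);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("delete succeeded: " + transfer(5, 2, true) + " bytes");
        System.out.println("delete failed:    " + transfer(5, 2, false) + " bytes");
    }
}
```

With 5 chunks in the blob and 2 chunks spooled before the timeout, the file holds 20 bytes when the delete succeeds and 28 bytes when it does not, so any id derived from the content would differ as well.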

This message was sent by Atlassian JIRA