jackrabbit-oak-issues mailing list archives

From "Francesco Mari (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-6678) Syncing big blobs fails since StandbyServer sends persisted head
Date Wed, 27 Sep 2017 13:35:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-6678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182563#comment-16182563 ]

Francesco Mari commented on OAK-6678:
-------------------------------------

[~dulceanu], thanks for the updated patch, but I'm not convinced that this is the best solution.
The timeout for reading the head state must not come from the standby: a standby that specifies
an excessively high timeout in its requests could cause an unintended denial of service on the
primary. I think it is preferable to modify the {{DefaultStandbyHeadReader}} to something like this:

{noformat}
class DefaultStandbyHeadReader implements StandbyHeadReader {

    private final FileStore store;

    private final long timeout;

    DefaultStandbyHeadReader(FileStore store, long timeout) {
        this.store = store;
        this.timeout = timeout;
    }

    @Override
    public String readHeadRecordId() {
        RecordId persistedHead = readPersistedHeadWithRetry(store, timeout);
        return persistedHead != null ? persistedHead.toString() : null;
    }

}
{noformat}
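
The snippet above calls {{readPersistedHeadWithRetry}} without showing it. As a rough illustration only, a retry loop bounded by a server-side timeout could look like the sketch below. The class name, the method name {{readWithRetry}}, the generic {{Supplier}} standing in for the file store read, and the fixed retry interval are all assumptions made for this example; the actual patch may schedule its attempts differently.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of Oak or of the attached patch.
final class RetryingReadSketch {

    private RetryingReadSketch() {}

    /**
     * Calls the supplier until it returns a non-null value or the timeout elapses.
     * Returns null if no value was obtained in time or the thread was interrupted.
     */
    static <T> T readWithRetry(Supplier<T> read, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        long intervalNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
        while (true) {
            T value = read.get();
            if (value != null) {
                return value;
            }
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return null; // timeout reached, the caller decides how to answer the client
            }
            try {
                // Wait for the retry interval (assumed fixed), but never past the deadline.
                TimeUnit.NANOSECONDS.sleep(Math.min(remaining, intervalNanos));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return null;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulates a persisted head that becomes available on the third read.
        String head = readWithRetry(() -> ++calls[0] < 3 ? null : "head-record-id", 1000, 10);
        System.out.println(head + " after " + calls[0] + " attempts");

        // A head that never appears yields null once the timeout elapses.
        String missing = readWithRetry(() -> null, 50, 10);
        System.out.println(missing);
    }
}
```

Because the timeout is a parameter of the server-side helper, the standby has no way to stretch it from its request.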

The timeout should be part of the configuration of the {{StandbyServer}}, which propagates
it to the {{DefaultStandbyHeadReader}} when the server pipeline is created. If a persisted
head state is not found, the primary should return a timely response to the client. That is,
the {{GetHeadRequestHandler}} should be modified to do the following.

{noformat}
String id = reader.readHeadRecordId();

if (id == null) {
    ctx.writeAndFlush(new NotFoundGetHeadResponse(msg.getClientId(), id));
    return;
}

ctx.writeAndFlush(new GetHeadResponse(msg.getClientId(), id));
{noformat}
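
To make the intended behaviour easier to reason about outside Netty, here is a small self-contained sketch of the same decision. Every type in it is a simplified stand-in invented for this example ({{HeadReaderSketch}}, {{HeadFound}}, {{HeadNotFound}}, {{GetHeadHandlerSketch}}), and a plain {{Consumer}} replaces the channel context; it is not the real handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Simplified stand-ins for illustration; these are not the real Oak or Netty types.
interface HeadReaderSketch {
    // Returns null when no persisted head could be read within the server-side timeout.
    String readHeadRecordId();
}

final class HeadFound {
    final String clientId;
    final String recordId;

    HeadFound(String clientId, String recordId) {
        this.clientId = clientId;
        this.recordId = recordId;
    }
}

final class HeadNotFound {
    final String clientId;

    HeadNotFound(String clientId) {
        this.clientId = clientId;
    }
}

final class GetHeadHandlerSketch {
    private final HeadReaderSketch reader;

    GetHeadHandlerSketch(HeadReaderSketch reader) {
        this.reader = reader;
    }

    // Writes exactly one response per request, so the client is answered
    // well before its own read timeout even when no head is available.
    void handle(String clientId, Consumer<Object> out) {
        String id = reader.readHeadRecordId();
        if (id == null) {
            out.accept(new HeadNotFound(clientId));
            return;
        }
        out.accept(new HeadFound(clientId, id));
    }

    public static void main(String[] args) {
        List<Object> sent = new ArrayList<>();
        new GetHeadHandlerSketch(() -> null).handle("Bar", sent::add);
        new GetHeadHandlerSketch(() -> "some-record-id").handle("Bar", sent::add);
        for (Object response : sent) {
            System.out.println(response.getClass().getSimpleName());
        }
    }
}
```

The point of the explicit not-found response is that the null case no longer surfaces as a {{NullPointerException}} on the primary.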

This way, if the timeout in {{DefaultStandbyHeadReader}} is substantially less than the read
timeout used by the client, we should be able to gracefully handle the absence of a persisted
head state both on the server and the client. For example, the timeout in {{DefaultStandbyHeadReader}}
could be about 1s, while the default value of the read timeout on the client is 60s. The 1s
on the server, given the implementation in the patch, translates to eight consecutive attempts
at reading the head state IIUC.

What do you think about that?

> Syncing big blobs fails since StandbyServer sends persisted head
> ----------------------------------------------------------------
>
>                 Key: OAK-6678
>                 URL: https://issues.apache.org/jira/browse/OAK-6678
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar, tarmk-standby
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>              Labels: cold-standby, resilience
>             Fix For: 1.8, 1.7.9
>
>         Attachments: OAK-6678-02.patch, OAK-6678.patch
>
>
> With changes for OAK-6653 in place, {{ExternalPrivateStoreIT#testSyncBigBlob}} and sometimes {{ExternalSharedStoreIT#testSyncBigBlob}} are failing on CI:
> {noformat}
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT)  Time elapsed: 96.82 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } }>
> ...
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT)  Time elapsed: 95.254 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } }>
> {noformat}
> Partial stacktrace:
> {noformat}
> 14:09:08.355 DEBUG [main] StandbyServer.java:242            Binding was successful
> 14:09:08.358 DEBUG [standby-1] GetHeadRequestEncoder.java:33 Sending request from client Bar for current head
> 14:09:08.359 DEBUG [primary-1] ClientFilterHandler.java:53  Client /127.0.0.1:52988 is allowed
> 14:09:08.360 DEBUG [primary-1] RequestDecoder.java:42       Parsed 'get head' message
> 14:09:08.360 DEBUG [primary-1] CommunicationObserver.java:79 Message 'get head' received from client Bar
> 14:09:08.362 DEBUG [primary-1] GetHeadRequestHandler.java:43 Reading head for client Bar
> 14:09:08.363 WARN  [primary-1] ExceptionHandler.java:31     Exception caught on the server
> java.lang.NullPointerException: null
> 	at org.apache.jackrabbit.oak.segment.standby.server.DefaultStandbyHeadReader.readHeadRecordId(DefaultStandbyHeadReader.java:32) ~[oak-segment-tar-1.8-SNAPSHOT.jar:1.8-SNAPSHOT]
> 	at org.apache.jackrabbit.oak.segment.standby.server.GetHeadRequestHandler.channelRead0(GetHeadRequestHandler.java:45) ~[oak-segment-tar-1.8-SNAPSHOT.jar:1.8-SNAPSHOT]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
