jackrabbit-oak-issues mailing list archives

From Tomek Rękawek (JIRA) <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (OAK-2106) Optimize reads from secondaries
Date Wed, 04 Nov 2015 09:19:27 GMT

     [ https://issues.apache.org/jira/browse/OAK-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Tomek Rękawek updated OAK-2106:
    Comment: was deleted

(was: {quote}Let's say the estimator measures a lag of 2 seconds at time T. That is, secondaries
have synced up to T-2s. At T+5s the secondaries still lag behind at T-2s.{quote}

Let S be the secondary optime, P the primary optime and T the current time. The lag is measured
as S-P, not S-T. This should avoid the case in which the lag is large, but we happen
to measure it right after some operation has been applied.

If we want to make it more reliable, we can keep e.g. the last 10 values and return the largest.
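
A minimal sketch of the "last 10 values" idea, not taken from the Oak code base. The class name {{LagWindow}} and its methods are hypothetical, and returning {{Long.MAX_VALUE}} before the first sample is an assumption made here so that an unknown lag is treated as "too large to trust a secondary".

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: keeps the last N lag samples and reports the largest one. */
class LagWindow {
    private final int capacity;
    private final Deque<Long> samples = new ArrayDeque<>();

    LagWindow(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records one lag sample (milliseconds), evicting the oldest sample when full. */
    synchronized void add(long lagMillis) {
        if (samples.size() == capacity) {
            samples.removeFirst();
        }
        samples.addLast(lagMillis);
    }

    /** Largest recorded lag; Long.MAX_VALUE (assumption) if nothing was measured yet. */
    synchronized long max() {
        if (samples.isEmpty()) {
            return Long.MAX_VALUE;
        }
        long max = Long.MIN_VALUE;
        for (long sample : samples) {
            max = Math.max(max, sample);
        }
        return max;
    }
}
```

With {{new LagWindow(10)}}, one low reading taken right after an operation was applied cannot hide a larger lag seen in any of the previous nine measurements.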

{quote}I'm also a bit concerned about introducing a dependency from MongoDocumentStore to
classes like UnmergedBranches and UnsavedModifications.
I would rather like to see a solution where the client of the DocumentStore can express how
fresh the document needs to be when it reads from the store.{quote}

It concerns me as well (as this is some kind of circular dependency), but I wasn't able to
find anything better. The access to unmerged branches is necessary so we won't ask the secondary
about a path belonging to a branch. It doesn't depend on the time, as a user may modify many
nodes (which will result in creating a branch) and keep the changes unmerged for a very long time.

The situation looks a bit different with the UnsavedModifications, as they are saved on a regular
basis ({{asyncDelay}}) - we can add this value to the estimated lag to be sure that the background
update thread has run and the changes have been replicated.
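
A rough sketch of how {{asyncDelay}} could be added to the estimated lag when deciding whether a document is old enough to be read from a secondary. The class and method names are hypothetical and this is not how {{MongoDocumentStore}} is implemented. It assumes all values are non-negative milliseconds taken from comparable clocks, and it deliberately ignores unmerged branches, which need a separate check.

```java
/** Hypothetical check, for illustration only. */
final class SecondaryReadCheck {
    private SecondaryReadCheck() {
    }

    /**
     * Returns true if the document was last modified longer ago than the
     * estimated replication lag plus the background update interval.
     */
    static boolean canReadFromSecondary(long lastModifiedMillis,
                                        long nowMillis,
                                        long estimatedLagMillis,
                                        long asyncDelayMillis) {
        // Guard against overflow when the lag is unknown and reported as a huge value.
        if (estimatedLagMillis > Long.MAX_VALUE - asyncDelayMillis) {
            return false;
        }
        long age = nowMillis - lastModifiedMillis;
        return age > estimatedLagMillis + asyncDelayMillis;
    }
}
```

For example, with an estimated lag of 2000 ms and an {{asyncDelay}} of 1000 ms, a document modified 9000 ms ago would qualify, while one modified 2000 ms ago would not.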

{quote}I would rather like to see a solution where the client of the DocumentStore can express
how fresh the document needs to be when it reads from the store. I think this also means the
decision whether a read can be directed to a secondary must not depend on the lag as a duration,
but should rather calculate a time when it is safe to read from a secondary.{quote}

We can take the {{find(maxCacheAge)}} parameter into consideration in {{getMongoReadPreference}};
however, it doesn't solve the issue with the unmerged branches.

{quote}The tricky part here is how to handle time differences on the machines where the Oak
cluster nodes are running and the MongoDB replica set. Each change on a document is associated
with a revision, where the timestamp of the revision is tied to the local clock where the
revision was created. The oplog timestamp on the other hand is derived from the primary replica
set member clock, I assume.{quote}

The replica set status is taken from the primary. For each secondary member we have 3
times available:

* optime - secondary time of the last operation applied,
* lastHeartbeat - secondary time of the last heartbeat sent,
* lastHeartbeatRecv - primary time of the last heartbeat received.

The primary member provides:

* optime,
* current timestamp.

As stated above, I estimate lag by subtracting primary optime from secondary optime. These
two times come from different machines and therefore clock differences will make it less
The other way of measuring the lag would be comparing lastHeartbeatRecv and the current timestamp.
These two times come from the same machine (primary). It tells us how often the secondary
asks for changes, but not how long it takes to apply them. Maybe the first thing is more
important - if so, I can change the estimation method.)
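
The two estimation methods discussed in the deleted comment can be sketched as follows. This is an illustration under assumptions, not the patch attached to the issue. The class {{ReplicationLag}} and its methods are hypothetical, the timestamps are plain milliseconds that the caller has already extracted from the replica set status, and the optime difference is taken as primary minus secondary (clamped at zero) so that a lagging secondary yields a positive number.

```java
import java.util.List;

/** Hypothetical helper comparing the two lag estimates. */
final class ReplicationLag {
    private ReplicationLag() {
    }

    /**
     * Method 1: difference between the primary optime and each secondary optime.
     * The values come from different machines, so clock differences affect the result.
     */
    static long byOptime(long primaryOptimeMillis, List<Long> secondaryOptimesMillis) {
        long worst = 0;
        for (long secondaryOptime : secondaryOptimesMillis) {
            worst = Math.max(worst, primaryOptimeMillis - secondaryOptime);
        }
        return worst;
    }

    /**
     * Method 2: difference between the primary's current time and lastHeartbeatRecv.
     * Both values come from the primary's clock, but this only shows how recently
     * a secondary was heard from, not how long it takes to apply changes.
     */
    static long byHeartbeat(long primaryNowMillis, List<Long> lastHeartbeatRecvMillis) {
        long worst = 0;
        for (long received : lastHeartbeatRecvMillis) {
            worst = Math.max(worst, primaryNowMillis - received);
        }
        return worst;
    }
}
```

Both methods return the worst value across all secondaries, since a read may be routed to any of them.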

> Optimize reads from secondaries
> -------------------------------
>                 Key: OAK-2106
>                 URL: https://issues.apache.org/jira/browse/OAK-2106
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mongomk
>            Reporter: Marcel Reutegger
>            Assignee: Marcel Reutegger
>              Labels: performance, scalability
> OAK-1645 introduced support for reads from secondaries under certain
> conditions. The current implementation checks the _lastRev on a potentially
> cached parent document and reads from a secondary if it has not been
> modified in the last 6 hours. This timespan is somewhat arbitrary but
> reflects the assumption that the replication lag of a secondary shouldn't
> be more than 6 hours.
> This logic should be optimized to take the actual replication lag into
> account. MongoDB provides information about the replication lag with
> the command rs.status().

This message was sent by Atlassian JIRA
