ignite-dev mailing list archives

From Anton Vinogradov ...@apache.org>
Subject Crash recovery speed-up #3, Cellular Switch
Date Wed, 06 May 2020 10:24:01 GMT

PME-free switch [1] (since 2.8) skips PME on node left when possible
(baseline + fully rebalanced cluster).
This means we already wait for nothing (except recovery) to perform the
switch.
This optimization allows already started operations to continue during or
after the switch if they are not affected by the failed primary.
But upcoming operations still can't be started until the switch is finished.

Let me propose an additional optimization - Cellular switch.
Cellular Affinity [2] means that nodes are combined into virtual cells where,
for each partition, the backups are located in the same cell as the primary.
The simplest way to gain Cellular Affinity is to use backup filters [3].
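
To illustrate the idea of a cell-aware backup filter, here is a minimal toy
model in Python. It is not Ignite code: the node names, the "cell" attribute,
the CRC-based ranking and the function names are all assumptions made for the
sketch. It only shows that a filter which accepts a backup candidate when it
shares the primary's cell keeps every partition inside one cell.

```python
import zlib

# Hypothetical cluster: node id -> value of an assumed "cell" attribute.
NODE_CELLS = {
    "n1": "cell-A", "n2": "cell-A", "n3": "cell-A",
    "n4": "cell-B", "n5": "cell-B", "n6": "cell-B",
}


def same_cell_filter(candidate, owners, cells):
    """Accept a backup candidate only if it is in the primary's cell.

    `owners` holds the nodes already chosen for the partition, primary first.
    """
    return cells[candidate] == cells[owners[0]]


def assign_partition(part, backups, cells=NODE_CELLS):
    """Toy assignment: rank nodes by a deterministic per-partition hash,
    take the first as primary, then add backups that pass the filter."""
    ranked = sorted(cells, key=lambda n: zlib.crc32(f"{part}:{n}".encode()))
    owners = [ranked[0]]
    for node in ranked[1:]:
        if len(owners) == backups + 1:
            break
        if same_cell_filter(node, owners, cells):
            owners.append(node)
    return owners


if __name__ == "__main__":
    for part in range(4):
        print(part, assign_partition(part, backups=2))
```

With three nodes per cell and two backups, every partition ends up with all
of its owners in a single cell, so a failed node can only affect partitions
of its own cell.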

Cellular Affinity allows the switch to finish instantly outside the affected
cell, under the following assumptions:
- Replicated caches should be recovered first, since every node is affected
(as a backup) by any failed primary.
  However, replicated caches are expected to be effectively read-only
(updates are extremely rare), so there is nothing to wait for here.
- Upcoming replicated transactions (with non-failed primaries) can be
started but can't be committed until the switch is finished cluster-wide.
- Upcoming transactions related to the broken cell will wait for cell
recovery (cluster-wide switch finish).
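
The rules above can be read as a small decision function. The sketch below
is one possible reading of them in Python, not the logic of the actual fix:
the `Decision` enum, the `classify` function and its parameters are
hypothetical names introduced only for illustration.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "start and commit during the switch"
    START_ONLY = "start now, commit after the cluster-wide switch"
    WAIT = "wait for the cluster-wide switch to finish"


def classify(primary_cells, failed_cell, replicated):
    """Decide what an upcoming transaction may do while the switch runs.

    primary_cells: set of cells hosting the primaries the transaction uses
    failed_cell:   the cell that contains the failed node
    replicated:    True if the transaction touches a replicated cache
    """
    if failed_cell in primary_cells:
        # Related to the broken cell: wait for cell recovery.
        return Decision.WAIT
    if replicated:
        # Non-failed primaries, but every node is a backup of a replicated
        # cache: may start, commit is deferred until the switch finishes.
        return Decision.START_ONLY
    # Partitioned data fully outside the affected cell.
    return Decision.PROCEED


if __name__ == "__main__":
    print(classify({"cell-B"}, "cell-A", replicated=False).value)
```

Under this reading, only transactions that touch the failed cell or a
replicated cache depend on the cluster-wide switch.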

... and this means:
In addition to the PME-free switch, where we are able to continue already
started operations during or after the switch, we are now also able to
perform most of the upcoming operations during the switch.

In other words, the Cellular switch has little effect on an operation's
latency when the operation is not related to the failed cell.

According to the benchmark [4], which checks "how fast upcoming transactions
(started after switch start) can be committed when we have thousands of
prepared transactions (prepared before switch start)", operation latency is
5326 ms [5] on master and 65 ms [6] with the proposed fix, which is roughly
80 times faster (5326 / 65 is about 82).

The fix [7] (as part of IEP-45 [8]) is ready to be reviewed.
Waiting for your review!

[7] https://issues.apache.org/jira/browse/IGNITE-12617
