flink-issues mailing list archives

From ChrisChinchilla <...@git.apache.org>
Subject [GitHub] flink pull request #4543: [FLINK-7449] [docs] Additional documentation for i...
Date Fri, 24 Nov 2017 23:37:42 GMT
Github user ChrisChinchilla commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4543#discussion_r153033430
  
    --- Diff: docs/ops/state/checkpoints.md ---
    @@ -99,3 +99,296 @@ above).
     ```sh
     $ bin/flink run -s :checkpointMetaDataPath [:runArgs]
     ```
    +
    +## Incremental Checkpoints
    +
    +### Synopsis
    +
    +Incremental checkpoints can significantly reduce checkpointing time in comparison to full checkpoints, at the cost of a
    +(potentially) longer recovery time. The core idea is that incremental checkpoints only record changes in state since the
    +previously completed checkpoint instead of producing a full, self-contained backup of the state backend. In this way,
    +incremental checkpoints can build upon previous checkpoints.
    +
    +`RocksDBStateBackend` is currently the only backend that supports incremental checkpoints.
    +
    +Flink leverages RocksDB's internal backup mechanism in a way that is self-consolidating over time. As a result, the
    +incremental checkpoint history in Flink does not grow indefinitely, and old checkpoints are eventually subsumed and
    +pruned automatically.
    +
    +**While we strongly encourage the use of incremental checkpoints for Flink jobs with large state, please note that this
    +is a new feature and is currently not enabled by default.**
    +
    +To enable this feature, users can instantiate a `RocksDBStateBackend` with the corresponding boolean flag in the
    +constructor set to `true`, e.g.:
    +
    +```java
    +RocksDBStateBackend backend =
    +    new RocksDBStateBackend(filebackend, true);
    +```
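    +
    +For context, here is a minimal, self-contained sketch of how such a backend might be wired into a job. The class name,
    +checkpoint URI, and checkpoint interval below are illustrative placeholders, not prescribed values:
    +
    +```java
    +import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    +import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    +import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    +
    +public class IncrementalCheckpointingExample {
    +
    +    public static void main(String[] args) throws Exception {
    +        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    +
    +        // Take a checkpoint once per minute (interval chosen purely for illustration).
    +        env.enableCheckpointing(60_000);
    +
    +        // The file system backend points at a placeholder checkpoint URI; the second
    +        // constructor argument is what enables incremental checkpointing.
    +        FsStateBackend filebackend = new FsStateBackend("hdfs:///flink/checkpoints");
    +        RocksDBStateBackend backend = new RocksDBStateBackend(filebackend, true);
    +        env.setStateBackend(backend);
    +
    +        // ... define sources, transformations, and sinks here, then call env.execute(...)
    +    }
    +}
    +```
    +
    +Only the boolean flag in the `RocksDBStateBackend` constructor switches on incremental checkpoints; the rest of the
    +setup above is the usual checkpointing configuration.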
    +
    +### Use-case for Incremental Checkpoints
    +
    +Checkpoints are the centerpiece of Flink’s fault tolerance mechanism, and each checkpoint represents a consistent
    +snapshot of the distributed state of a Flink job from which the system can recover in case of a software or machine
    +failure (see [here]({{ site.baseurl }}/internals/stream_checkpointing.html)).
    +
    +Flink creates checkpoints periodically to track the progress of a job so that, in case of failure, only those
    +(hopefully few) *events that have been processed after the last completed checkpoint* must be reprocessed from the data
    +source. The number of events that must be reprocessed directly affects recovery time, and so for the fastest recovery,
    +we want to *take checkpoints as often as possible*.
    +
    +However, checkpoints are not without performance cost and can introduce *considerable overhead* to the system. This
    +overhead can lead to lower throughput and higher latency during the time that checkpoints are created. One reason is
    +that, traditionally, each checkpoint in Flink always represented the *complete state* of the job at the time of the
    +checkpoint, and all of the state had to be written to stable storage (typically some distributed file system) for every
    +single checkpoint. Writing multiple terabytes (or more) of state data for each checkpoint can obviously create
    +significant load for the I/O and network subsystems, on top of the normal load from the pipeline’s data processing work.
    +
    +Before incremental checkpoints, users were stuck with a suboptimal tradeoff between recovery time and checkpointing
    +overhead: fast recovery and low checkpointing overhead were conflicting goals. This is exactly the problem that
    +incremental checkpoints solve.
    +
    +
    +### Basics of Incremental Checkpoints
    +
    +In this section, we briefly discuss the idea behind incremental checkpoints in a simplified manner, just to get the
    +concept across.
    +
    +Our motivation for incremental checkpointing stemmed from the observation that it is often wasteful to write the full
    +state of a job for every single checkpoint. In most cases, the state between two checkpoints is not drastically
    +different: only a fraction of the state data is modified and some new data is added. Instead of writing the full state
    +into each checkpoint again and again, we could record only changes in state since the previous checkpoint. As long as we
    --- End diff --
    
    @StefanRRichter I'm struggling to understand this sentence.
    
    > As long as we have the previous checkpoint and the state changes for the current checkpoint, we can restore the full, current state for the job.
    
    Do you mean…
    
    > As long as we have the previous checkpoint, if the state changes for the current checkpoint, we can restore the full, current state for the job.
    
    Or something different? Explain to me what you're trying to say :)


---
