flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5715) Asynchronous snapshotting for HeapKeyedStateBackend
Date Fri, 10 Mar 2017 12:45:05 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905007#comment-15905007 ]

ASF GitHub Bot commented on FLINK-5715:

Github user StephanEwen commented on a diff in the pull request:

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/filesystem/FsStateBackend.java
    @@ -97,6 +100,27 @@ public FsStateBackend(String checkpointDataUri) throws IOException
     	 * @param checkpointDataUri The URI describing the filesystem (scheme and optionally
     	 *                          and the path to the checkpoint data directory.
    +	 * @param asynchronousSnapshots Switch to enable asynchronous snapshots.
    +	 *
    +	 * @throws IOException Thrown, if no file system can be found for the scheme in the
    +	 */
    +	public FsStateBackend(String checkpointDataUri, boolean asynchronousSnapshots) throws IOException {
    --- End diff --
    We are getting one more parameter into the constructors with the change that makes the
    state backend handle all checkpoint/savepoint storage-related business. That one must be
    a constructor parameter, so if we can avoid further constructor parameters, that would
    help. Otherwise we really end up with 20 constructors.

> Asynchronous snapshotting for HeapKeyedStateBackend
> ---------------------------------------------------
>                 Key: FLINK-5715
>                 URL: https://issues.apache.org/jira/browse/FLINK-5715
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.3.0
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
> Blocking snapshots render the HeapKeyedStateBackend practically unusable for many users
> in production. Their jobs cannot tolerate stopped processing for the time it takes to write
> gigabytes of data from memory to disk. Asynchronous snapshots would be a solution to this
> problem. The challenge for the implementation is coming up with a copy-on-write (CoW) scheme
> for the in-memory hash maps that form the foundation of this backend. After taking a closer
> look, this problem is twofold. First, the hash map itself, as a mutable structure, needs CoW
> semantics, while avoiding costly locking or blocking where possible. Second, the mutable
> value objects need CoW as well, e.g. through cloning via serializers.
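
The two halves of the problem described above can be illustrated with a deliberately coarse sketch. This is not Flink's implementation: the class `CowMapSketch`, its methods, and the `valueCopier` function (a stand-in for "cloning via serializers") are hypothetical names chosen for illustration. It assumes a single writer thread and copies the whole table on the first write after a snapshot, whereas a production scheme would copy incrementally to keep pauses short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of copy-on-write snapshots; not Flink code. */
final class CowMapSketch<K, V> {
    private Map<K, V> table = new HashMap<>();
    // True while a snapshot may still be reading the current table.
    private boolean sharedWithSnapshot = false;
    // Assumption: stands in for a serializer-based deep copy of a value.
    private final UnaryOperator<V> valueCopier;

    CowMapSketch(UnaryOperator<V> valueCopier) {
        this.valueCopier = valueCopier;
    }

    /** O(1): returns a read-only view of the current table and marks it as shared. */
    Map<K, V> snapshot() {
        sharedWithSnapshot = true;
        return Collections.unmodifiableMap(table);
    }

    /** Part 1 (map structure) and part 2 (value objects): copy both before the first write. */
    private void ensureWritable() {
        if (sharedWithSnapshot) {
            Map<K, V> copy = new HashMap<>();
            for (Map.Entry<K, V> e : table.entrySet()) {
                copy.put(e.getKey(), valueCopier.apply(e.getValue()));
            }
            table = copy; // the old table is never mutated again
            sharedWithSnapshot = false;
        }
    }

    void put(K key, V value) {
        ensureWritable();
        table.put(key, value);
    }

    /** Use this accessor when the caller intends to mutate the returned value. */
    V getForUpdate(K key) {
        ensureWritable();
        return table.get(key);
    }

    /** Read-only access; the returned value must not be mutated. */
    V get(K key) {
        return table.get(key);
    }
}
```

In this sketch the snapshot view stays frozen because every write path goes through `ensureWritable()` first; a background thread could then write the view to disk while processing continues on the copied table.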

This message was sent by Atlassian JIRA
