flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2808) Rework / Extend the StatehandleProvider
Date Thu, 08 Oct 2015 16:08:26 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948915#comment-14948915 ]

ASF GitHub Bot commented on FLINK-2808:

Github user gyfora commented on the pull request:

    As for the context, I did not mean to add it to the Checkpointed interface, but to the
state backend method calls.
    Another, simpler solution for now would be to make the environment or the runtime context
accessible from the backend, so that it knows which task it is checkpointing for. This is
probably the cleanest solution until we figure out the exact requirements.
    Otherwise :+1: from me.
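
    To illustrate the second suggestion above, here is a minimal sketch of a backend that is
handed some task information before it is used. All names in it (TaskContext, InMemoryBackend,
initialize, checkpoint) are hypothetical and chosen for illustration only; they are not the
Flink API and not what the pull request implements.

```java
import java.util.HashMap;
import java.util.Map;

public class BackendContextSketch {

    /** Hypothetical stand-in for whatever the environment or runtime context would expose. */
    interface TaskContext {
        String taskName();
        int subtaskIndex();
    }

    /** Hypothetical backend that is told which task it serves before any checkpoint call. */
    static class InMemoryBackend {
        private TaskContext context;
        private final Map<String, byte[]> checkpoints = new HashMap<>();

        /** Makes the task information accessible from inside the backend. */
        void initialize(TaskContext context) {
            this.context = context;
        }

        /** Stores a copy of the state under a key derived from the task and the checkpoint id. */
        String checkpoint(long checkpointId, byte[] state) {
            if (context == null) {
                throw new IllegalStateException("backend not initialized");
            }
            String key = context.taskName() + "-" + context.subtaskIndex() + "/chk-" + checkpointId;
            checkpoints.put(key, state.clone());
            return key;
        }

        /** Returns the bytes stored under the given key, or null if there are none. */
        byte[] restore(String key) {
            return checkpoints.get(key);
        }
    }

    public static void main(String[] args) {
        InMemoryBackend backend = new InMemoryBackend();
        backend.initialize(new TaskContext() {
            public String taskName() { return "map"; }
            public int subtaskIndex() { return 3; }
        });
        String key = backend.checkpoint(7L, new byte[] {1, 2});
        System.out.println(key); // prints "map-3/chk-7"
    }
}
```

    With this shape the checkpoint method signatures stay unchanged, and the backend can still
tell which task it is checkpointing for.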

> Rework / Extend the StatehandleProvider
> ---------------------------------------
>                 Key: FLINK-2808
>                 URL: https://issues.apache.org/jira/browse/FLINK-2808
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 0.10
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 0.10
> I would like to make some changes (mostly additions) to the {{StateHandleProvider}}.
Ideally for the upcoming release, as it is somewhat part of the public API.
> The rationale behind this is to handle, in a clean and extensible way, the creation of key/value
state backed by various implementations (FS, distributed KV store, local KV store with FS
backup, ...) and various checkpointing strategies (full dump, append, incremental keys, ...)
> The changes would concretely be:
> 1.  There should be a default {{StateHandleProvider}} set on the execution environment.
Functions can later specify the {{StateHandleProvider}} when grabbing the {{StreamOperatorState}}
from the runtime context (plus optionally a {{Checkpointer}})
> 2.  The {{StreamOperatorState}} is created from the {{StateHandleProvider}}. That way,
a KeyValueStore state backend can create a {{StreamOperatorState}} that directly updates data
in the KV store on every access, if that is desired (and filter accesses by timestamps to
only show committed data)
> 3.  The {{StateHandleProvider}} should have methods to get an output stream that writes to
the state checkpoint directly (and returns a {{StateHandle}} upon closing). That way we can convert
and dump large state into the checkpoint without first creating a full copy in memory.
> Lastly, I would like to change some names
>   - {{StateHandleProvider}} to either {{StateBackend}}, {{StateStore}}, or {{StateProvider}}
(simpler name).
>   - {{StreamOperatorState}} to either {{State}} or {{KVState}}.
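
Point 3 of the description above (an output stream that writes to the checkpoint directly and
returns a handle when closed) could look roughly like the following sketch. It is only an
illustration under stated assumptions: the names Handle, CheckpointOutputStream,
FileCheckpointStream and closeAndGetHandle are hypothetical and are not taken from Flink, and
a local temporary file stands in for the real checkpoint storage.

```java
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CheckpointStreamSketch {

    /** Hypothetical handle: a small object that can reopen the written checkpoint data. */
    interface Handle {
        InputStream openStream() throws IOException;
    }

    /** Hypothetical stream type whose close operation yields a handle to the written data. */
    static abstract class CheckpointOutputStream extends OutputStream {
        abstract Handle closeAndGetHandle() throws IOException;
    }

    /** Writes straight to a file, so the state is never buffered in memory as a whole. */
    static class FileCheckpointStream extends CheckpointOutputStream {
        private final Path path;
        private final OutputStream out;

        FileCheckpointStream(Path path) throws IOException {
            this.path = path;
            this.out = new BufferedOutputStream(Files.newOutputStream(path));
        }

        @Override
        public void write(int b) throws IOException {
            out.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
        }

        @Override
        public void close() throws IOException {
            out.close();
        }

        @Override
        Handle closeAndGetHandle() throws IOException {
            close(); // flush and release the file before handing out the handle
            return () -> Files.newInputStream(path);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("checkpoint", ".bin");
        FileCheckpointStream stream = new FileCheckpointStream(file);
        // Entries are converted and written one at a time instead of being collected first.
        for (String entry : new String[] {"a=1", "b=2", "c=3"}) {
            stream.write((entry + "\n").getBytes(StandardCharsets.UTF_8));
        }
        Handle handle = stream.closeAndGetHandle();

        int lines = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(handle.openStream(), StandardCharsets.UTF_8))) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        System.out.println(lines); // prints 3
        Files.delete(file);
    }
}
```

Returning the handle from the close call means a handle can only exist for data that has been
completely written, which is one plausible reading of "returns a StateHandle upon closing".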

This message was sent by Atlassian JIRA
