flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5073) ZooKeeperCompleteCheckpointStore executes blocking delete operation in ZooKeeper client thread
Date Wed, 16 Nov 2016 08:07:59 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669777#comment-15669777 ]

ASF GitHub Bot commented on FLINK-5073:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/2816

    [backport] [FLINK-5073] Use Executor to run ZooKeeper callbacks in ZooKeeperStateHandleStore

    Backport of #2815 for the release-1.1 branch.
    
    Use a dedicated Executor to run ZooKeeper callbacks in ZooKeeperStateHandleStore instead
    of running them in the ZooKeeper client's thread. A callback can block because it
    discards state, which might entail deleting files from disk.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink backportFixZooKeeperDelete

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2816.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2816
    
----
commit ae67fe9a3bbc768911b8eab8dc32d18c2cb10c1a
Author: Till Rohrmann <trohrmann@apache.org>
Date:   2016-11-15T21:45:04Z

    [FLINK-5073] Use Executor to run ZooKeeper callbacks in ZooKeeperStateHandleStore
    
    Use a dedicated Executor to run ZooKeeper callbacks in ZooKeeperStateHandleStore instead
    of running them in the ZooKeeper client's thread. A callback can block because it
    discards state, which might entail deleting files from disk.
    
    Add TestExecutors

----


> ZooKeeperCompleteCheckpointStore executes blocking delete operation in ZooKeeper client thread
> ----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-5073
>                 URL: https://issues.apache.org/jira/browse/FLINK-5073
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> When deleting completed checkpoints from the {{ZooKeeperCompletedCheckpointStore}}, one
> first tries to delete the meta state handle from ZooKeeper and then deletes the actual checkpoint
> in a callback from the delete operation. This callback is executed by the ZooKeeper client's
> main thread, which is problematic because it blocks the ZooKeeper client. If a delete operation
> takes longer than it takes to complete a checkpoint, delete operations for outdated checkpoints
> can pile up, because they are effectively executed sequentially.
> I propose to execute the delete operations in a dedicated {{Executor}} so that the
> client's main thread stays free to do ZooKeeper-related work.
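
The proposal in the issue description can be illustrated with a minimal, self-contained sketch. It is not the code from the pull request: it uses only java.util.concurrent, and DeleteCallback, offload, discardThreads, the thread names and the node paths are all hypothetical names chosen for illustration. A single-threaded executor stands in for the ZooKeeper client's main thread, and the sketch records which thread ends up running the (potentially blocking) discard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class CallbackOffloadSketch {

    /** Hypothetical stand-in for the callback a ZooKeeper client fires after a node delete. */
    interface DeleteCallback {
        void onDeleted(String path);
    }

    /** Wraps a callback so that its body runs on the given executor, not on the calling thread. */
    static DeleteCallback offload(DeleteCallback delegate, Executor executor) {
        return path -> executor.execute(() -> delegate.onDeleted(path));
    }

    /**
     * Simulates a client thread delivering {@code deletes} notifications and returns
     * the names of the threads on which the discard work actually ran.
     */
    static List<String> discardThreads(boolean useExecutor, int deletes) throws InterruptedException {
        // Assumption: one thread models the ZooKeeper client's main thread.
        ExecutorService clientThread = Executors.newSingleThreadExecutor(r -> new Thread(r, "zk-client"));
        // Assumption: a small pool models the dedicated Executor from the proposal.
        ExecutorService discardPool = Executors.newFixedThreadPool(2, r -> new Thread(r, "discard-worker"));
        Queue<String> seen = new ConcurrentLinkedQueue<>();
        CountDownLatch done = new CountDownLatch(deletes);

        // Stands in for discarding checkpoint state, e.g. deleting files from disk.
        DeleteCallback discard = path -> {
            seen.add(Thread.currentThread().getName());
            done.countDown();
        };
        DeleteCallback callback = useExecutor ? offload(discard, discardPool) : discard;

        try {
            for (int i = 0; i < deletes; i++) {
                String path = "/checkpoints/" + i; // hypothetical node path
                clientThread.execute(() -> callback.onDeleted(path));
            }
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("discard callbacks did not finish in time");
            }
            return new ArrayList<>(seen);
        } finally {
            clientThread.shutdown();
            discardPool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("inline:    " + discardThreads(false, 3));
        System.out.println("offloaded: " + discardThreads(true, 3));
    }
}
```

Without the wrapper, every discard runs on the simulated client thread, so a slow discard delays all later notifications. With the wrapper, the client thread only enqueues a task and returns immediately. How the Executor is sized and who shuts it down are design choices this sketch leaves open.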



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
