flink-issues mailing list archives

From "Stefan Richter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4142) Recovery problem in HA on Hadoop Yarn 2.4.1
Date Fri, 01 Jul 2016 14:47:10 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15359073#comment-15359073 ]

Stefan Richter commented on FLINK-4142:
---------------------------------------

I have a log for the problem here: https://storage.googleapis.com/srichter/task_mgr_restart_endless.log

> Recovery problem in HA on Hadoop Yarn 2.4.1
> -------------------------------------------
>
>                 Key: FLINK-4142
>                 URL: https://issues.apache.org/jira/browse/FLINK-4142
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN Client
>    Affects Versions: 1.0.3
>            Reporter: Stefan Richter
>
> On Hadoop Yarn 2.4.1, recovery in HA fails in the following scenario:
> 1) Kill the application master and let it recover normally.
> 2) After that, kill a task manager.
> Yarn then tries to restart the killed task manager in an endless loop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)