flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4141) TaskManager failures not always recover when killed during an ApplicationMaster failure in HA mode on Yarn
Date Fri, 01 Jul 2016 18:13:11 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15359398#comment-15359398 ]

ASF GitHub Bot commented on FLINK-4141:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/2190


> TaskManager failures not always recover when killed during an ApplicationMaster failure in HA mode on Yarn
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-4141
>                 URL: https://issues.apache.org/jira/browse/FLINK-4141
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.0.3
>            Reporter: Stefan Richter
>            Assignee: Maximilian Michels
>             Fix For: 1.1.0
>
>
> High availability on Yarn often fails to recover in the following test scenario:
> 1. Kill application master process.
> 2. Then, while application master is recovering, randomly kill several task managers (with some delay).
> After the application master has recovered, not all of the killed task managers are brought back, and no further attempts are made to restart them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
