flink-issues mailing list archives

From "Maximilian Michels (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5081) unable to set yarn.maximum-failed-containers with flink one-time YARN setup
Date Mon, 21 Nov 2016 10:06:01 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683082#comment-15683082 ]

Maximilian Michels commented on FLINK-5081:

I've had a second look. The problem is not that the configuration fails to load. Rather, your
finding reveals at least two other issues with our per-job YARN implementation:

1. When executing in non-detached job submission mode, the "Client Shutdown Hook" shuts down
the YARN application in case of job failures (e.g. a TaskManager dies). We should remove the
shutdown hook once deployment has finished. It should only be active during deployment.

2. The per-job YARN application is supposed to shut down the cluster automatically after job
completion. In case of failures (e.g. a TaskManager dies), the shutdown apparently is performed
as well, although it should not be.
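
The idea in point 1 (a hook that is active only while the cluster is being deployed) can be sketched as follows. This is a minimal illustration, not Flink's actual client code: the class `DeploymentShutdownHookSketch`, the `ClusterHandle` type, and the `deployWithHook` method are hypothetical names made up for this example. Only `Runtime.addShutdownHook` and `Runtime.removeShutdownHook` are real JDK calls.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class DeploymentShutdownHookSketch {

    /** Hypothetical stand-in for the client's handle on the YARN application. */
    static class ClusterHandle {
        final AtomicBoolean killed = new AtomicBoolean(false);

        void kill() {
            killed.set(true);
        }
    }

    /**
     * Runs the deployment step with a shutdown hook installed, then removes
     * the hook so that later job failures cannot trigger it.
     * Returns true if the hook was registered and has been removed again.
     */
    static boolean deployWithHook(ClusterHandle cluster, Runnable deployment) {
        // The hook kills the application if the client JVM exits mid-deployment.
        Thread hook = new Thread(cluster::kill, "Client Shutdown Hook");
        Runtime.getRuntime().addShutdownHook(hook);
        boolean removed = false;
        try {
            deployment.run();
        } finally {
            try {
                // Deployment is over: deregister the hook.
                removed = Runtime.getRuntime().removeShutdownHook(hook);
            } catch (IllegalStateException e) {
                // The JVM is already shutting down; the hook runs on its own.
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        ClusterHandle cluster = new ClusterHandle();
        boolean removed = deployWithHook(cluster, () -> System.out.println("deploying..."));
        System.out.println("hook removed: " + removed);
        System.out.println("cluster killed: " + cluster.killed.get());
    }
}
```

With this shape, a TaskManager failure after deployment would no longer reach the hook, so container recovery could proceed, assuming nothing else shuts the application down.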

> unable to set yarn.maximum-failed-containers with flink one-time YARN setup
> ---------------------------------------------------------------------------
>                 Key: FLINK-5081
>                 URL: https://issues.apache.org/jira/browse/FLINK-5081
>             Project: Flink
>          Issue Type: Bug
>          Components: Startup Shell Scripts
>    Affects Versions: 1.1.4
>            Reporter: Nico Kruber
> When letting Flink set up YARN for a one-time job, it apparently does not deliver the
> {{yarn.maximum-failed-containers}} parameter to YARN as the {{yarn-session.sh}} script does.
> Adding it to conf/flink-conf.yaml, as
> https://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html#recovery-behavior-of-flink-on-yarn
> suggests, also does not work.
> example:
> {code:none}
> flink run -m yarn-cluster -yn 3 -yjm 1024 -ytm 4096 <job>.jar --parallelism 3 -Dyarn.maximum-failed-containers=100
> {code}
