flink-issues mailing list archives

From "Till Rohrmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8066) Changed configuration of taskmanagers should recreate them
Date Tue, 14 Nov 2017 11:21:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251258#comment-16251258 ]

Till Rohrmann commented on FLINK-8066:

In FLIP-6 this will work a bit differently, since idle TMs will be released after a timeout.
Thus, changing the configuration should work.

For YARN, I'm not sure whether you can easily redeploy the JM, because it runs inside the
application master. Thus, I think it would be fine to apply the changes to {{MesosFlinkResourceManager#recoverWorkers}}
for the current code.

> Changed configuration of taskmanagers should recreate them
> ----------------------------------------------------------
>                 Key: FLINK-8066
>                 URL: https://issues.apache.org/jira/browse/FLINK-8066
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Stephen Gran
>            Priority: Minor
> When we redeploy the jobmanager to our Mesos cluster with changed parameters affecting
> the taskmanagers (e.g., a change from 1 CPU per TM to 2 CPUs per TM), the existing taskmanagers
> are reused rather than replaced with new taskmanagers that use the new parameters.
> It seems like *recoverWorkers* in *org.apache.flink.mesos.runtime.clusterframework.MesosFlinkResourceManager*
> has most of the information it would need to perform this convergence, and it doesn't
> seem like a large amount of work to do the check.
> My concern with starting to work on the issue there is that there may be a higher level,
> perhaps in *FlinkResourceManager*, that should perform this work on both Mesos and YARN. The
> two implementations look quite different, however, so this may be an overeager optimisation
> best left for later. I'm happy to look at a patch for this, but I wanted some input before
> starting the work to see where you thought this should live.
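
The check described in the issue could look roughly like the following. This is a minimal sketch under stated assumptions, not Flink code: `WorkerConvergenceSketch`, `TmParams`, `RecoveredWorker`, and `workersToReplace` are hypothetical names invented for illustration, and the sketch assumes that the parameters each worker was launched with are available at recovery time. It only shows the comparison step. How stale workers would actually be released and relaunched inside `recoverWorkers` is left open.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are Flink classes. */
final class WorkerConvergenceSketch {

    /** Hypothetical per-TaskManager resource settings. */
    static final class TmParams {
        final double cpus;
        final int memoryMb;

        TmParams(double cpus, int memoryMb) {
            this.cpus = cpus;
            this.memoryMb = memoryMb;
        }

        /** True when both settings are identical. */
        boolean matches(TmParams other) {
            return Double.compare(cpus, other.cpus) == 0 && memoryMb == other.memoryMb;
        }
    }

    /** Hypothetical record of a worker that was persisted before the redeploy. */
    static final class RecoveredWorker {
        final String id;
        final TmParams launchedWith;

        RecoveredWorker(String id, TmParams launchedWith) {
            this.id = id;
            this.launchedWith = launchedWith;
        }
    }

    /**
     * Returns the ids of recovered workers whose launch parameters differ
     * from the current configuration, in the order they were recovered.
     */
    static List<String> workersToReplace(List<RecoveredWorker> recovered, TmParams current) {
        List<String> stale = new ArrayList<>();
        for (RecoveredWorker worker : recovered) {
            if (!worker.launchedWith.matches(current)) {
                stale.add(worker.id);
            }
        }
        return stale;
    }
}
```

With the example from the description (a change from 1 CPU to 2 CPUs per TM), a worker recovered with 1 CPU would be reported as stale, while one already running with 2 CPUs would be kept.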

This message was sent by Atlassian JIRA
