flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5040) Set correct input channel types with eager scheduling
Date Thu, 10 Nov 2016 16:33:58 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15654482#comment-15654482 ]

ASF GitHub Bot commented on FLINK-5040:
---------------------------------------

GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/2783

    [FLINK-5040] [jobmanager] Set correct input channel types with eager scheduling

    With eager deployment, all intermediate stream/partition locations are already known
when an intermediate stream/partition consumer is scheduled. Nonetheless, we saw tasks with
"unknown input channels" that were updated lazily at runtime. The cause was an incorrect
producer execution state check, which required the producers to be in the RUNNING or DEPLOYING
state when the consumer input channels were created. The 2nd commit changes this check.
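
    To illustrate the problem, here is a minimal sketch in Java. It is not the code from
this pull request: the class, enum, and method names are hypothetical, and the set of states
accepted by the relaxed check is an assumption, since this message does not list them. It only
shows how a check limited to RUNNING or DEPLOYING yields an "unknown" channel for a producer
whose location is already known.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch, not Flink's actual classes or the actual patch.
final class InputChannelTypeSketch {

    enum ProducerState { CREATED, SCHEDULED, DEPLOYING, RUNNING, FINISHED }

    enum ChannelType { LOCAL, REMOTE, UNKNOWN }

    // The check described as wrong: only RUNNING or DEPLOYING producers are accepted.
    static final Set<ProducerState> STRICT =
            EnumSet.of(ProducerState.DEPLOYING, ProducerState.RUNNING);

    // Assumption: a relaxed check that also accepts producers that are merely scheduled
    // (or already finished), because eager deployment knows their location anyway.
    static final Set<ProducerState> RELAXED =
            EnumSet.of(ProducerState.SCHEDULED, ProducerState.DEPLOYING,
                       ProducerState.RUNNING, ProducerState.FINISHED);

    // Picks the channel type for a consumer. A null producerHost means the
    // producer location has not been assigned yet.
    static ChannelType channelType(ProducerState state, String producerHost,
                                   String consumerHost, Set<ProducerState> accepted) {
        if (producerHost == null || !accepted.contains(state)) {
            return ChannelType.UNKNOWN; // must be updated lazily at runtime
        }
        return producerHost.equals(consumerHost) ? ChannelType.LOCAL : ChannelType.REMOTE;
    }

    public static void main(String[] args) {
        // Same producer, same known location, different checks.
        System.out.println(channelType(ProducerState.SCHEDULED, "host-a", "host-b", STRICT));
        System.out.println(channelType(ProducerState.SCHEDULED, "host-a", "host-b", RELAXED));
    }
}
```

    With the strict set, the scheduled producer on "host-a" produces an UNKNOWN channel even
though its location is known; with the relaxed set, the same call yields REMOTE.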
    
    The 1st commit reverts a bogus fix made as part of FLINK-3232. That "fix" did not actually
fix anything; instead, it doubled the number of schedule or update consumer messages
we sent.
    
    Furthermore, the 3rd commit sets the initial and maximum partition request backoff to 100 ms
and 10 s, respectively. These values were hard-coded before. As a safety net for very slow
deployments, they can now be changed via the configuration. No user should need to change
them in practice.
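
    As a rough illustration of what these two values bound, the following Java sketch models
a retry backoff that starts at the initial value and is capped at the maximum. The class name
is hypothetical, and the doubling policy is an assumption; this message only states the
initial and maximum values, not how the backoff grows between them.

```java
// Hypothetical sketch of a capped backoff; not Flink's actual implementation.
final class PartitionRequestBackoff {

    private final int initialMs;
    private final int maxMs;
    private int currentMs; // 0 means no retry has been triggered yet

    PartitionRequestBackoff(int initialMs, int maxMs) {
        if (initialMs <= 0 || maxMs < initialMs) {
            throw new IllegalArgumentException("need 0 < initial <= max");
        }
        this.initialMs = initialMs;
        this.maxMs = maxMs;
    }

    // Advances to the next backoff. Returns false once the maximum has
    // already been used, meaning the caller should give up.
    boolean increase() {
        if (currentMs == 0) {
            currentMs = initialMs;
            return true;
        }
        if (currentMs < maxMs) {
            // Assumed policy: double the wait, but never exceed the maximum.
            currentMs = (int) Math.min(2L * currentMs, maxMs);
            return true;
        }
        return false;
    }

    int current() {
        return currentMs;
    }

    public static void main(String[] args) {
        PartitionRequestBackoff backoff = new PartitionRequestBackoff(100, 10_000);
        while (backoff.increase()) {
            System.out.println("retry after " + backoff.current() + " ms");
        }
    }
}
```

    Under the assumed doubling policy, the values 100 ms and 10 s allow eight retries
(100, 200, 400, 800, 1600, 3200, 6400, and 10000 ms) before the request is given up.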


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink eager_deployment

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2783.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2783
    
----
commit bbbe8e9c19eb528e3e5d8e046e79298a300af556
Author: Ufuk Celebi <uce@apache.org>
Date:   2016-11-09T15:07:22Z

    Revert "[FLINK-3232] [runtime] Add option to eagerly deploy channels"
    
    The reverted commit did not really fix anything, but hid the problem by
    brute force, sending many more schedule or update consumers messages.

commit 70088f2acade2f20b8b75e18955f91793f7614c3
Author: Ufuk Celebi <uce@apache.org>
Date:   2016-11-09T17:25:06Z

    [FLINK-5040] [jobmanager] Set correct input channel types with eager scheduling

commit 9d186d9e42007f1144e64c802466befb858b7363
Author: Ufuk Celebi <uce@apache.org>
Date:   2016-11-10T10:15:47Z

    [FLINK-5040] [taskmanager] Adjust partition request backoffs
    
    The back offs were hard coded before, which would have made it
    impossible to react to any potential problems with them.

----


> Set correct input channel types with eager scheduling
> -----------------------------------------------------
>
>                 Key: FLINK-5040
>                 URL: https://issues.apache.org/jira/browse/FLINK-5040
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>             Fix For: 1.2.0, 1.1.4
>
>
> When we do eager deployment all intermediate stream/partition locations are already known
> when scheduling an intermediate stream/partition consumer. Nonetheless we saw tasks with "unknown
> input channels" that were updated lazily during runtime. This was caused by a wrong producer
> execution state check requiring the producers to be in RUNNING or DEPLOYING state when creating
> consumer input channels.
> (We had a bogus fix for this in FLINK-3232. With that "fix" we actually did not fix anything
> correctly and instead doubled the number of schedule or update consumer messages we sent.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
