flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1908) JobManager startup delay isn't considered when using start-cluster.sh script
Date Tue, 21 Apr 2015 19:44:59 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505600#comment-14505600 ]

ASF GitHub Bot commented on FLINK-1908:

Github user StephanEwen commented on the pull request:

    Forwarding comments from JIRA:
    I think @DarkKnightCZ is using version 0.8.x and Till Rohrmann is talking about 0.9.
    The startup is handled very differently in 0.9, which should actually fix the issue. The
selection of the communication interface runs in a backoff loop and should keep retrying for many minutes
before the TaskManager falls back to heuristics.
    I don't think that this issue will be fixed in 0.8.x.
    @DarkKnightCZ Can you verify whether 0.9 works for you?
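
    For readers unfamiliar with the pattern, the backoff loop described above can be sketched
roughly as follows. This is not Flink's actual code: select_with_backoff, try_select, fallback and
all of the timing parameters are hypothetical names and values chosen for illustration only.

```python
import time


def select_with_backoff(try_select, fallback, max_wait=300.0,
                        initial_delay=0.5, max_delay=30.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Retry try_select() with exponentially growing pauses (hypothetical helper).

    try_select() returns a value on success or None on failure. If no attempt
    succeeds within max_wait seconds, the result of fallback() is returned.
    """
    deadline = clock() + max_wait
    delay = initial_delay
    while True:
        result = try_select()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            # Time budget exhausted: give up retrying and use the heuristic.
            return fallback()
        # Never sleep past the deadline; double the pause up to max_delay.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

    The sleep and clock arguments are injectable only so that the loop can be exercised without
real waiting. With a long max_wait, a JobManager that comes up late is still found before the
fallback is used.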

> JobManager startup delay isn't considered when using start-cluster.sh script
> ----------------------------------------------------------------------------
>                 Key: FLINK-1908
>                 URL: https://issues.apache.org/jira/browse/FLINK-1908
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Runtime
>    Affects Versions: 0.9, 0.8.1
>         Environment: Linux
>            Reporter: Lukas Raska
>            Priority: Minor
>   Original Estimate: 5m
>  Remaining Estimate: 5m
> When starting a Flink cluster via the start-cluster.sh script, JobManager startup can be delayed
(as it is started asynchronously), which can cause several task managers to fail on startup.
> A solution is to wait a certain amount of time, periodically checking whether the RPC port is accessible,
and then proceed with starting the task managers.
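
The wait-and-check approach proposed in the issue description could look roughly like the sketch
below. It is an illustration only, not the patch from the pull request: wait_for_port and its
timeout and interval defaults are hypothetical, and the host and port would have to come from the
cluster configuration (for example the value of jobmanager.rpc.port).

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds or timeout expires.

    Returns True if the port became reachable in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable or timed out: treat as "not up yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A startup script would launch the task managers only once this returns True, and report an error
if it returns False. Note that an open port only shows that a process is listening, not that the
JobManager has finished initializing.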

This message was sent by Atlassian JIRA
