flink-issues mailing list archives

From "miki haiat (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.
Date Sun, 02 Dec 2018 12:42:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16706252#comment-16706252 ]

miki haiat commented on FLINK-10775:
------------------------------------

I had this issue as well on 1.4.x.

I can confirm that on 1.5.5 and 1.6.x this issue no longer exists.

> Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or
has not been restarted. Keeping it quarantined.
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-10775
>                 URL: https://issues.apache.org/jira/browse/FLINK-10775
>             Project: Flink
>          Issue Type: Bug
>          Components: ResourceManager
>    Affects Versions: 1.4.2
>         Environment: k8s+docker 
> standalone (1 jobmanager + 5 taskmanagers)
> taskmanager.slotnum=4
>            Reporter: ChuanHaiTan
>            Priority: Blocker
>              Labels: k8s+docker, usability
>         Attachments: logs-from-flink-jobmanager-in-flink-jobmanager-65c8d85f4f-5fm2d.txt,
logs-from-flink-taskmanager-in-flink-taskmanager-758575577d-7lw82.txt, logs-from-flink-taskmanager-in-flink-taskmanager-758575577d-qbj9g.txt,
微信图片_20181031171312.png, 微信图片_20181031171316.png
>
>
> In a k8s+docker environment, 1 jobmanager container and 5 taskmanager containers run
as a standalone cluster.
> {color:#FF0000}For some reason, the jobmanager was rebooted and two of the five taskmanagers
were rebooted as well. Two of the remaining three taskmanagers did not reconnect to the
jobmanager, which left insufficient slot resources, and errors were reported.{color}
> The attachments are the jobmanager log, the logs of the two disconnected taskmanagers, and
screenshots from Flink at the time showing all available and unavailable taskmanagers.
> It is strange that the two rebooted taskmanagers can connect to the jobmanager, while only
one of the three taskmanagers that were not rebooted can connect.
> Why? Can the cause of the restart be analyzed from the logs? Thank you.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
