flink-issues mailing list archives

From "Zhu Zhu (Jira)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-15499) No debug log describes the host of a TM before any task is deployed to it in YARN mode
Date Tue, 07 Jan 2020 11:12:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-15499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu updated FLINK-15499:
----------------------------
    Description: 
When troubleshooting FLINK-15456, I noticed a TM that hung while starting and was not able to
register with the RM. However, there is no debug log showing which host the TM is located on,
and thus I can hardly find the logs of the problematic TM.
I think we should print the host name when starting a TM, i.e. in this log:
"TaskExecutor container_XXXX will be started ...".
This would make it possible for us to troubleshoot similar problems (not only cases where a
TM hangs while starting, but also cases where a TM exits while starting).

  was:
When troubleshooting FLINK-15456, I noticed a TM that hung while starting and was not able to
register with the RM. However, there is no info showing which host the TM is located on, and
thus we can hardly find the logs of the problematic TM.
I think we should print the host name when starting a TM, i.e. in this log:
"TaskExecutor container_XXXX will be started ...".
This would make it possible for us to troubleshoot similar problems (not only cases where a
TM hangs while starting, but also cases where a TM exits while starting).


> No debug log describes the host of a TM before any task is deployed to it in YARN mode
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-15499
>                 URL: https://issues.apache.org/jira/browse/FLINK-15499
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / YARN
>    Affects Versions: 1.10.0
>            Reporter: Zhu Zhu
>            Priority: Major
>
> When troubleshooting FLINK-15456, I noticed a TM that hung while starting and was not able to
> register with the RM. However, there is no debug log showing which host the TM is located on,
> and thus I can hardly find the logs of the problematic TM.
> I think we should print the host name when starting a TM, i.e. in this log:
> "TaskExecutor container_XXXX will be started ...".
> This would make it possible for us to troubleshoot similar problems (not only cases where a
> TM hangs while starting, but also cases where a TM exits while starting).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
