flink-issues mailing list archives

From "Congxian Qiu(klion26) (Jira)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-14340) Specify an unique DFSClient name for Hadoop FileSystem
Date Tue, 08 Oct 2019 04:20:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946472#comment-16946472

Congxian Qiu(klion26) commented on FLINK-14340:

[~sewen] since you introduced the HadoopFsFactory, what do you think about this issue? If it
is valid, could you please assign this ticket to me?

cc [~wangyang]

> Specify an unique DFSClient name for Hadoop FileSystem
> ------------------------------------------------------
>                 Key: FLINK-14340
>                 URL: https://issues.apache.org/jira/browse/FLINK-14340
>             Project: Flink
>          Issue Type: Improvement
>          Components: FileSystems
>            Reporter: Congxian Qiu(klion26)
>            Priority: Major
>             Fix For: 1.10.0
> Currently, when Flink reads from or writes to HDFS, it does not set the DFSClient name for
its connections, so we cannot tell the connections apart or quickly find the specific Job
or TM that owns a given connection.
> This issue proposes using the {{container_id}} as a unique name when initializing the Hadoop
FileSystem, so we can easily tell which Job/TM each connection belongs to.
> The core change is to add a line such as the one below in {{org.apache.flink.runtime.fs.hdfs.HadoopFsFactory#create}}:
> {code:java}
> hadoopConfig.set("mapreduce.task.attempt.id", System.getenv().getOrDefault(CONTAINER_KEY_IN_ENV,
> Currently, {{YarnResourceManager}} and {{MesosResourceManager}} both have an environment
key {{ENV_FLINK_CONTAINER_ID = "_FLINK_CONTAINER_ID"}}, so maybe we should introduce this
key in {{StandaloneResourceManager}} as well.
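
The proposal above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Flink change: a plain Map stands in for the Hadoop configuration object so the snippet runs without Hadoop on the classpath, and the class name, the method names, and the fallback value DEFAULT_CONTAINER_ID are hypothetical. Only the property key and the environment key are taken from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; it is not part of Flink.
final class DfsClientNameSketch {

    // Environment key quoted in the issue for the YARN and Mesos resource managers.
    static final String ENV_FLINK_CONTAINER_ID = "_FLINK_CONTAINER_ID";

    // Assumed fallback for deployments that do not export the key (e.g. standalone).
    static final String DEFAULT_CONTAINER_ID = "unknown_container";

    // Property key named in the issue.
    static final String TASK_ATTEMPT_ID_KEY = "mapreduce.task.attempt.id";

    // Returns the container id from the given environment, or the fallback if absent or empty.
    static String resolveContainerId(Map<String, String> env) {
        String id = env.get(ENV_FLINK_CONTAINER_ID);
        return (id == null || id.isEmpty()) ? DEFAULT_CONTAINER_ID : id;
    }

    // Writes the resolved id into the (stand-in) Hadoop configuration.
    static void tagConfig(Map<String, String> hadoopConfig, Map<String, String> env) {
        hadoopConfig.put(TASK_ATTEMPT_ID_KEY, resolveContainerId(env));
    }

    public static void main(String[] args) {
        Map<String, String> hadoopConfig = new HashMap<>();
        tagConfig(hadoopConfig, System.getenv());
        System.out.println(TASK_ATTEMPT_ID_KEY + " = " + hadoopConfig.get(TASK_ATTEMPT_ID_KEY));
    }
}
```

Passing the environment in as a parameter, rather than calling System.getenv() inside the helper, keeps the lookup easy to exercise in a unit test with a hand-built map.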
