flink-issues mailing list archives

From "Elias Levy (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-8358) Hostname used by DataDog metric reporter is not configurable
Date Thu, 04 Jan 2018 00:26:00 GMT
Elias Levy created FLINK-8358:
---------------------------------

             Summary: Hostname used by DataDog metric reporter is not configurable
                 Key: FLINK-8358
                 URL: https://issues.apache.org/jira/browse/FLINK-8358
             Project: Flink
          Issue Type: Bug
          Components: Metrics
    Affects Versions: 1.4.0
            Reporter: Elias Levy


The hostname used by the DataDog metric reporter to report metrics is not configurable.  This
can be problematic if the hostname that Flink uses is different from the hostname used by the
system's DataDog agent.

For instance, in our environment we use Chef, and the DataDog Chef Handler associates certain
metadata, such as host roles, with the hostname in the DataDog service.  The hostname used to
submit this metadata is the name we have given the host.  But Flink picks up the default name
given by EC2 to the instance, so metrics that Flink submits to DataDog under the EC2 name are
not associated with the tags derived from Chef.

In the Job Manager we can avoid this issue by explicitly setting the config {{jobmanager.rpc.address}}
to the hostname we desire.  I attempted to do the same on the Task Manager by setting the
{{taskmanager.hostname}} config, but the DataDog reporter does not seem to pick up that value.

Digging through the code, it seems the DD metric reporter gets the hostname from the {{TaskManagerMetricGroup}}
host variable, which seems to be set from {{taskManagerLocation.getHostname}}.  That in turn
seems to be set by calling {{this.inetAddress.getCanonicalHostName()}}, which merely performs a
reverse lookup on the IP address, and then calling {{NetUtils.getHostnameFromFQDN}} on the
result.  The latter is further problematic because its result is a non-fully qualified hostname.
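
To make the lookup chain above concrete, here is a minimal, self-contained Java sketch.  It
is not Flink's code: {{HostnameSketch}}, {{shortName}} and {{resolveHostname}} are hypothetical
names, {{shortName}} only approximates what {{NetUtils.getHostnameFromFQDN}} appears to do
(keep the text before the first dot), and the "configured name wins" branch is an assumption
about what a fix might look like, not existing behavior.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostnameSketch {

    // Hypothetical stand-in for NetUtils.getHostnameFromFQDN:
    // keeps only the label before the first dot, so the result is not fully qualified.
    static String shortName(String fqdn) {
        int dot = fqdn.indexOf('.');
        return dot == -1 ? fqdn : fqdn.substring(0, dot);
    }

    // Hypothetical resolution order: an explicitly configured name (assumed to come
    // from something like taskmanager.hostname) wins; otherwise fall back to the
    // reverse-lookup behavior described in this issue.
    static String resolveHostname(String configured, InetAddress address) {
        if (configured != null && !configured.isEmpty()) {
            return configured;
        }
        // Reverse lookup on the IP address; the JDK returns the textual IP
        // when no name can be resolved.
        String canonical = address.getCanonicalHostName();
        if (canonical.equals(address.getHostAddress())) {
            return canonical; // do not truncate a bare IP address at its first dot
        }
        return shortName(canonical);
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        // What the reverse lookup alone yields on this machine.
        System.out.println("derived:    " + resolveHostname(null, local));
        // What an explicit setting would yield (example value, not a real host).
        System.out.println("configured: " + resolveHostname("flink-tm-1.example.com", local));
    }
}
```

On an EC2 instance the first line would typically print the EC2-assigned name rather than the
name the DataDog agent reports, which is the mismatch described above.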

More generally, there seems to be a need for a way to specify the hostname of a JM or TM node
that can be reused across Flink components.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
