flink-issues mailing list archives

From "Tom Goong (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-11742) Push metrics to Pushgateway without "instance"
Date Sun, 03 Mar 2019 05:03:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom Goong updated FLINK-11742:
------------------------------
    Description: 
According to the official documentation:

[https://prometheus.io/docs/concepts/jobs_instances/]

[https://github.com/prometheus/pushgateway]

when pushing metrics to the Prometheus Pushgateway, you need to supply an "instance" label.
 In practice, when no "instance" label is set, Prometheus stores the metrics incorrectly: the
series are not continuous and a lot of data is lost. After adding "instance", the series return to normal.

 

Without "instance":

!image-2019-02-25-17-16-28-618.png!

 

With "instance":

!image-2019-02-25-17-16-59-034.png!
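
For illustration, the Pushgateway accepts pushes at a path of the form /metrics/job/<job>{/<label>/<value>}, so the "instance" label is supplied as part of the grouping key in the URL. The following is a minimal Python sketch of building that path, not the Flink reporter's actual code. The helper names and the job and instance values are hypothetical, and the sketch assumes label values are non-empty and contain no "/" (it rejects such values rather than encoding them).

```python
from typing import Mapping, Optional
from urllib.parse import quote


def _segment(value: str) -> str:
    """Percent-encode one path segment; reject values this sketch does not handle."""
    if not value or "/" in value:
        raise ValueError(f"unsupported value for this sketch: {value!r}")
    return quote(value, safe="")


def pushgateway_path(job: str, grouping: Optional[Mapping[str, str]] = None) -> str:
    """Build the push path /metrics/job/<job>{/<label>/<value>}.

    `grouping` holds the extra grouping-key labels, e.g. {"instance": "taskmanager-1"}.
    """
    parts = ["metrics", "job", _segment(job)]
    for name, value in (grouping or {}).items():
        parts += [name, _segment(value)]
    return "/" + "/".join(parts)
```

With the hypothetical values used here, pushgateway_path("my_flink_job") addresses one group shared by every process of the job, while pushgateway_path("my_flink_job", {"instance": "taskmanager-1"}) addresses a group specific to that process.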

 

 
{quote}In Prometheus terms, an endpoint you can scrape is called an instance, usually corresponding
to a single process. A collection of instances with the same purpose, a process replicated
for scalability or reliability for example, is called a job.
{quote}
{quote}For example, an API server job with four replicated instances:
job: api-server
-- instance 1: 1.2.3.4:5670
-- instance 2: 1.2.3.4:5671
-- instance 3: 5.6.7.8:5670
-- instance 4: 5.6.7.8:5671
{quote}
[https://prometheus.io/docs/concepts/jobs_instances/#jobs-and-instances]

I think a Flink job corresponds to a Prometheus job, and each TaskManager and JobManager corresponds
to a different instance. If the jobName is used as the instance label, identical metrics from
different TaskManagers will conflict, and aggregations such as sum will fail.
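
The conflict described above can be sketched with a small in-memory model. This is an illustration under stated assumptions, not the real Pushgateway: FakePushgateway is a hypothetical stand-in that only models the rule that a PUT-style push replaces the metrics stored under the same grouping key. The job name, instance names, and sample values are made up for the example.

```python
from typing import Dict, Mapping, Optional, Tuple

GroupKey = Tuple[str, Tuple[Tuple[str, str], ...]]


class FakePushgateway:
    """Hypothetical in-memory stand-in: one metric set per grouping key."""

    def __init__(self) -> None:
        self.groups: Dict[GroupKey, Dict[str, float]] = {}

    def push(self, job: str, metrics: Mapping[str, float],
             grouping: Optional[Mapping[str, str]] = None) -> None:
        key = (job, tuple(sorted((grouping or {}).items())))
        # A push replaces whatever the group held before.
        self.groups[key] = dict(metrics)

    def total(self, name: str) -> float:
        """Sum one metric over all groups, like sum() over the scraped series."""
        return sum(m[name] for m in self.groups.values() if name in m)


def compare() -> Tuple[float, float]:
    """Return (sum without instance labels, sum with instance labels)."""
    shared = FakePushgateway()
    shared.push("my_flink_job", {"numRecordsIn": 100.0})  # first TaskManager
    shared.push("my_flink_job", {"numRecordsIn": 250.0})  # second one overwrites it

    separate = FakePushgateway()
    separate.push("my_flink_job", {"numRecordsIn": 100.0}, {"instance": "taskmanager-1"})
    separate.push("my_flink_job", {"numRecordsIn": 250.0}, {"instance": "taskmanager-2"})
    return shared.total("numRecordsIn"), separate.total("numRecordsIn")
```

In this model compare() returns (250.0, 350.0): without distinct instance labels the first TaskManager's sample is overwritten and the sum is too low, while distinct instance labels keep both samples.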


> Push metrics to Pushgateway without "instance"
> ----------------------------------------------
>
>                 Key: FLINK-11742
>                 URL: https://issues.apache.org/jira/browse/FLINK-11742
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Metrics
>            Reporter: Tom Goong
>            Assignee: Tom Goong
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2019-02-25-17-16-28-618.png, image-2019-02-25-17-16-59-034.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
