flink-user mailing list archives

From Chesnay Schepler <ches...@apache.org>
Subject Re: Flink metrics related problems/questions
Date Fri, 19 May 2017 09:21:56 GMT
1. This shouldn't happen. Do you access the counter from different threads?

2. Metrics in general are not persisted across restarts, and there is 
currently no way to configure Flink to do so.

3. Counters are sent as gauges because, as far as I know, StatsD counters 
are not allowed to be decremented.
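
To make the tcpdump observation in point 3 concrete, here is a minimal sketch of the StatsD plain-text line format, `<name>:<value>|<type>`. It is not Flink's reporter code; the class name, the helper methods, and the metric name `numActiveTimers` are all hypothetical. A `|g` line carries the current absolute value, while a `|c` line carries a delta that the StatsD server aggregates, which is why reporting a counter's current count fits the gauge type.

```java
import java.util.Locale;

// Illustrative only: formats StatsD plain-text lines. Names are hypothetical.
class StatsDLines {

    // Gauge line: the value is the metric's current absolute reading.
    static String gaugeLine(String name, long value) {
        return String.format(Locale.ROOT, "%s:%d|g", name, value);
    }

    // Counter line: the value is a delta applied on the StatsD server side.
    static String counterLine(String name, long delta) {
        return String.format(Locale.ROOT, "%s:%d|c", name, delta);
    }

    public static void main(String[] args) {
        long count = 0;
        count += 3; // e.g. three inc() calls on the application side
        count -= 1; // e.g. one dec() call on the application side

        // Reporting the current absolute count as a gauge:
        System.out.println(gaugeLine("numActiveTimers", count));
        // What a delta-style counter line would look like instead:
        System.out.println(counterLine("numActiveTimers", 3));
    }
}
```

When the count is reported as a gauge, the dashboard shows the absolute value at each report interval, so rates have to be derived on the monitoring side.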

On 19.05.2017 08:56, jaxbihani wrote:
> Background: We run a job built on ProcessFunction that reads data from
> Kafka, fires ~5-10K timers per second, and sends matched events to a KafkaSink.
> We collect metrics such as the number of active timers and the number of
> timers scheduled. We use the StatsD reporter, monitor with a Grafana
> dashboard, and use RocksDBStateBackend backed by HDFS for state.
>
> Observations/Problems:
> 1. *Counter value suddenly got reset:* While the job was running fine, the
> metric of a monotonically increasing counter (a Counter on which we only
> call inc()) suddenly became 0 and then resumed counting from there. The
> only exceptions in the logs were related to transient connectivity issues
> to datanodes. Inspecting system metrics and checkpoint metrics showed no
> other indicator of a failure. It happened just once across multiple runs
> of the same job.
> 2. *Counters not retained during Flink restart with savepoint*: I cancelled
> the job with the -s option to take a savepoint and then restarted the job
> from that savepoint. After the restart, metrics started from 0. I was
> expecting the metric value of a given operator to also be part of state.
> 3. *Counter metrics getting sent as Gauge*: Using tcpdump, I inspected the
> format in which metrics are sent to StatsD. I observed that even the
> metrics that were counters in my code were sent as gauges. I don't
> understand why.
>
> Can anyone please share some insight into why the behaviors described
> above might have happened?
> Also, does Flink store metric values as part of state for stateful
> operators? Is there any way to configure that?
>
>
>
>
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-metrics-related-problems-questions-tp13218.html
> Sent from the Apache Flink User Mailing List archive at Nabble.com.
>

