flink-user mailing list archives

From Dhruv Kumar <gargdhru...@gmail.com>
Subject Re: Measure End-to-End latency/delay for each record
Date Thu, 26 Apr 2018 18:38:48 GMT
What do you mean by the time skew from one machine (source) to another (sink)? Do you mean
that the system clocks of the source and sink may not be in sync? If I regularly use NTP to
keep the system clocks in sync, will time skew still happen?

Could you also elaborate on what you mean by back pressure on the source, and how it will
impact the latency calculations?

Sorry if these are trivial questions. I am a bit new to real-world streaming systems.

--------------------------------------------------
Dhruv Kumar
PhD Candidate
Department of Computer Science and Engineering
University of Minnesota
www.dhruvkumar.me

> On Apr 26, 2018, at 13:26, TechnoMage <mlatta@technomage.com> wrote:
> 
> On a single machine this may work OK. In a multi-machine system it is less
> reliable, because clock skew between one machine (the source) and another (the sink)
> can distort the measurements. It also does not account for back pressure on the source.
> We use an external process that reads the source and the sink output in parallel and
> measures the latency against a single system clock. That accounts for both issues, but
> of course does not account for delivery delays in the messaging system (Kafka in our
> case). It does, however, measure real-world latency as seen by the rest of the system,
> which is ultimately what matters to us.
> 
> Michael
> 
>> On Apr 26, 2018, at 12:01 PM, Dhruv Kumar <gargdhruv36@gmail.com> wrote:
>> 
>> Hi
>> 
>> I am trying to compute the end-to-end latency for each record processed by Flink.
>> By end-to-end latency, I mean the difference between the time at which the record
>> entered the Flink system (arrived at the source) and the time at which the record is
>> finally emitted into the sink. What is the best way to measure this? I was thinking of
>> doing the following:
>> 1. Add the current system timestamp to the record when the record arrives at Flink.
>> 2. Add the current system timestamp to the record when the record is finally being
>>    emitted into the sink.
>> 3. Take the difference between 2 and 1 offline, once all the records have been written
>>    into the sink.
>> 
>> Does this sound ok?
>> 
>> Also, if I use the processing-time characteristic for this end-to-end latency
>> measurement, will that be fine?
>> 
>> Thanks
>> --------------------------------------------------
>> Dhruv Kumar
>> PhD Candidate
>> Department of Computer Science and Engineering
>> University of Minnesota
>> www.dhruvkumar.me <http://www.dhruvkumar.me/>
> 
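
The offline step of the three-step approach quoted above (difference between the sink timestamp and the source timestamp) can be sketched roughly as follows. This is only an illustration in plain Python, not code from the thread or from Flink: the record layout, the field names source_ts_ms and sink_ts_ms, and the 15 ms clock offset are all assumptions. The second half shows why clock skew matters: if the sink machine's clock runs ahead of the source machine's clock, every measured latency is inflated by that offset.

```python
def latencies_ms(records):
    """Step 3: per-record latency = sink timestamp minus source timestamp."""
    return [r["sink_ts_ms"] - r["source_ts_ms"] for r in records]


def with_sink_clock_offset(records, offset_ms):
    """Simulate a sink machine whose clock is ahead of the source by offset_ms."""
    return [{**r, "sink_ts_ms": r["sink_ts_ms"] + offset_ms} for r in records]


# Hypothetical records, as they might be read back from the sink output.
# Field names are assumptions made for this sketch.
records = [
    {"id": "a", "source_ts_ms": 1000, "sink_ts_ms": 1040},
    {"id": "b", "source_ts_ms": 1010, "sink_ts_ms": 1075},
    {"id": "c", "source_ts_ms": 1020, "sink_ts_ms": 1050},
]

# Latencies when both timestamps come from the same (or perfectly synced) clock.
measured = latencies_ms(records)

# The same records if the sink clock were 15 ms ahead (assumed skew):
# the true processing time is unchanged, but every measurement shifts by 15 ms.
skewed = latencies_ms(with_sink_clock_offset(records, 15))

for rec, m, s in zip(records, measured, skewed):
    print(rec["id"], "latency:", m, "ms; with skewed sink clock:", s, "ms")
```

A constant offset like this cannot be detected from the recorded timestamps alone, which is one reason for taking both readings against a single clock, as suggested in the reply quoted above.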

