flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ufuk Celebi <...@apache.org>
Subject Re: how to understand the flink flow control
Date Fri, 07 Aug 2015 13:07:05 GMT
Hey Zhijiang Wang,

I will update the docs next week with more information. The short version is that flow control
happens via the buffer pools that Flink uses for produced and consumed intermediate results.

The slightly ;) longer version:

Each task has buffer pools. The size of these buffer pools depends on multiple things (per
task manager):
- the configured number of network buffers [default 2048]
- the number of tasks running
- the number of consumed and produced outputs

Each consumed input (if you are looking into the code: each SingleInputGate) has a buffer
pool associated with it and each produced intermediate result (see IntermediateResultPartition
in the code) as well.

Each produced record is serialized into a buffer of the respective buffer pool of the produced
result partition and dispatched to the consumer (either local or remote via network). After
the buffer has been consumed it is recycled to the pool and can be used for the outstanding
records. If the producer is faster than the consumer, these buffer will take longer to be
available again and the producer will slow down by waiting on a buffer.

For local exchange the buffer is consumed as soon as the local consumer has deserialized the
records and for remote exchange as soon as the network layer has dispatched the buffer.

For the input side, there is a similar mechanism. The network layer receives a buffer and
copies it to the buffer pool of the respective input gate and queues the filled buffer to
the (remote) input channel. If there is no buffer available at the input gate, the TCP channel
is not read until a buffer is available again. This backpressures remote receivers, because
their output buffers are not dispatched and cannot be recycled.

I hope this helps. If you have further questions, just post them here. I will update the docs
with some figures, so this will be easier to follow.

– Ufuk

On 07 Aug 2015, at 03:37, wangzhijiang999 <wangzhijiang999@aliyun.com> wrote:

> As said in apache page, Flink's streaming runtime has natural flow control: Slow downstream
operators backpressure faster upstream operators.
> How to understand the flink natural flow control? 
> As i know, heron has the backpressure mechanism, if some tasks process slowly, it will
stop reading from source and notify other tasks to stop reading from source.
> In flink, if the producer task process quickly, it will emit the results to consumer.
So the buffer in InputChannel of consumer wil be filled up, if the consumer process slowly,
how to control the upstream flow?
> Thank you for any suggestions in advance!
> Best wishes,
> Zhijiang Wang

View raw message