flink-user mailing list archives

From "au.fp2018" <au.fp2...@gmail.com>
Subject Multiple (non-consecutive) keyBy operators in a dataflow
Date Tue, 03 Apr 2018 00:30:53 GMT
Hello Flink Community,

I am relatively new to Flink. In the project I am currently working on, I
have a dataflow with a keyBy() operator, which I want to convert to a
dataflow with multiple keyBy() operators, like this:

  Source -->
  KeyBy() -->
  Stateful process() function that generates a more granular key -->
  KeyBy(<id generated in the previous step>) -->
  More stateful computation(s) -->
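
To make the shape of the pipeline above concrete, here is a minimal sketch in plain Java. It is only a model of the two keyed stages, not Flink API code: the class and method names (TwoStageKeying, stageOne, stageTwo, demo), the record types, and the rule "every third event per key starts a new id" are all hypothetical and chosen for illustration. Each HashMap stands in for keyed state, scoped first to the coarse key and then to the generated finer key. The model runs in a single thread, so it does not capture the repartitioning of records across parallel subtasks that each keyBy() performs in a real job.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the two-stage keyed dataflow. Not Flink API.
class TwoStageKeying {

    // Hypothetical input record: a coarse key plus a payload.
    static final class Event {
        final String coarseKey;
        final String payload;
        Event(String coarseKey, String payload) {
            this.coarseKey = coarseKey;
            this.payload = payload;
        }
    }

    // Hypothetical output of stage 1: the payload tagged with a finer key.
    static final class Tagged {
        final String fineKey;
        final String payload;
        Tagged(String fineKey, String payload) {
            this.fineKey = fineKey;
            this.payload = payload;
        }
    }

    // Stage 1: state is scoped per coarse key (models the first keyBy()).
    // The state is a per-key counter; every third event for a coarse key
    // starts a new generated id (an illustrative rule, not from the post).
    static List<Tagged> stageOne(List<Event> events) {
        Map<String, Integer> seenPerCoarseKey = new HashMap<>();
        List<Tagged> out = new ArrayList<>();
        for (Event e : events) {
            int seen = seenPerCoarseKey.merge(e.coarseKey, 1, Integer::sum);
            String fineKey = e.coarseKey + "#" + ((seen - 1) / 3);
            out.add(new Tagged(fineKey, e.payload));
        }
        return out;
    }

    // Stage 2: state is scoped per generated key (models the second keyBy()).
    // Here the downstream computation is simply a count per fine key.
    static Map<String, Integer> stageTwo(List<Tagged> tagged) {
        Map<String, Integer> countPerFineKey = new HashMap<>();
        for (Tagged t : tagged) {
            countPerFineKey.merge(t.fineKey, 1, Integer::sum);
        }
        return countPerFineKey;
    }

    // Runs both stages on a small fixed input.
    static Map<String, Integer> demo() {
        List<Event> events = new ArrayList<>();
        events.add(new Event("user-a", "e1"));
        events.add(new Event("user-b", "e2"));
        events.add(new Event("user-a", "e3"));
        events.add(new Event("user-a", "e4"));
        events.add(new Event("user-b", "e5"));
        events.add(new Event("user-a", "e6"));
        return stageTwo(stageOne(events));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the second stage holds one state entry per generated id rather than one per original key, which is the re-scoping of state that the second keyBy() is meant to achieve.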

Are there any downsides to this approach?
My reasoning behind the second keyBy() is to reduce the amount of state and
hence improve the processing speed.


Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
