flink-user mailing list archives

From "Yun Gao" <yungao...@aliyun.com>
Subject Re: Applying multiple calculation on data aggregated on window
Date Thu, 16 May 2019 10:24:59 GMT
Hi Soheil,

    Is it possible to first add an operator that preprocesses the records, filtering out the unused
records and attaching a special operation id? It might look like this:

     raw.filter()  // Filter out e and g
        .map()  // Transform {ts: 1, key: a, value: 10} to {ts: 1, key: a, value: 10, op-id: "1-avg"}
        .keyBy() // Key by the op-id
        .timeWindow(Time.seconds(5))
        .process()  // Process the window. The operation can be deduced from the op-id.
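
To make the idea concrete, here is a minimal plain-Java sketch of the same flow for the messages of one window. It does not use the Flink API; the class OpIdSketch, the Msg type, the routing table in opIdFor and the op-id format "<output key>-<operation>" (for example "abc-avg" instead of "1-avg") are all assumptions made for illustration. In a real job the routing would presumably live in the filter()/map() step and applyOp in the function passed to process().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class OpIdSketch {
    // Hypothetical message shape: {ts, key, value}.
    static final class Msg {
        final long ts;
        final String key;
        final double value;

        Msg(long ts, String key, double value) {
            this.ts = ts;
            this.key = key;
            this.value = value;
        }
    }

    // Hypothetical routing table: source key -> op-id.
    // Keys without an entry (e and g here) are dropped, like the filter() step.
    static Optional<String> opIdFor(String key) {
        switch (key) {
            case "a": case "b": case "c":
                return Optional.of("abc-avg");
            case "d": case "f":
                return Optional.of("fd-sum");
            default:
                return Optional.empty();
        }
    }

    // Stand-in for filter() + map() + keyBy(): group one window's messages by op-id.
    static Map<String, List<Msg>> groupByOpId(List<Msg> window) {
        Map<String, List<Msg>> groups = new LinkedHashMap<>();
        for (Msg m : window) {
            opIdFor(m.key).ifPresent(
                opId -> groups.computeIfAbsent(opId, k -> new ArrayList<>()).add(m));
        }
        return groups;
    }

    // Stand-in for process(): the suffix of the op-id selects the calculation,
    // the prefix becomes the key of the generated message.
    static Msg applyOp(String opId, List<Msg> group) {
        int dash = opId.lastIndexOf('-');
        String outKey = opId.substring(0, dash);
        String op = opId.substring(dash + 1);
        double sum = 0;
        for (Msg m : group) {
            sum += m.value;
        }
        double result;
        switch (op) {
            case "avg": result = sum / group.size(); break;
            case "sum": result = sum; break;
            default: throw new IllegalArgumentException("unknown operation: " + op);
        }
        // All messages in the example share the same ts, so the first one is reused.
        return new Msg(group.get(0).ts, outKey, result);
    }

    // Runs the whole flow for the messages of a single window.
    static List<Msg> processWindow(List<Msg> window) {
        List<Msg> out = new ArrayList<>();
        for (Map.Entry<String, List<Msg>> e : groupByOpId(window).entrySet()) {
            out.add(applyOp(e.getKey(), e.getValue()));
        }
        return out;
    }
}
```

Calling OpIdSketch.processWindow with the seven messages from the original question yields one message per op-id: the average for a, b and c, and the sum for d and f. Encoding the operation in the key keeps the window function generic, at the cost of parsing a string; an enum field on the record would be an equally valid choice.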


Best,
Yun Gao


------------------------------------------------------------------
From:Soheil Pourbafrani <soheil.ir08@gmail.com>
Send Time:2019 May 16 (Thu.) 06:47
To:user <user@flink.apache.org>
Subject:Applying multiple calculation on data aggregated on window

Hi,

In my environment I need to collect a stream of messages into windows, using some fields as the
key, and then do multiple calculations that each apply to specified messages. For
example, if I had the following messages in the window:
{ts: 1, key: a, value: 10}
{ts: 1, key: b, value: 0}
{ts: 1, key: c, value: 2}
{ts: 1, key: d, value: 5}
{ts: 1, key: e, value: 6}
{ts: 1, key: f, value: 7}
{ts: 1, key: g, value: 9}

- for the keys a, b and c I need to calculate the average of the values (12/3=4) and generate
another message like {ts: 1, key: abc, value: 4}

- for the keys f and d I need to get the sum (5 + 7 = 12) and generate {ts: 1, key: fd, value: 12}

and I don't need the messages with the keys e and g

So I did the following:
raw
  .keyBy(4, 5)
  .timeWindow(Time.seconds(5))
but I don't know how Flink can help me apply the logic to the data. I think I need to use
some method other than reduce or aggregate.

Any help will be appreciated.

thanks
