flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Consistent (hashing) keyBy over multiple time or streaming windows
Date Mon, 02 Nov 2015 09:25:51 GMT
Hi Leonard,
I’m afraid you might be thinking about windows as they are supported by Spark Streaming.
There windows are quite limited. In Flink you don’t necessarily have to window elements
by time since Flink does not collect data in mini-batches before processing. Everything is
continuously processed and you can have arbitrary Trigger strategies that decide when you
want to process windows.

The basic idea behind windowing in Flink is that elements are assigned to windows by a WindowAssigner
and then a Trigger decides when to trigger computation for a specific window. This is very
similar to the model employed by Google Cloud Dataflow, if you are familiar with that.

You could have a look at this Stackoverflow question and my answer to it: http://stackoverflow.com/questions/33451121/apache-flink-session-support.
It could be similar to your use case.

Please let me know if you want a more in-depth explanation of the windowing system. It is
a quite recent addition and arguably the most complex part in any streaming system.

Cheers,
Aljoscha
> On 02 Nov 2015, at 09:15, Leonard Wolters <leonard@sagent.io> wrote:
> 
> Hi,
> 
> I was wondering if Flink already has implemented some sort of consistent keyBy mapping
over multiple windows.
> The underlying idea is to 'sessionize' incoming events over time (i.e. multiple streaming
windows) on the same
> partitions. As one can understand I want to avoid heavy shuffling over the network.
> 
> As far as I could read / understand from the API docs and the blogs on data-artisans,
I can perfectly groupBy events
> within one (time) window that are automatically distributed over all partitions. However,
since sessions often exceed
> the window submit duration, I want some guarantees that events belonging to the same
session are delivered to the 
> same partition for new windows. 
> 
> Is this possible or are they any plans to support this soon?
> 
> Thanks in advance,
> 
> Leonard
> 
> -- 
> Leonard Wolters
> Chief Product Manager
> <logo.png>
> M: +31 (0)6 55 53 04 01 | T: +31 (0)88 10 44 555 
> E: leonard@sagent.io | W: sagent.io | Disclaimer | Sagent BV 
> Herengracht 504 | 1017CB Amsterdam | Netherlands


Mime
View raw message