flink-user mailing list archives

From Michael Latta <mla...@technomage.com>
Subject Re: KeyedSream question
Date Thu, 05 Apr 2018 12:08:58 GMT
Thanks for the clarification. I was just trying to understand the intended behavior. It would
have been nice if Flink tracked state for downstream operators by key, but I can do that with
a map in the downstream functions. 
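
A rough illustration of the map-based approach, as a plain-Java sketch with no Flink dependency. The class name, `subtaskFor`, and `CountingInstance` are hypothetical, and the hash-modulo routing is only an assumed stand-in for keyBy() (Flink's real key assignment works differently). It shows the guarantee discussed in this thread: every record of a key reaches the same instance, while one instance may hold several keys in its map.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyedRoutingSketch {

    // Assumed stand-in for keyBy() routing: hash of the key modulo the parallelism.
    // This is NOT Flink's actual assignment; it only mirrors the same-key guarantee.
    static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // One parallel instance of a downstream function. It may receive records
    // for several keys, so it tracks its state per key in a map.
    static class CountingInstance {
        final Map<String, Long> countsByKey = new HashMap<>();

        void process(String key) {
            countsByKey.merge(key, 1L, Long::sum);
        }
    }

    // Routes every record to the instance chosen for its key and returns all instances.
    static CountingInstance[] run(List<String> keys, int parallelism) {
        CountingInstance[] instances = new CountingInstance[parallelism];
        for (int i = 0; i < parallelism; i++) {
            instances[i] = new CountingInstance();
        }
        for (String key : keys) {
            instances[subtaskFor(key, parallelism)].process(key);
        }
        return instances;
    }

    public static void main(String[] args) {
        // Four distinct keys, two instances: each instance ends up with more than one key.
        List<String> keys = Arrays.asList("a", "b", "c", "a", "d", "b", "a");
        CountingInstance[] instances = run(keys, 2);
        for (int i = 0; i < instances.length; i++) {
            System.out.println("instance " + i + " -> " + instances[i].countsByKey);
        }
    }
}
```

With more keys than instances, at least one instance necessarily holds several keys, which is why the per-key map is needed in this sketch.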

Michael

Sent from my iPad

> On Apr 5, 2018, at 2:30 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> 
> Amit is correct. keyBy() ensures that all records with the same key are processed by the same parallel instance of a function.
> This is different from "a parallel instance only sees records of one key".
> 
> I had a look at the docs [1]. 
> I agree that "Logically partitions a stream into disjoint partitions, each partition containing elements of the same key." can be easily interpreted as you did.
> I've pushed a commit to clarify the description. The docs should be updated soon.
> 
> Best, Fabian 
> 
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/#datastream-transformations
> 
> 2018-04-05 6:21 GMT+02:00 Amit Jain <aj2011it@gmail.com>:
>> Hi,
>> 
>> The keyBy operation partitions the data on the given key and makes sure the same slot
>> gets all future data belonging to the same key. In the default implementation, it can
>> also map a subset of the keys in your DataStream to the same slot.
>> 
>> Assuming the number of keys equals the number of running slots, you may
>> specify a custom keyBy operation to achieve the same.
>> 
>> 
>> Could you describe your use case?
>> 
>> --
>> Thanks
>> Amit
>> 
>> 
>> 
>> --
>> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
> 
