flink-user mailing list archives

From Konstantin Knauf <konstantin.kn...@tngtech.com>
Subject Re: Referencing Global Window across flink jobs
Date Sun, 09 Jul 2017 19:14:25 GMT
Hi Vijay,

thanks for sharing the code. To my knowledge the only way to access the
state of one job in another job right now is Queryable State, which in
this case seems impractical. Why do you want to perform the apply
functions in separate Flink jobs?

In the same job, I would just perform all aggregations within a single
WindowFunction emitting a Tuple/POJO with all the aggregations. You can
then use a map to project the stream of all aggregations onto its
dimensions. This way you only keep the window state once, as opposed to
calling WindowedStream::apply multiple times on the same windowed
stream. In case you want to decouple the downstream operations on the
different aggregations from each other, you can still write the
different dimensions of the WindowFunction's output to different
Kafka topics and have separate jobs from there on.
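
As a rough illustration (plain Java, with invented class and field
names, not actual Flink API), the per-window logic that would live
inside such a WindowFunction's apply() can iterate the window contents
once and fill every aggregation dimension of a single POJO:

```java
import java.util.Arrays;
import java.util.List;

public class AllAggregates {

    // One POJO carrying every aggregation dimension; downstream maps
    // can then project out count, sum, or max individually.
    static class Aggregates {
        long count;
        double sum;
        double max = Double.NEGATIVE_INFINITY;
    }

    // The logic a WindowFunction's apply() would run over the window
    // contents: a single pass that computes all aggregations at once,
    // so the window state is kept only once.
    static Aggregates aggregate(Iterable<Double> values) {
        Aggregates agg = new Aggregates();
        for (double v : values) {
            agg.count++;
            agg.sum += v;
            agg.max = Math.max(agg.max, v);
        }
        return agg;
    }

    public static void main(String[] args) {
        List<Double> window = Arrays.asList(1.0, 4.0, 2.5);
        Aggregates agg = aggregate(window);
        System.out.println(agg.count + " " + agg.sum + " " + agg.max);
    }
}
```

Emitting this POJO once per window fire and splitting it afterwards
(e.g. one map per dimension, each writing to its own Kafka topic)
gives the decoupling described above without duplicating the window
state per aggregation.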

Cheers,

Konstantin

On 07.07.2017 12:06, G.S.Vijay Raajaa wrote:
> Hi Konstantin,
> 
> Please find a snippet of my code:
> 
>   DataStream<String> stream = env
>       .addSource(new FlinkKafkaConsumer08<>("data", new SimpleStringSchema(), properties));
> 
>   // Create a keyed stream from the Kafka data stream
>   KeyedStream<Tuple2<Integer, JsonObject>, Tuple> pojo = stream
>       .map(new JsonDeserializer())
>       .keyBy(0);
> 
>   // Create a global window to extend the window throughout the day
>   pojo.window(GlobalWindows.create())
>       .trigger(MyTrigger.of(10, 4000))
>       .apply(new JsonMerger());
> 
> In the above snippet the global window keeps growing, and the trigger
> fires the apply function for every record added to the window. The
> final purge happens when the max count is met. I am exploring whether
> I could reference the state and trigger of the global window across
> Flink jobs and perform the apply functions in parallel. The source for
> all the Flink jobs is the same window of data. The idea is that the
> parallel Flink jobs won't hook up to the stream source but get
> triggered based on the global window's state and trigger events. Hope
> this explains the scenario. Please excuse me if I am not able to
> detail the nitty-gritties down to the most granular unit.
> 
> Regards,
> 
> Vijay Raajaa GS 
> 
> 
> On Fri, Jul 7, 2017 at 3:17 PM, Konstantin Knauf
> <konstantin.knauf@tngtech.com> wrote:
> 
>     Hi Vijay,
> 
>     can you elaborate a little bit on what you would like to achieve?
>     Right now, I am not sure what aspect of the window you want to
>     reference (WindowState,Timers, State in the Windowfunction,...).
> 
>     Cheers,
> 
>     Konstantin
> 
>     sent from my phone. Plz excuse brevity and tpyos.
>     ---
>     Konstantin Knauf * konstantin.knauf@tngtech.com * +49-174-3413182
>     TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>     Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
> 
>     ---- G.S.Vijay Raajaa schrieb ----
> 
> 
>     Hi,
> 
>     I have a use case where I need to build a global window with a custom
>     trigger. I would like to reference this window across my Flink jobs.
>     Is there a possibility that the global window can be referenced?
> 
>     Regards,
>     Vijay Raajaa GS 
> 
> 

-- 
Konstantin Knauf * konstantin.knauf@tngtech.com * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082
