flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tzu-Li (Gordon) Tai" <tzuli...@apache.org>
Subject Re: Simulating Time-based and Count-based Custom Windows with ProcessFunction
Date Mon, 16 Apr 2018 04:42:05 GMT
Hi Max!

Before we jump into the custom ProcessFunction approach:
Have you also checked out using the RocksDB state backend, and whether or not it is suitable
for your use case?
For state that would not fit into memory, that is usually the to-go state backend to use.

If you’re sure a custom ProcessFunction is still the way to go, then some answers to your
questions:

1 -- How do I assign timestamps into my data tuples? What type of 
time...process, event or ingestion? 

This should be done via the `.assignTimestampsAndWatermarks` method on the output of an operator
prior to the process function.
The timestamps assigned using this method is for event time processing.
Timestamps for processing and ingestion time processing are determined by the system.

2 -- How to simulate count-based windows? In this case, what would be the 
best artificial timestamps to append? Just increasing integers? 
I’m not sure of your actual use case here, but if you want to implement count-based windows,
then timers should not need to be part of the implementation. On each processElement, you
should fire results of a window’s state if the window’s element count has reached the
target count.

- Gordon

On 13 April 2018 at 5:36:02 PM, m@xi (makisntpap@gmail.com) wrote:

Hello Flinkers!  

Around here and there one may find some post for sliding windows in Flink. I  
have read that default sliding windows of Flink, the system maintains each  
window separately in memory, which in my case is prohibitive.  

Therefore, I want to implement my own sliding windows through  
ProcessFunction() via onTimer() function. Specifically, let's assume that  
the data does not contain any timestamps. So, if anyone can help providing  
concrete answers or even *code skeletons* on the following bullets, it is  
more than welcome :  

1 -- How do I assign timestamps into my data tuples? What type of  
time...process, event or ingestion?  

2 -- How to simulate count-based windows? In this case, what would be the  
best artificial timestamps to append? Just increasing integers?  

Thanks in advance.  

Best,  
Max  




--  
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/  

Mime
View raw message