spark-user mailing list archives

From Aniket Bhatnagar <aniket.bhatna...@gmail.com>
Subject [Streaming] Triggering an action in absence of data
Date Mon, 01 Sep 2014 12:25:00 GMT
Hi all

I am struggling to implement a use case in which I need to trigger an action
when no data has been received for X amount of time, and I haven't been able
to figure out an easy way to do this. No state/foreach methods get called
when no data has arrived. I thought of generating a 'tick' DStream that
emits an arbitrary object and then union/group the tick stream with the data
stream to detect that data hasn't arrived for X amount of time. However,
since my data DStream is paired (it holds key-value tuples) and I use the
updateStateByKey method to process it, I can't group/union it with the tick
stream(s) without knowing all keys in advance.
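
To make the per-key bookkeeping concrete, here is a minimal sketch in plain Python of an update function in the (new_values, previous_state) shape that updateStateByKey uses. It is only an illustration: the names update_idle_state and TIMEOUT_SECONDS, the (last_seen, timed_out) state tuple, and the 60-second value are all hypothetical, and the sketch assumes the function is invoked on every batch for keys that already have state, even when the batch carries no new values for them. Whether that holds in a given setup is not established in this message.

```python
import time
from typing import List, Optional, Tuple

# Hypothetical state layout: (last_seen_epoch_seconds, timed_out_flag)
State = Tuple[float, bool]

# Stands in for "X amount of time"; the value is an assumption.
TIMEOUT_SECONDS = 60.0


def update_idle_state(new_values: List[object],
                      state: Optional[State],
                      now: Optional[float] = None) -> Optional[State]:
    """Per-key update: remember when data last arrived and flag idle keys."""
    now = time.time() if now is None else now
    if new_values:
        # Data arrived in this batch: record the time and clear the flag.
        return (now, False)
    if state is None:
        # No data and no prior state: nothing to track for this key.
        return None
    last_seen, _ = state
    # No data this batch: flag the key once it has been idle long enough.
    return (last_seen, now - last_seen >= TIMEOUT_SECONDS)
```

Under that assumption, a downstream filter on the timed_out flag could drive the action. Keys that have never produced data are still invisible to this approach, which is the same limitation as not knowing all keys in advance.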

My second idea was to push data from the DStream to an actor and let an actor
(one per key) manage the state and the data-absent cases. However, there is no
way to run an actor continuously for all data belonging to a key or a partition.

I am stuck now and can't think of any other way to solve this use case.
Has anyone else run into a similar issue? Any thoughts on how this use case
could be implemented in Spark Streaming?

Thanks,
Aniket
