I guess I was focusing on this: 

#2 
I want to do the above in an event-driven way, without using batches (I tried micro-batches, but I realised that’s not what I want), i.e., for each arriving event, as soon as an event message arrives on my stream, not by accumulating events

If you want to update your graph without pulling the older data back through the entire DAG, it seems like you need to store the graph data somewhere. So that's why I jumped to accumulators - the state would be around from event to event, and not require a "reaggregation" for each event.

Arbitrary stateful streaming has this ability "built in" - that is, the engine keeps your intermediate state between invocations (in executor memory, backed by the checkpoint location for recovery), and the next event picks up where the last one left off.

I've just implemented the arbitrary stateful streaming option... Couldn't figure out a better way of avoiding the re-shuffle, so ended up keeping the prior state in the engine.

I'm not using GraphX, but it seems like the approach should work irrespective - the engine passes your function an iterator over the new events for a key, plus a handle called GroupState that carries that key's state from call to call.
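
To make the shape of that concrete, here is a minimal plain-Python sketch of the pattern (not Spark code, and not my actual implementation): a per-key update function receives the key, an iterator over the new events, and a state handle it can read and overwrite. The names `KeyedState`, `update_node` and `process`, and the event layout with a `status` field, are all made up for illustration; in Spark the engine owns the state handle and does the grouping for you.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple


class KeyedState:
    """Hypothetical stand-in for an engine-managed per-key state handle."""

    def __init__(self) -> None:
        self._value: Optional[dict] = None

    @property
    def exists(self) -> bool:
        return self._value is not None

    def get(self) -> dict:
        if self._value is None:
            raise KeyError("no state stored for this key yet")
        return self._value

    def update(self, value: dict) -> None:
        self._value = value


def update_node(node_id: str, events: Iterator[dict], state: KeyedState) -> dict:
    """Fold the new events for one node into the state kept from earlier calls."""
    current = dict(state.get()) if state.exists else {"count": 0, "status": "ok"}
    for event in events:
        current["count"] += 1
        current["status"] = event["status"]
    state.update(current)  # survives until the next call for this key
    return {"node": node_id, **current}


def process(batch: Iterable[Tuple[str, dict]], store: Dict[str, KeyedState]) -> list:
    """Group events by key and call update_node once per key (the engine's job)."""
    grouped: Dict[str, list] = {}
    for node_id, event in batch:
        grouped.setdefault(node_id, []).append(event)
    return [
        update_node(node_id, iter(events), store.setdefault(node_id, KeyedState()))
        for node_id, events in grouped.items()
    ]
```

Calling `process` twice with the same `store` shows the point: the second call for a node starts from the count and status left by the first, with no re-aggregation of older events.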

Do keep in mind that you have to think about out of order event arrivals...
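
One simple guard, sketched in plain Python under the assumption that every event carries its own timestamp (the `event_time` and `status` field names are invented here): keep the timestamp of the last applied update in the state and ignore anything older. This is last-write-wins by event time; whether dropping late events is acceptable depends on your use case.

```python
from typing import Optional


def apply_event(state: Optional[dict], event: dict) -> dict:
    """Return the new state for one node, ignoring events older than the state."""
    if state is not None and event["event_time"] <= state["event_time"]:
        return state  # stale or duplicate event: keep what we have
    return {"status": event["status"], "event_time": event["event_time"]}


state = None
for event in [
    {"status": "ok", "event_time": 10},
    {"status": "anomaly", "event_time": 30},
    {"status": "ok", "event_time": 20},  # arrives late, must not overwrite
]:
    state = apply_event(state, event)
```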

Send me a message to my direct email and I'll provide a link to the source... Not sure I'm fully grokking your entire use case...


On Fri, Apr 5, 2019 at 1:15 PM Basavaraj <rajiff@gmail.com> wrote:
I have looked at broadcasting accumulated values, but not at arbitrary stateful streaming

But, I am not sure how that helps here

On Fri, 5 Apr 2019, 10:13 pm Jason Nerothin, <jasonnerothin@gmail.com> wrote:
Have you looked at Arbitrary Stateful Streaming and Broadcast Accumulators?

On Fri, Apr 5, 2019 at 10:55 AM Basavaraj <rajiff@gmail.com> wrote:
Hi

Have two questions

#1
I am trying to process events in real time. The outcome of the processing has to find a node in GraphX and update that node (in case of an anomaly or state change). If a node is updated, I have to update the related nodes as well. I want to know whether GraphX can help with this by providing some native support

#2
I want to do the above in an event-driven way, without using batches (I tried micro-batches, but I realised that’s not what I want), i.e., for each arriving event, as soon as an event message arrives on my stream, not by accumulating events

I humbly welcome any pointers, constructive criticism

Regards
Basav


--
Thanks,
Jason

