spark-user mailing list archives

From Domingo Mihovilovic <>
Subject Spark Streaming architecture question - shared memory model
Date Mon, 30 Sep 2013 22:52:51 GMT
I have a quick architecture question. Imagine that you are processing stream data at high
speed and need to build, update, and access an in-memory data structure where the "model"
is stored.

One option is to store this model in a DB, but that would require a huge number of updates
(assume we do not want to go to a large Cassandra ring or similar). What's the preferred way
to manage this model in memory so that it has consistent, shared access across multiple nodes
in the cluster? Is there some sort of shared-memory approach, or do I need to try to manage
this model as an RDD?
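For context, the closest built-in option I'm aware of is Spark Streaming's `updateStateByKey`, which keeps per-key state as a Spark-managed RDD that is updated once per batch. A minimal sketch of what I mean (the socket source, port, checkpoint path, and running-count "model" are all placeholders for illustration, not my actual workload):

```scala
// Sketch: maintain a per-key "model" (here just a running count) as
// Spark-managed state instead of pushing every update to an external DB.
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

object StatefulModelSketch {
  def main(args: Array[String]) {
    val ssc = new StreamingContext("local[2]", "StatefulModelSketch", Seconds(1))
    ssc.checkpoint("/tmp/spark-checkpoint")  // stateful ops require checkpointing

    // Hypothetical stream of (key, 1) pairs from a text socket source.
    val events = ssc.socketTextStream("localhost", 9999)
      .map(line => (line, 1))

    // Fold each batch's new values into the existing per-key state.
    val model = events.updateStateByKey[Int] {
      (newValues: Seq[Int], state: Option[Int]) =>
        Some(state.getOrElse(0) + newValues.sum)
    }

    model.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

The state RDD here is partitioned across the cluster rather than truly shared, so each node only sees its own partitions; I'm not sure this covers the consistent shared-access case I described above, which is why I'm asking.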

All suggestions are welcome.
