spark-user mailing list archives

From nsareen
Subject Saving RDD into DB & then Reading back from DB
Date Thu, 13 Nov 2014 07:07:58 GMT
Hi All,

I know that Spark integrates with Cassandra. Can an RDD be persisted into a
DB and read back into the same state on server boot? If yes, are there any
examples that demonstrate how it's done?

We currently save a snapshot of many rows into an MS SQL Server DB. The
snapshot is read-only. Since we are migrating our application to Spark, we
were thinking of migrating this snapshot into Spark too, so that it can be
referred to whenever required and the data can be processed within it. To do
so, I'm assuming we would first have to create an RDD at runtime that
represents the snapshot, and then persist it from Spark (either to a
filesystem or to a DB). A DB looks more reasonable because we don't have an
HDFS ecosystem, and without one we would have to manage creation, archival &
other aspects of persistence ourselves.
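
To make the idea concrete, here is a rough, untested sketch of the round trip
in Scala. It assumes Spark's JdbcRDD for reading and a plain JDBC batch insert
per partition for writing. The connection URL, credentials, table names
(snapshot_rows, snapshot_copy), columns and id range are all made up for
illustration, and the SQL Server JDBC driver is assumed to be on the classpath
of the driver and the executors.

```scala
import java.sql.{DriverManager, ResultSet}

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.JdbcRDD

object SnapshotRoundTrip {
  // Hypothetical connection details, for illustration only.
  val jdbcUrl = "jdbc:sqlserver://dbhost:1433;databaseName=appdb"
  val dbUser = "spark_user"
  val dbPassword = "secret"

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("snapshot-round-trip"))

    // Read back on boot: JdbcRDD splits the id range [1, 1000000] into
    // 4 partitions and binds each partition's bounds to the two '?' marks.
    val snapshot = new JdbcRDD[(Long, String)](
      sc,
      () => DriverManager.getConnection(jdbcUrl, dbUser, dbPassword),
      "SELECT id, payload FROM snapshot_rows WHERE id >= ? AND id <= ?",
      1L, 1000000L, 4,
      (rs: ResultSet) => (rs.getLong(1), rs.getString(2))
    ).cache() // keep the rows in memory once they have been computed

    // Save: every partition opens its own connection and sends its rows
    // as a single JDBC batch (very large partitions may need chunking).
    snapshot.foreachPartition { rows =>
      val conn = DriverManager.getConnection(jdbcUrl, dbUser, dbPassword)
      try {
        val stmt = conn.prepareStatement(
          "INSERT INTO snapshot_copy (id, payload) VALUES (?, ?)")
        rows.foreach { case (id, payload) =>
          stmt.setLong(1, id)
          stmt.setString(2, payload)
          stmt.addBatch()
        }
        stmt.executeBatch()
      } finally {
        conn.close()
      }
    }

    sc.stop()
  }
}
```

I'm not sure whether this is the recommended way, or whether it preserves
everything one would mean by "the same state" (partitioning, for instance,
would be decided again by the bounds passed to JdbcRDD on each load).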

Any thoughts in this regard will be really helpful.

Thanks in advance.

