spark-user mailing list archives

From Rohit Rai <ro...@tuplejump.com>
Subject Re: Spark integration with HDFS and Cassandra simultaneously
Date Sat, 26 Oct 2013 17:07:37 GMT
Hello Gary,

This is very easy to do. You can read your data from HDFS using a
FileInputFormat, transform it into the rows you need, and write them to
Cassandra using ColumnFamilyOutputFormat.
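
Roughly along these lines (an untested sketch, not a drop-in job: the keyspace
"my_keyspace", column family "my_cf", host, port and HDFS path are placeholders,
and the Spark package was renamed from `spark` to `org.apache.spark` in 0.8):

import java.nio.ByteBuffer
import java.util.Collections

import org.apache.cassandra.hadoop.{ColumnFamilyOutputFormat, ConfigHelper}
import org.apache.cassandra.thrift.{Column, ColumnOrSuperColumn, Mutation}
import org.apache.cassandra.utils.ByteBufferUtil
import org.apache.hadoop.mapreduce.Job

import spark.SparkContext
import spark.SparkContext._

object HdfsToCassandra {
  def main(args: Array[String]) {
    val sc = new SparkContext("local[2]", "HdfsToCassandra")

    // Cassandra output configuration -- host, port, keyspace and CF are placeholders
    val job = new Job()
    val conf = job.getConfiguration
    ConfigHelper.setOutputInitialAddress(conf, "127.0.0.1")
    ConfigHelper.setOutputRpcPort(conf, "9160")
    ConfigHelper.setOutputColumnFamily(conf, "my_keyspace", "my_cf")
    ConfigHelper.setOutputPartitioner(conf, "org.apache.cassandra.dht.Murmur3Partitioner")

    // Read from HDFS (textFile uses a FileInputFormat underneath)
    val lines = sc.textFile("hdfs://namenode:8020/data/events")

    // Turn each "key,value" line into the (row key, mutations) pairs
    // that ColumnFamilyOutputFormat expects
    val rows = lines.map { line =>
      val Array(key, value) = line.split(",", 2)
      val col = new Column()
      col.setName(ByteBufferUtil.bytes("value"))
      col.setValue(ByteBufferUtil.bytes(value))
      col.setTimestamp(System.currentTimeMillis)
      val mutation = new Mutation().setColumn_or_supercolumn(
        new ColumnOrSuperColumn().setColumn(col))
      (ByteBufferUtil.bytes(key), Collections.singletonList(mutation))
    }

    // The output path is ignored by ColumnFamilyOutputFormat,
    // but Spark requires a non-empty string here
    rows.saveAsNewAPIHadoopFile(
      "my_keyspace",
      classOf[ByteBuffer],
      classOf[java.util.List[Mutation]],
      classOf[ColumnFamilyOutputFormat],
      conf)
  }
}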

Our library Calliope (Apache licensed), http://tuplejump.github.io/calliope/,
can make the task of writing to C* easier.


In case you don't want to convert the data to rows and would rather keep it as
files in Cassandra, our lightweight, Cassandra-backed, HDFS-compatible
filesystem SnackFS can help you. SnackFS will be part of the next Calliope
release later this month, but we can provide you access if you would like to
try it out.
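
To give a feel for it: because SnackFS plugs in as a Hadoop-compatible
FileSystem, from Spark you would simply point at a SnackFS URI instead of an
HDFS one. The scheme and paths below are only illustrative; the exact
configuration will come with the SnackFS docs:

// Assuming the SnackFS jar is on the classpath and registered as a
// Hadoop FileSystem (via an fs.<scheme>.impl entry); scheme and paths
// here are placeholders
sc.textFile("hdfs://namenode:8020/data/events")
  .saveAsTextFile("snackfs:///archive/events")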

Feel free to mail me directly in case you need any assistance.


Regards,
Rohit
founder @ tuplejump




On Sat, Oct 26, 2013 at 5:45 AM, Gary Malouf <malouf.gary@gmail.com> wrote:

> We have a use case in which much of our raw data is stored in HDFS today.
>  We'd like to write our Spark jobs such that they read/aggregate data from
> HDFS and can output to our Cassandra cluster.
>
> Is there any way of doing this in Spark 0.7.3?
>



-- 

____________________________
www.tuplejump.com
*The Data Engineering Platform*
