spark-user mailing list archives

From Guy Doulberg
Subject RDD decouple store implementations
Date Thu, 27 Nov 2014 08:02:01 GMT
Hi guys

I am playing with Spark, and I was wondering whether there is a way to share an RDD across
multiple output implementations in a decoupled way.

For example, assuming I have an RDD that comes from a stream in Spark Streaming, I want to be
able to store the same stream in two different S3 folders using two different formatters, say
one formatter that writes JSON files and another that writes CSV files. Is that possible?

If it is possible, is it also possible to change the implementation of the CSV formatter
without stopping the JSON file writer?
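
To make the question concrete, here is a rough sketch of the shape I have in mind, in plain Python with no Spark involved. The S3 paths, the record fields (`id`, `name`) and the names `fan_out` and `sinks` are all made up for illustration. In a real job I assume each formatter would be applied with something like `rdd.map(fmt).saveAsTextFile(path)` inside `foreachRDD`; whether a formatter can be swapped while the streaming job keeps running is exactly what I am asking.

```python
import csv
import io
import json

# Hypothetical output folders, for illustration only.
JSON_PATH = "s3://example-bucket/json/"
CSV_PATH = "s3://example-bucket/csv/"


def to_json_line(record):
    """Format one record (a dict) as a single JSON line."""
    return json.dumps(record, sort_keys=True)


def to_csv_line(record, fields=("id", "name"), delimiter=","):
    """Format one record as a single CSV line with a fixed column order."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter).writerow([record[f] for f in fields])
    return buf.getvalue().rstrip("\r\n")


def fan_out(batch, sinks):
    """Apply every registered formatter to the same batch, independently."""
    return {path: [fmt(r) for r in batch] for path, fmt in sinks.items()}


# Registry mapping each output folder to its formatter.
sinks = {JSON_PATH: to_json_line, CSV_PATH: to_csv_line}

batch = [{"id": 1, "name": "a"}, {"id": 2, "name": "b,c"}]
out = fan_out(batch, sinks)

# Swap only the CSV formatter; the JSON entry is not touched.
sinks[CSV_PATH] = lambda r: to_csv_line(r, delimiter=";")
out2 = fan_out(batch, sinks)
```

The point of the registry is that the JSON and CSV writers never reference each other, so replacing one entry leaves the other's output unchanged.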

Thanks, Guy Doulberg
