spark-user mailing list archives

From Kapil Malik <kma...@adobe.com>
Subject hdfs replication on saving RDD
Date Sun, 05 Jan 2014 15:20:43 GMT
Hi all,

I have a Spark cluster on top of an HDFS cluster (3 nodes). The HDFS replication factor is 2, so if
I upload a file with hadoop fs -put something.txt, it is replicated to 2 nodes.
However, when I call rdd.saveAsTextFile(..), the output is saved with replication factor 3 (i.e. on
all nodes). How do I configure Spark to save a text file with the same replication factor as the one
specified for Hadoop?

Thanks,

Kapil Malik | kmalik@adobe.com | 33430 / 8800836581
Go Corona : http://harrypotter:4502/CoronaClient.html

