spark-user mailing list archives

From "Wang, Ningjun (LNG-NPV)" <ningjun.w...@lexisnexis.com>
Subject Can I save RDD to local file system and then read it back on spark cluster with multiple nodes?
Date Wed, 14 Jan 2015 17:12:11 GMT
Can I save an RDD to the local file system and then read it back on a Spark cluster with multiple
nodes?

rdd.saveAsObjectFile("file:///home/data/rdd1")

val rdd2 = sc.objectFile("file:///home/data/rdd1")

This works if the cluster has only one node. But my cluster has 3 nodes, and each node
has a local directory called /home/data. Is the RDD saved to the local directory on all 3 nodes?
If so, is sc.objectFile(...) smart enough to read the local directory on every node and merge
the results into a single RDD?

Ningjun

