Hi All,
I have a question.
At my company, we are planning to use the spark-ec2 scripts to create a
cluster for us.
I understand that persistent HDFS keeps the HDFS data available across
cluster stop/start restarts.
My questions are:
1) What happens if I destroy and re-create the cluster? Do I lose the data?
   a) If I lose the data, is the only option to copy it to S3 and copy it
back after launching the new cluster? (Transferring data to and from S3
seems costly.)
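For what it's worth, one common way to do the S3 round trip described above is Hadoop's distcp, run from the cluster's master node. A minimal sketch, assuming the spark-ec2 AMI layout and a hypothetical bucket name "mybucket" (the paths are placeholders, not from this thread):

```shell
# Back up HDFS data to S3 before destroying the cluster
# (run on the master node; bucket and paths are placeholders)
/root/ephemeral-hdfs/bin/hadoop distcp \
    hdfs:///data s3n://mybucket/backup/data

# ...destroy and re-launch the cluster, then restore the data:
/root/ephemeral-hdfs/bin/hadoop distcp \
    s3n://mybucket/backup/data hdfs:///data
```

distcp runs as a distributed MapReduce job, so the copy is parallelized across the cluster rather than funneled through a single machine.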
2) How would I add or remove machines in the cluster? I am asking about
cluster management.
   Does Amazon provide a place to view the machines and perform the
add/remove operations?
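On the cluster-lifecycle side, the spark-ec2 script itself handles launch, stop, start, and destroy; to my knowledge it has no resize command, so changing the number of workers generally means re-launching with a different -s value. A sketch of the basic lifecycle, with hypothetical key-pair and cluster names:

```shell
# Launch a cluster with 5 workers (key pair / identity file are placeholders)
./spark-ec2 -k mykeypair -i mykeypair.pem -s 5 launch my-cluster

# Stop (instances are kept, so persistent HDFS survives) and restart later
./spark-ec2 -k mykeypair -i mykeypair.pem stop my-cluster
./spark-ec2 -k mykeypair -i mykeypair.pem start my-cluster

# Destroy permanently (terminates the instances; their data is gone)
./spark-ec2 -k mykeypair -i mykeypair.pem destroy my-cluster
```

The individual instances are also visible in the AWS EC2 web console, where you can inspect or terminate them, though terminating machines behind spark-ec2's back can leave the cluster configuration inconsistent.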
Thanks,
D.
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/persistent-HDFS-instance-for-cluster-restarts-destroys-tp10551.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.