spark-user mailing list archives

From durga <durgak...@gmail.com>
Subject persistent HDFS instance for cluster restarts/destroys
Date Wed, 23 Jul 2014 22:26:27 GMT
Hi All,
I have a question,

For my company, we are planning to use the spark-ec2 scripts to create a cluster
for us.

I understand that persistent HDFS will keep the HDFS data available across
cluster restarts.

Question is:

1) What happens if I destroy and re-create the cluster? Do I lose the data?
    a) If I lose the data, is the only option to copy it to S3 and copy it back
after launching the new cluster? (Transferring data to and from S3 seems costly.)
2) How would I add or remove machines in the cluster? I mean, I am asking
about cluster management.
Is there any place where Amazon lets me see the machines and perform the
operations of adding and removing them?
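For context, the cluster lifecycle the questions above refer to can be sketched with the spark-ec2 script as shipped with Spark 1.x. This is a hedged sketch, not a definitive recipe: the key name, cluster name, bucket name, and volume size below are hypothetical placeholders, and the exact flags should be checked against the spark-ec2 version in use.

```shell
# Sketch of the spark-ec2 lifecycle (placeholder names: mykey, my-cluster,
# my-bucket). Persistent HDFS lives on EBS volumes requested at launch:
./spark-ec2 -k mykey -i mykey.pem -s 4 --ebs-vol-size=100 launch my-cluster

# stop/start keep the EBS-backed persistent HDFS intact:
./spark-ec2 stop my-cluster
./spark-ec2 start my-cluster

# destroy terminates the instances, so back up HDFS to S3 first,
# e.g. with distcp run from the master node (paths are illustrative):
# hadoop distcp hdfs:///data s3n://my-bucket/data

./spark-ec2 destroy my-cluster
```

These commands require live AWS credentials and instances, so they are shown as a command fragment rather than something runnable here.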

Thanks,
D.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/persistent-HDFS-instance-for-cluster-restarts-destroys-tp10551.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
