spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: Spark GCE Script
Date Mon, 05 May 2014 17:34:31 GMT
Very cool! Have you thought about sending this as a pull request? We’d be happy to maintain
it inside Spark, though it might be interesting to find a single Python package that can manage
clusters across both EC2 and GCE.

Matei

On May 5, 2014, at 7:18 AM, Akhil Das <akhil@sigmoidanalytics.com> wrote:

> Hi Sparkers,
> 
> We have created a quick spark_gce script that can launch a Spark cluster in the Google
> Cloud. I'm sharing it because it might be helpful for anyone deploying on the Google Cloud
> rather than AWS.
> 
> Here's the link to the script
> 
> https://github.com/sigmoidanalytics/spark_gce
> 
> Feel free to use it and suggest any feedback around it.
> 
> In short, here's what it does:
> 
> Just like the spark_ec2 script, it reads command-line arguments such as the cluster name
> (see the GitHub page for details). It then starts the machines in the Google Cloud, sets up
> the network, adds a 500GB empty disk to each machine, generates SSH keys on the master and
> transfers them to all slaves, installs Java, and downloads and configures Spark/Shark/Hadoop.
> It also starts the Shark server automatically. Currently the version is 0.9.1, but I'm happy
> to add support for more versions if anyone is interested.
> 
> 
> Cheers.
> 
> 
> Thanks
> Best Regards
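
For readers skimming the thread, the launch sequence described in the quoted message can be sketched as an ordered plan. This is only an illustrative sketch, not the spark_gce script itself: the names ClusterSpec, parse_args, launch_plan and the --slaves option are hypothetical, and no Google Cloud API or CLI is called. Only the order of the steps, the 500GB disk size and the 0.9.1 version come from the message.

```python
import argparse
from dataclasses import dataclass


@dataclass
class ClusterSpec:
    """Hypothetical container for the launch settings (not from spark_gce)."""
    name: str
    slaves: int
    disk_gb: int = 500          # disk size mentioned in the message
    version: str = "0.9.1"      # version mentioned in the message


def parse_args(argv):
    """Read the cluster name and slave count; the option names are assumptions."""
    parser = argparse.ArgumentParser(description="Illustrative launch planner")
    parser.add_argument("cluster_name")
    parser.add_argument("--slaves", type=int, default=2)
    args = parser.parse_args(argv)
    return ClusterSpec(name=args.cluster_name, slaves=args.slaves)


def launch_plan(spec):
    """Return the steps from the message, in order, as plain strings."""
    master = f"{spec.name}-master"
    slaves = [f"{spec.name}-slave{i}" for i in range(1, spec.slaves + 1)]
    nodes = [master] + slaves
    return [
        f"start {len(nodes)} machines: {', '.join(nodes)}",
        "set up the network",
        f"attach an empty {spec.disk_gb}GB disk to each machine",
        f"generate SSH keys on {master}",
        f"copy the keys to {len(slaves)} slaves",
        "install Java on every machine",
        f"download and configure Spark/Shark/Hadoop ({spec.version})",
        f"start the Shark server on {master}",
    ]


if __name__ == "__main__":
    # Example run with a made-up cluster name.
    for number, step in enumerate(launch_plan(parse_args(["demo", "--slaves", "2"])), 1):
        print(f"{number}. {step}")
```

A real launcher would replace each string with a call to the cloud provider's tooling; keeping the plan as data first makes it easy to print or dry-run before any machines are created.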

