spark-user mailing list archives

From François Le lay <...@spotify.com>
Subject Re: Spark GCE Script
Date Mon, 05 May 2014 19:30:28 GMT
Has anyone considered using jclouds tooling to support multiple cloud providers? Maybe using
Pallet?

François

> On May 5, 2014, at 3:22 PM, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:
> 
> I second this motion. :)
> 
> A unified "cloud deployment" tool would be absolutely great.
> 
> 
> On Mon, May 5, 2014 at 1:34 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
>> Very cool! Have you thought about sending this as a pull request? We’d be happy
to maintain it inside Spark, though it might be interesting to find a single Python package
that can manage clusters across both EC2 and GCE.
>> 
>> Matei
>> 
>>> On May 5, 2014, at 7:18 AM, Akhil Das <akhil@sigmoidanalytics.com> wrote:
>>> 
>>> Hi Sparkers,
>>> 
>>> We have created a quick spark_gce script that can launch a Spark cluster in
Google Cloud. I'm sharing it because it might be helpful for anyone using Google
Cloud for deployment rather than AWS.
>>> 
>>> Here's the link to the script
>>> 
>>> https://github.com/sigmoidanalytics/spark_gce
>>> 
>>> Feel free to use it and send any feedback.
>>> 
>>> In short, here's what it does:
>>> 
>>> Just like the spark_ec2 script, this one reads command-line arguments such as the
cluster name (see the GitHub page for details), then starts the machines in Google Cloud,
sets up the network, adds a 500GB empty disk to every machine, generates the SSH keys on
the master and transfers them to all slaves, installs Java, and downloads and configures
Spark/Shark/Hadoop. It also starts the Shark server automatically. The current version is
0.9.1, but I'm happy to add support for more versions if anyone is interested.
>>> 
>>> 
>>> Cheers.
>>> 
>>> 
>>> Thanks
>>> Best Regards
> 
