spark-issues mailing list archives

From "Nicholas Chammas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-6220) Allow extended EC2 options to be passed through spark-ec2
Date Tue, 10 Mar 2015 03:42:39 GMT

    [ https://issues.apache.org/jira/browse/SPARK-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354217#comment-14354217 ]

Nicholas Chammas commented on SPARK-6220:
-----------------------------------------

I took another look at the 2 boto methods we'd be passing these options to.
* [{{boto.ec2.image.Image.run}}|http://boto.readthedocs.org/en/latest/ref/ec2.html#boto.ec2.image.Image.run]
* [{{boto.ec2.connection.EC2Connection.request_spot_instances}}|http://boto.readthedocs.org/en/latest/ref/ec2.html#boto.ec2.connection.EC2Connection.request_spot_instances]

The parameter types they take are quite varied, from {{bool}} to {{string}} to {{list(string)}}
to {{list(boto.ec2.networkinterface.NetworkInterfaceSpecification)}}. Covering them generically,
even just a subset of them, would require us to take input that can be type cast somehow--maybe
some kind of stripped-down JSON.

I'm not sure we want to do that to spark-ec2.
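
To make the cost concrete, here is a minimal sketch of what type-casting pass-through values might look like. It is an illustration only, not anything in spark-ec2: the helper name {{parse_extended_option}} is hypothetical, and it uses Python literal syntax via {{ast.literal_eval}} rather than JSON so that a value like {{True}} parses as a boolean.

```python
import ast


def parse_extended_option(raw):
    """Split a 'key=value' string and cast the value where possible.

    Hypothetical helper for illustration; not part of spark-ec2 or boto.
    """
    key, sep, value = raw.partition("=")
    if not sep or not key:
        raise ValueError("expected key=value, got %r" % raw)
    try:
        # Handles literals such as True, 3, 0.5, or ['a', 'b'].
        cast = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Bare words such as 'terminate' are kept as plain strings.
        cast = value
    return key.strip(), cast
```

This covers {{bool}}, {{string}}, and {{list(string)}} values, but it has no way to express something like a list of {{NetworkInterfaceSpecification}} objects, which is the part that makes a generic approach awkward.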

Maybe instead I should just add the options I need to support {{instance_profile_arn}} / {{instance_profile_name}}
(for IAM support) and {{instance_initiated_shutdown_behavior}} (for self-terminating clusters)
and call it a day.

[~shivaram], [~joshrosen], [~pwendell]: What do y'all think?

> Allow extended EC2 options to be passed through spark-ec2
> ---------------------------------------------------------
>
>                 Key: SPARK-6220
>                 URL: https://issues.apache.org/jira/browse/SPARK-6220
>             Project: Spark
>          Issue Type: Improvement
>          Components: EC2
>            Reporter: Nicholas Chammas
>            Priority: Minor
>
> There are many EC2 options exposed by the boto library that spark-ec2 uses. 
> Over time, many of these EC2 options have been bubbled up here and there to become spark-ec2 options.
> Examples:
> * spot prices
> * placement groups
> * VPC, subnet, and security group assignments
> It's likely that more and more EC2 options will trickle up like this to become spark-ec2 options.
> While major options are well suited to this type of promotion, we should probably allow users to pass through EC2 options they want to use through spark-ec2 in some generic way.
> Let's add two options:
> * {{--ec2-instance-option}} -> [{{boto::run}}|http://boto.readthedocs.org/en/latest/ref/ec2.html#boto.ec2.image.Image.run]
> * {{--ec2-spot-instance-option}} -> [{{boto::request_spot_instances}}|http://boto.readthedocs.org/en/latest/ref/ec2.html#boto.ec2.connection.EC2Connection.request_spot_instances]
> Each option can be specified multiple times and is simply passed directly to the underlying boto call.
> For example:
> {code}
> spark-ec2 \
>     ...
>     --ec2-instance-option "instance_initiated_shutdown_behavior=terminate" \
>     --ec2-instance-option "ebs_optimized=True"
> {code}
> I'm not sure about the exact syntax of the extended options, but something like this will do the trick as long as it can be made to pass the options correctly to boto in most cases.
> I followed the example of {{ssh}}, which supports multiple extended options similarly.
> {code}
> ssh -o LogLevel=ERROR -o UserKnownHostsFile=/dev/null ...
> {code}
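
As a rough sketch of the repeatable-option idea in the description above, the following collects every {{--ec2-instance-option}} occurrence into a dict that could then be handed to the boto call as keyword arguments. It assumes {{argparse}} purely for illustration, the function name {{collect_instance_options}} is hypothetical, and values are left as raw strings with no type casting.

```python
import argparse


def collect_instance_options(argv):
    """Gather repeated --ec2-instance-option KEY=VALUE flags into a dict.

    Hypothetical helper for illustration; not the spark-ec2 implementation.
    """
    parser = argparse.ArgumentParser(prog="spark-ec2")
    # action="append" lets the flag be given any number of times, like ssh -o.
    parser.add_argument("--ec2-instance-option", action="append",
                        default=[], metavar="KEY=VALUE")
    args = parser.parse_args(argv)

    options = {}
    for raw in args.ec2_instance_option:
        key, sep, value = raw.partition("=")
        if not sep or not key:
            parser.error("expected KEY=VALUE, got %r" % raw)
        options[key] = value  # still a string; casting is a separate problem
    return options
```

With the command line from the example, this yields {{instance_initiated_shutdown_behavior}} mapped to {{"terminate"}} and {{ebs_optimized}} mapped to the string {{"True"}}, which shows why some form of casting would still be needed before passing the values through.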



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

