spark-issues mailing list archives

From "Patrick Wendell (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-3561) Allow for pluggable execution contexts in Spark
Date Mon, 06 Oct 2014 03:56:34 GMT

    [ https://issues.apache.org/jira/browse/SPARK-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159840#comment-14159840 ]

Patrick Wendell edited comment on SPARK-3561 at 10/6/14 3:55 AM:
-----------------------------------------------------------------

I also changed the title here so that it reflects the current design doc and pull request. We have
a culture in the project of having JIRA titles accurately reflect the current proposal. We
can change it again if a new doc causes the scope of this to change.

[~ozhurakousky] I'd prefer not to change the title back until there is a new design proposed.
The problem is that people are confusing this with SPARK-3174 and SPARK-3797.


was (Author: pwendell):
I also changed the title here that reflects the current design doc and JIRA. We have a culture
in the project of having JIRA titles reflect accurately the current proposal. We can change
it again if a new doc causes the scope of this to change.

[~ozhurakousky] I'd prefer not to change the title back until there is a new design proposed.
The problem is that people are confusing this with SPARK-3174 and SPARK-3797.

> Allow for pluggable execution contexts in Spark
> -----------------------------------------------
>
>                 Key: SPARK-3561
>                 URL: https://issues.apache.org/jira/browse/SPARK-3561
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 1.1.0
>            Reporter: Oleg Zhurakousky
>              Labels: features
>             Fix For: 1.2.0
>
>         Attachments: SPARK-3561.pdf
>
>
> Currently Spark provides integration with external resource-managers such as Apache Hadoop
YARN, Mesos etc. Specifically in the context of YARN, the current architecture of Spark-on-YARN
can be enhanced to provide significantly better utilization of cluster resources for large
scale, batch and/or ETL applications when run alongside other applications (Spark and others)
and services in YARN. 
> Proposal: 
> The proposed approach would introduce a pluggable JobExecutionContext (trait) - a gateway
and a delegate to the Hadoop execution environment - as a non-public API (@DeveloperApi) not exposed
to end users of Spark. 
> The trait will define only 4 operations: 
> * hadoopFile 
> * newAPIHadoopFile 
> * broadcast 
> * runJob 
> Each method maps directly to the corresponding method in the current version of SparkContext.
The JobExecutionContext implementation will be accessed by SparkContext via the master URL, as "execution-context:foo.bar.MyJobExecutionContext",
with the default implementation containing the existing code from SparkContext, thus allowing
the current (corresponding) methods of SparkContext to delegate to that implementation. An integrator
will now have the option to provide a custom implementation of JobExecutionContext by either
implementing it from scratch or extending from DefaultExecutionContext. 
> Please see the attached design doc for more details. 
> A pull request will be posted shortly as well.
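
The delegation scheme in the description above can be sketched roughly as follows. This is a minimal, Spark-free illustration, not the code from the design doc or the pull request: the four method names and the "execution-context:" master URL prefix come from the description, while the simplified signatures, ReversingExecutionContext, SketchContext, contextClassName and resolve are hypothetical names introduced only for this example. The reflective instantiation is likewise an assumption about how the class name in the URL might be resolved.

```scala
object ExecutionContextSketch {

  // Hypothetical, simplified stand-in for the proposed trait. The real
  // methods would mirror SparkContext's signatures; here each one works
  // on plain Scala collections so the sketch runs without Spark.
  trait JobExecutionContext {
    def hadoopFile(path: String): Seq[String]
    def newAPIHadoopFile(path: String): Seq[String]
    def broadcast[T](value: T): T
    def runJob[T, U](data: Seq[T], func: T => U): Seq[U]
  }

  // Stand-in for the default implementation that would hold the
  // existing SparkContext code.
  class DefaultExecutionContext extends JobExecutionContext {
    def hadoopFile(path: String): Seq[String] = Seq(s"old-api:$path")
    def newAPIHadoopFile(path: String): Seq[String] = Seq(s"new-api:$path")
    def broadcast[T](value: T): T = value
    def runJob[T, U](data: Seq[T], func: T => U): Seq[U] = data.map(func)
  }

  // A hypothetical integrator-provided context: it extends the default and
  // overrides only runJob, which here processes the data in reverse order.
  class ReversingExecutionContext extends DefaultExecutionContext {
    override def runJob[T, U](data: Seq[T], func: T => U): Seq[U] =
      data.reverse.map(func)
  }

  private val Prefix = "execution-context:"

  // Extracts the class name from a master URL of the form
  // "execution-context:<fully.qualified.ClassName>", if the prefix is present.
  def contextClassName(master: String): Option[String] =
    if (master.startsWith(Prefix)) Some(master.stripPrefix(Prefix)) else None

  // Instantiates the named context reflectively (assumes a no-argument
  // constructor), or falls back to the default implementation.
  def resolve(master: String): JobExecutionContext =
    contextClassName(master) match {
      case Some(name) =>
        Class.forName(name).getDeclaredConstructor().newInstance()
          .asInstanceOf[JobExecutionContext]
      case None => new DefaultExecutionContext
    }

  // Stand-in for SparkContext: its runJob simply delegates to the
  // execution context selected by the master URL.
  class SketchContext(master: String) {
    private val ctx = resolve(master)
    def runJob[T, U](data: Seq[T], func: T => U): Seq[U] = ctx.runJob(data, func)
  }
}
```

Under these assumptions, constructing SketchContext with an ordinary master URL such as "local" uses the default context, while a URL of the form "execution-context:" followed by a class name routes the same runJob call to the named implementation.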



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

