[ https://issues.apache.org/jira/browse/SPARK-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289190#comment-14289190 ]
Sean Owen commented on SPARK-1526:
----------------------------------
Closing this may be a little bold, but there has been no activity, I do not see an actionable
change here, and I think there is a fine workaround for this case. Yes, it is a pretty fundamental
property of Spark that the driver communicates a lot with the executors, and I can't see that
changing. You can of course run the driver remotely; it's a matter of network configuration, and
of having enough bandwidth to support however much communication your driver and executors
need, which is not necessarily much. Finally, you can of course access resources like databases
from your executors too. In fact, that is probably more sensible than loading data to the driver
and then copying it again to the executors.
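
A minimal sketch of that executor-side approach, assuming a JDBC-accessible database; the URL,
credentials, and query are placeholders, not anything taken from this issue. Each partition opens
one connection, does its lookups, and closes it, so the data never has to pass through the driver:

{code:scala}
import java.sql.DriverManager

import org.apache.spark.{SparkConf, SparkContext}

object ExecutorSideLookup {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("executor-db-lookup"))
    val ids = sc.parallelize(1 to 1000, numSlices = 8)

    val names = ids.mapPartitions { iter =>
      // One connection per partition, not per record; placeholder URL and credentials.
      val conn = DriverManager.getConnection("jdbc:postgresql://db-host:5432/meta", "user", "secret")
      val stmt = conn.prepareStatement("SELECT name FROM files WHERE id = ?")
      // Materialize the results before closing the connection, since iterators are lazy.
      val result = iter.map { id =>
        stmt.setInt(1, id)
        val rs = stmt.executeQuery()
        rs.next()
        rs.getString(1)
      }.toList
      stmt.close()
      conn.close()
      result.iterator
    }

    names.saveAsTextFile("hdfs:///tmp/file-names")
    sc.stop()
  }
}
{code}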
> Running spark driver program from my local machine
> --------------------------------------------------
>
> Key: SPARK-1526
> URL: https://issues.apache.org/jira/browse/SPARK-1526
> Project: Spark
> Issue Type: Wish
> Components: Spark Core
> Reporter: Idan Zalzberg
>
> Currently it seems that the design choice is that the driver program should be close,
network-wise, to the workers, and allow connections to be created from either side.
> This makes using Spark somewhat harder, since when I develop locally I need to package
not only my program but also all of its local dependencies.
> Let's say I have a local DB with the names of files in Hadoop that I want to process with
Spark; now I need my local DB to be accessible from the cluster so it can fetch the file names
at runtime.
> The driver program is an awesome thing, but it loses some of its strength if you can't
really run it anywhere.
> It seems to me that the problem is the DAGScheduler, which needs to be close to the
workers; maybe it shouldn't be embedded in the driver, then?
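
For the reporter's scenario, a minimal sketch of the workaround implied in the comment above:
query the local DB on the driver, where it is reachable, and ship only the resulting path list to
the cluster. The JDBC URL, credentials, and table name are placeholders; textFile accepts a
comma-separated list of paths, so the executors read from Hadoop directly and never need to
reach the local DB:

{code:scala}
import java.sql.DriverManager

import scala.collection.mutable.ListBuffer

import org.apache.spark.{SparkConf, SparkContext}

object DriverSideFileList {
  def main(args: Array[String]): Unit = {
    // Runs on the driver, where the local DB is reachable; placeholder URL and credentials.
    val conn = DriverManager.getConnection("jdbc:h2:./local-db", "user", "secret")
    val rs = conn.createStatement().executeQuery("SELECT path FROM files_to_process")
    val paths = ListBuffer.empty[String]
    while (rs.next()) paths += rs.getString(1)
    conn.close()

    val sc = new SparkContext(new SparkConf().setAppName("driver-side-file-list"))
    // Only the short list of path strings crosses the network to the cluster.
    val lines = sc.textFile(paths.mkString(","))
    println(s"Total lines across listed files: ${lines.count()}")
    sc.stop()
  }
}
{code}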