spark-user mailing list archives

From Gary Malouf <malouf.g...@gmail.com>
Subject Re: Running a Spark Shell locally against EC2
Date Wed, 06 Aug 2014 18:43:33 GMT
This will be awesome - it's been one of the major issues for our analytics
team as they hope to use their own python libraries.


On Wed, Aug 6, 2014 at 2:40 PM, Andrew Or <andrew@databricks.com> wrote:

> Hi Gary,
>
> This has indeed been a limitation of Spark, in that drivers and executors
> use random ephemeral ports to talk to each other. If you are submitting a
> Spark job from your local machine in client mode (meaning, the driver runs
> on your machine), you will need to open up all TCP ports from your worker
> machines, a requirement that is not super secure. However, a very recent
> commit changes this (
> https://github.com/apache/spark/commit/09f7e4587bbdf74207d2629e8c1314f93d865999)
> in that you can now manually configure all ports and only open up the ones
> you configured. This will be available in Spark 1.1.
>
> -Andrew
>
>
> 2014-08-06 8:29 GMT-07:00 Gary Malouf <malouf.gary@gmail.com>:
>
>> We have Spark 1.0.1 on Mesos deployed as a cluster in EC2.  Our DevOps
>> lead tells me that Spark jobs cannot be submitted from local machines due
>> to the complexity of opening the right ports to the world etc.
>>
>> Are other people running the shell locally in a production environment?
>>
>
>
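
Andrew's point about configuring ports manually can be illustrated with a small sketch. It is not taken from the thread: the property names are the port settings as I understand them from the Spark 1.1 configuration documentation, and the port numbers are arbitrary placeholders, so check both against the documentation for your Spark version. The helper functions are hypothetical and only render the settings as spark-defaults.conf lines and list the TCP ports a firewall or EC2 security group would then need to allow.

```python
# Assumed Spark 1.1 port properties; the numbers are arbitrary examples,
# not recommendations. Verify the names against your Spark version's docs.
FIXED_PORTS = {
    "spark.driver.port": 7001,
    "spark.fileserver.port": 7002,
    "spark.broadcast.port": 7003,
    "spark.replClassServer.port": 7004,
    "spark.blockManager.port": 7005,
    "spark.executor.port": 7006,
}


def to_conf_lines(ports):
    """Render the mapping as 'key value' lines for spark-defaults.conf."""
    return ["{} {}".format(key, port) for key, port in sorted(ports.items())]


def ports_to_open(ports):
    """Return the distinct TCP ports that would need a firewall rule."""
    return sorted(set(ports.values()))


if __name__ == "__main__":
    print("\n".join(to_conf_lines(FIXED_PORTS)))
    print("TCP ports to open:", ports_to_open(FIXED_PORTS))
```

With fixed values like these, only the listed ports need to be opened between the local driver machine and the cluster, rather than the whole ephemeral range.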
