spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: properties file on a spark cluster
Date Sun, 02 Nov 2014 16:26:44 GMT
The problem is that when you run a Spark program in cluster mode, it looks
for the file on the worker machines, not on the machine you submitted the
job from. The best approach is to put the file in HDFS and read it from
there instead of from a local path. Another approach is to create the same
file at the same path on every worker machine, so that each worker should
pick it up from there.
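
As a rough sketch of the HDFS approach (not tested on your cluster): since
your programs expect a local path, one option is to let Spark copy the file
from HDFS to each node with SparkContext.addFile and resolve the node-local
copy with SparkFiles.get. The helper names and the simple key=value parsing
below are my own assumptions for illustration, not part of your programs.

```python
import os


def read_properties(path):
    """Parse a simple key=value file, skipping blank lines and # or ! comments."""
    props = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line[0] in "#!":
                continue
            key, sep, value = line.partition("=")
            if sep:
                props[key.strip()] = value.strip()
    return props


def distribute(sc, uri):
    """Driver side: ask Spark to ship the file to every node used by this job.

    `sc` is an existing SparkContext; `uri` may be an HDFS URI or a local path.
    Returns the bare file name that workers use to look the file up.
    """
    sc.addFile(uri)
    return os.path.basename(uri)


def properties_on_worker(file_name):
    """Worker side: find the node-local copy of the shipped file and parse it."""
    from pyspark import SparkFiles  # imported lazily; requires PySpark

    return read_properties(SparkFiles.get(file_name))
```

On the driver you would call something like
name = distribute(sc, "hdfs:///path/to/app.properties") (a hypothetical
path), and inside a map function call properties_on_worker(name), or pass
SparkFiles.get(name) to the existing programs as their properties path.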

Thanks
Best Regards

On Fri, Oct 31, 2014 at 10:32 PM, Daniel Takabayashi <
takabayashi@scanboo.com.br> wrote:

> Hi Guys,
>
> I'm trying to run a Spark job written in Python on a YARN cluster
> (managed by Cloudera Manager). The Python script uses a set of Python
> programs installed on each member of the cluster. These programs need a
> properties file, which they locate through a local filesystem path.
>
> My problem is: when this script is submitted with spark-submit, the
> programs can't find this properties file. When it runs locally as a
> stand-alone job there is no problem; the properties file is found.
>
> My questions are:
>
> 1 - What is the problem here?
> 2 - In this scenario (a script running on a Spark YARN cluster that uses
> Python programs that share the same properties file), what is the best
> approach?
>
>
> Thanks
> taka
>
