spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: Configuring shuffle write directory
Date Fri, 28 Mar 2014 05:05:47 GMT
I see, are you sure that was in spark-env.sh instead of spark-env.sh.template? You need to
copy it to just a .sh file. Also make sure the file is executable.
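
A minimal sketch of that step, assuming the stock layout where the template sits in Spark's conf directory. The helper name ensure_spark_env is hypothetical and not part of Spark; it only copies the template when spark-env.sh is missing, so an existing file is left alone.

```shell
# Hypothetical helper (not part of Spark): make sure spark-env.sh exists
# in the given conf directory and is executable.
ensure_spark_env() {
    conf_dir="$1"
    # Copy the template only if spark-env.sh does not exist yet,
    # so an existing configuration is never overwritten.
    if [ ! -f "$conf_dir/spark-env.sh" ]; then
        cp "$conf_dir/spark-env.sh.template" "$conf_dir/spark-env.sh" || return 1
    fi
    # Set the executable bit on the resulting file.
    chmod +x "$conf_dir/spark-env.sh"
}

# Example call on a worker (the path is an assumption about your install):
# ensure_spark_env "$SPARK_HOME/conf"
```

This would need to be run on each worker, and the workers restarted afterwards so they pick up the file.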

Try doing println(sc.getConf.toDebugString) in your driver program and seeing what properties
it prints. As far as I can tell, spark.local.dir should *not* be set there, so workers should
get it from their spark-env.sh. It’s true that if you set spark.local.dir in the driver
it would pass that on to the workers for that job.

Matei

On Mar 27, 2014, at 9:57 PM, Tsai Li Ming <mailinglist@ltsai.com> wrote:

> Yes, I have tried that by adding it to the Worker. I can see the "app-20140328124540-000"
> in the local spark directory of the worker.
> 
> But the "spark-local" directories are always written to /tmp. Is that because the default
> spark.local.dir is taken from java.io.tmpdir?
> 
> 
> 
> On 28 Mar, 2014, at 12:42 pm, Matei Zaharia <matei.zaharia@gmail.com> wrote:
> 
>> Yes, the problem is that the driver program is overriding it. Have you set it manually
>> in the driver? Or how did you try setting it in workers? You should set it by adding
>> 
>> export SPARK_JAVA_OPTS="-Dspark.local.dir=whatever"
>> 
>> to conf/spark-env.sh on those workers.
>> 
>> Matei
>> 
>> On Mar 27, 2014, at 9:04 PM, Tsai Li Ming <mailinglist@ltsai.com> wrote:
>> 
>>> Can anyone help?
>>> 
>>> How can I configure a different spark.local.dir for each executor?
>>> 
>>> 
>>> On 23 Mar, 2014, at 12:11 am, Tsai Li Ming <mailinglist@ltsai.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Each of my worker nodes has its own unique spark.local.dir.
>>>> 
>>>> However, when I run spark-shell, the shuffle writes are always written to
>>>> /tmp, even though spark.local.dir is set when the worker node is started.
>>>> 
>>>> Specifying spark.local.dir for the driver program seems to override the
>>>> executor's setting. Is there a way to properly define it on the worker node?
>>>> 
>>>> Thanks!
>>> 
>> 
> 

