spark-user mailing list archives

From Michael Shtelma <mshte...@gmail.com>
Subject Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder
Date Mon, 19 Mar 2018 20:05:45 GMT
Hi Keith,

Thank you for the idea!
I have tried it, and the executor command now looks like this:

/bin/bash -c /usr/java/latest//bin/java -server -Xmx51200m
'-Djava.io.tmpdir=my_prefered_path'
-Djava.io.tmpdir=/tmp/hadoop-msh/nm-local-dir/usercache/msh/appcache/application_1521110306769_0041/container_1521110306769_0041_01_000004/tmp

The JVM uses the second -Djava.io.tmpdir parameter and writes
everything to the same directory as before.

Best,
Michael
Sincerely,
Michael Shtelma
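
The behaviour reported above (the later of two -Djava.io.tmpdir options
takes effect) can be checked against a captured executor command with a
small script. This is only a minimal sketch in Python and not part of the
original thread: the function name effective_system_properties is
hypothetical, and it assumes, as observed above, that the last occurrence
of a duplicated -D option wins.

```python
import shlex


def effective_system_properties(command: str) -> dict:
    """Collect -Dkey=value options from a JVM command line.

    Assumption (matches the behaviour reported in this message): when the
    same key is given more than once, the last occurrence takes effect.
    Simplification: every -D token is treated as a JVM option; the sketch
    does not stop at the main class name.
    """
    props = {}
    for token in shlex.split(command):  # shlex strips the shell quotes
        if token.startswith("-D"):
            key, _, value = token[2:].partition("=")
            props[key] = value  # a later duplicate overwrites an earlier one
    return props


# The executor command quoted in this message, joined into one string.
command = (
    "/usr/java/latest//bin/java -server -Xmx51200m "
    "'-Djava.io.tmpdir=my_prefered_path' "
    "-Djava.io.tmpdir=/tmp/hadoop-msh/nm-local-dir/usercache/msh/appcache/"
    "application_1521110306769_0041/container_1521110306769_0041_01_000004/tmp"
)

print(effective_system_properties(command)["java.io.tmpdir"])
```

Run against the command above, the script reports the container path
under /tmp rather than my_prefered_path, which is consistent with the
files still being written to the old location.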


On Mon, Mar 19, 2018 at 6:38 PM, Keith Chapman <keithgchapman@gmail.com> wrote:
> Can you try setting spark.executor.extraJavaOptions to have
> -Djava.io.tmpdir=someValue
>
> Regards,
> Keith.
>
> http://keith-chapman.com
>
> On Mon, Mar 19, 2018 at 10:29 AM, Michael Shtelma <mshtelma@gmail.com>
> wrote:
>>
>> Hi Keith,
>>
>> Thank you for your answer!
>> I have done this, and it works for the Spark driver.
>> I would like to do something similar for the executors as well, so
>> that the setting is used on all the nodes where executors are
>> running.
>>
>> Best,
>> Michael
>>
>>
>> On Mon, Mar 19, 2018 at 6:07 PM, Keith Chapman <keithgchapman@gmail.com>
>> wrote:
>> > Hi Michael,
>> >
>> > You could set either spark.local.dir through the Spark conf or the
>> > java.io.tmpdir
>> > system property.
>> >
>> > Regards,
>> > Keith.
>> >
>> > http://keith-chapman.com
>> >
>> > On Mon, Mar 19, 2018 at 9:59 AM, Michael Shtelma <mshtelma@gmail.com>
>> > wrote:
>> >>
>> >> Hi everybody,
>> >>
>> >> I am running a Spark job on YARN, and my problem is that the blockmgr-*
>> >> folders are being created under
>> >> /tmp/hadoop-msh/nm-local-dir/usercache/msh/appcache/application_id/*
>> >> These folders can grow very large and do not
>> >> really fit into the /tmp file system for a single job, which is a real
>> >> problem for my installation.
>> >> I have redefined hadoop.tmp.dir in core-site.xml and
>> >> yarn.nodemanager.local-dirs in yarn-site.xml to point to another
>> >> location and expected the block manager to create the files
>> >> there and not under /tmp, but this is not the case. The files are
>> >> still created under /tmp.
>> >>
>> >> I am wondering if there is a way to make Spark not use /tmp at all and
>> >> configure it to create all the files somewhere else?
>> >>
>> >> Any assistance would be greatly appreciated!
>> >>
>> >> Best,
>> >> Michael
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>> >>
>> >
>
>