spark-user mailing list archives

From Michael Shtelma <mshte...@gmail.com>
Subject Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder
Date Mon, 26 Mar 2018 19:28:49 GMT
Hi Keith,

Thanks for the suggestion!
I have already solved this.
The problem was that the YARN process was not responding to
start/stop commands and so had not applied my configuration changes.
I killed it and restarted my cluster, and after that YARN
started using the yarn.nodemanager.local-dirs parameter defined in
yarn-site.xml.
After this change, -Djava.io.tmpdir for the Spark executor was set
correctly, according to the yarn.nodemanager.local-dirs parameter.
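
For anyone hitting the same issue, here is a rough Python sketch of how one might check that an executor launch command follows yarn.nodemanager.local-dirs. It is only an illustration: the helper names (read_hadoop_property, effective_tmpdir), the sample yarn-site.xml content, and the /data/yarn/local path are all hypothetical, and it assumes the JVM honours the last -Djava.io.tmpdir on the command line, which is what was observed earlier in this thread.

```python
import shlex
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def effective_tmpdir(java_command):
    """Return the last -Djava.io.tmpdir value on a JVM command line, or None.

    Assumption: when a -D property is repeated, the last definition wins,
    as reported earlier in this thread.
    """
    prefix = "-Djava.io.tmpdir="
    values = [tok[len(prefix):] for tok in shlex.split(java_command)
              if tok.startswith(prefix)]
    return values[-1] if values else None


# Hypothetical yarn-site.xml content; /data/yarn/local is a placeholder path.
SAMPLE_YARN_SITE = """
<configuration>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/data/yarn/local</value>
  </property>
</configuration>
"""

local_dirs = read_hadoop_property(SAMPLE_YARN_SITE, "yarn.nodemanager.local-dirs")

# Hypothetical executor command with two tmpdir flags, shaped like the one
# quoted below: the user-supplied flag first, the YARN-generated flag second.
cmd = ("/usr/java/latest/bin/java -server -Xmx51200m "
       "'-Djava.io.tmpdir=my_prefered_path' "
       "-Djava.io.tmpdir=/data/yarn/local/usercache/msh/appcache/app/container/tmp")
tmpdir = effective_tmpdir(cmd)

print("local dirs:", local_dirs)
print("effective tmpdir:", tmpdir)
print("under local dirs:", tmpdir.startswith(local_dirs))
```

If the last line reports False after a configuration change, the NodeManager may still be running with the old settings, as it was in my case before the restart.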

Best,
Michael


On Mon, Mar 26, 2018 at 9:15 PM, Keith Chapman <keithgchapman@gmail.com> wrote:
> Hi Michael,
>
> Sorry for the late reply. I guess you may have to set it through the Hadoop
> core-site.xml file. The property you need to set is "hadoop.tmp.dir", which
> defaults to "/tmp/hadoop-${user.name}".
>
> Regards,
> Keith.
>
> http://keith-chapman.com
>
> On Mon, Mar 19, 2018 at 1:05 PM, Michael Shtelma <mshtelma@gmail.com> wrote:
>>
>> Hi Keith,
>>
>> Thank you for the idea!
>> I have tried it, and the executor command now looks like this:
>>
>> /bin/bash -c /usr/java/latest//bin/java -server -Xmx51200m
>> '-Djava.io.tmpdir=my_prefered_path'
>>
>> -Djava.io.tmpdir=/tmp/hadoop-msh/nm-local-dir/usercache/msh/appcache/application_1521110306769_0041/container_1521110306769_0041_01_000004/tmp
>>
>> The JVM is using the second -Djava.io.tmpdir parameter and writing
>> everything to the same directory as before.
>>
>> Best,
>> Michael
>> Sincerely,
>> Michael Shtelma
>>
>>
>> On Mon, Mar 19, 2018 at 6:38 PM, Keith Chapman <keithgchapman@gmail.com>
>> wrote:
>> > Can you try setting spark.executor.extraJavaOptions to include
>> > -Djava.io.tmpdir=someValue?
>> >
>> > Regards,
>> > Keith.
>> >
>> > http://keith-chapman.com
>> >
>> > On Mon, Mar 19, 2018 at 10:29 AM, Michael Shtelma <mshtelma@gmail.com>
>> > wrote:
>> >>
>> >> Hi Keith,
>> >>
>> >> Thank you for your answer!
>> >> I have done this, and it works for the Spark driver.
>> >> I would like to do something similar for the executors as well, so
>> >> that the setting is used on all the nodes where I have executors
>> >> running.
>> >>
>> >> Best,
>> >> Michael
>> >>
>> >>
>> >> On Mon, Mar 19, 2018 at 6:07 PM, Keith Chapman
>> >> <keithgchapman@gmail.com>
>> >> wrote:
>> >> > Hi Michael,
>> >> >
>> >> > You could set either spark.local.dir through the Spark conf or the
>> >> > java.io.tmpdir system property.
>> >> >
>> >> > Regards,
>> >> > Keith.
>> >> >
>> >> > http://keith-chapman.com
>> >> >
>> >> > On Mon, Mar 19, 2018 at 9:59 AM, Michael Shtelma <mshtelma@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Hi everybody,
>> >> >>
>> >> >> I am running a Spark job on YARN, and my problem is that the
>> >> >> blockmgr-* folders are being created under
>> >> >> /tmp/hadoop-msh/nm-local-dir/usercache/msh/appcache/application_id/*
>> >> >> This folder can grow to a significant size and does not
>> >> >> really fit into the /tmp file system for one job, which is a real
>> >> >> problem for my installation.
>> >> >> I have redefined hadoop.tmp.dir in core-site.xml and
>> >> >> yarn.nodemanager.local-dirs in yarn-site.xml to point to another
>> >> >> location and expected that the block manager would create the files
>> >> >> there and not under /tmp, but this is not the case. The files are
>> >> >> created under /tmp.
>> >> >>
>> >> >> I am wondering if there is a way to make Spark not use /tmp at all
>> >> >> and configure it to create all the files somewhere else?
>> >> >>
>> >> >> Any assistance would be greatly appreciated!
>> >> >>
>> >> >> Best,
>> >> >> Michael
>> >> >>
>> >> >>
>> >> >> ---------------------------------------------------------------------
>> >> >> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>> >> >>
>> >> >
>> >
>> >
>
>
