spark-user mailing list archives

From Keith Chapman <keithgchap...@gmail.com>
Subject Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder
Date Mon, 26 Mar 2018 19:15:54 GMT
Hi Michael,

Sorry for the late reply. I guess you may have to set it through Hadoop's
core-site.xml file. The property you need to set is "hadoop.tmp.dir", which
defaults to "/tmp/hadoop-${user.name}".

Regards,
Keith.

http://keith-chapman.com
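
For reference, the property named above is written as a <property> entry inside core-site.xml. The following is only a minimal Python sketch that renders such an entry; the helper name hadoop_property and the path "/data/hadoop-tmp" are hypothetical placeholders and do not come from this thread.

```python
import xml.etree.ElementTree as ET


def hadoop_property(name: str, value: str) -> str:
    """Render one <property> entry in the name/value layout used by core-site.xml."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


# "/data/hadoop-tmp" is a hypothetical path; pick a directory on a larger file system.
snippet = hadoop_property("hadoop.tmp.dir", "/data/hadoop-tmp")
print(snippet)
```

The printed element would go inside the <configuration> element of core-site.xml on each node. Note that the original poster reports further down the thread having already redefined this property without the expected effect.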

On Mon, Mar 19, 2018 at 1:05 PM, Michael Shtelma <mshtelma@gmail.com> wrote:

> Hi Keith,
>
> Thank you for the idea!
> I have tried it, and the executor command now looks like this:
>
> /bin/bash -c /usr/java/latest//bin/java -server -Xmx51200m
> '-Djava.io.tmpdir=my_prefered_path'
> -Djava.io.tmpdir=/tmp/hadoop-msh/nm-local-dir/usercache/msh/appcache/application_1521110306769_0041/container_1521110306769_0041_01_000004/tmp
>
> The JVM uses the second -Djava.io.tmpdir parameter and still writes
> everything to the same directory as before.
>
> Best,
> Michael
> Sincerely,
> Michael Shtelma
>
>
> On Mon, Mar 19, 2018 at 6:38 PM, Keith Chapman <keithgchapman@gmail.com>
> wrote:
> > Can you try setting spark.executor.extraJavaOptions to have
> > -Djava.io.tmpdir=someValue
> >
> > Regards,
> > Keith.
> >
> > http://keith-chapman.com
> >
> > On Mon, Mar 19, 2018 at 10:29 AM, Michael Shtelma <mshtelma@gmail.com>
> > wrote:
> >>
> >> Hi Keith,
> >>
> >> Thank you for your answer!
> >> I have done this, and it works for the Spark driver.
> >> I would like to do something similar for the executors as well, so
> >> that the setting is used on all the nodes where executors are
> >> running.
> >>
> >> Best,
> >> Michael
> >>
> >>
> >> On Mon, Mar 19, 2018 at 6:07 PM, Keith Chapman <keithgchapman@gmail.com>
> >> wrote:
> >> > Hi Michael,
> >> >
> >> > You could either set spark.local.dir through spark conf or
> >> > java.io.tmpdir
> >> > system property.
> >> >
> >> > Regards,
> >> > Keith.
> >> >
> >> > http://keith-chapman.com
> >> >
> >> > On Mon, Mar 19, 2018 at 9:59 AM, Michael Shtelma <mshtelma@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hi everybody,
> >> >>
> >> >> I am running a Spark job on YARN, and my problem is that the blockmgr-*
> >> >> folders are being created under
> >> >> /tmp/hadoop-msh/nm-local-dir/usercache/msh/appcache/application_id/*
> >> >> This folder can grow very large and does not fit into the /tmp file
> >> >> system even for one job, which is a real problem for my installation.
> >> >> I have redefined hadoop.tmp.dir in core-site.xml and
> >> >> yarn.nodemanager.local-dirs in yarn-site.xml to point to another
> >> >> location and expected the block manager to create the files
> >> >> there and not under /tmp, but this is not the case. The files are
> >> >> still created under /tmp.
> >> >>
> >> >> I am wondering if there is a way to make spark not use /tmp at all and
> >> >> configure it to create all the files somewhere else?
> >> >>
> >> >> Any assistance would be greatly appreciated!
> >> >>
> >> >> Best,
> >> >> Michael
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
> >> >>
> >> >
> >
> >
>
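
The executor command quoted in this thread carries two -Djava.io.tmpdir flags, and Michael reports that the second one is the one in effect. The sketch below only models that "later flag replaces earlier flag" outcome in Python; it is not the JVM's actual argument parser, the function name is hypothetical, and "/container/tmp" stands in for the long YARN container path.

```python
import shlex


def effective_system_properties(java_command: str) -> dict:
    """Collect -Dkey=value flags from a java command line; later flags override earlier ones."""
    props = {}
    for token in shlex.split(java_command):
        if token.startswith("-D") and "=" in token:
            key, value = token[2:].split("=", 1)
            props[key] = value  # a later duplicate replaces the earlier value
    return props


# Shortened form of the quoted executor command; "/container/tmp" is a placeholder
# for the container-specific tmp directory appended after the user-supplied flag.
command = (
    "/usr/java/latest//bin/java -server -Xmx51200m "
    "'-Djava.io.tmpdir=my_prefered_path' "
    "-Djava.io.tmpdir=/container/tmp"
)
print(effective_system_properties(command)["java.io.tmpdir"])
```

Under this model, a value passed through spark.executor.extraJavaOptions is overridden whenever another -Djava.io.tmpdir flag follows it on the command line, which matches what was observed above.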
