spark-user mailing list archives

From Yifan LI <iamyifa...@gmail.com>
Subject Re: No space left on device??
Date Wed, 06 May 2015 13:21:17 GMT
Yes, you are right. For now, the workload/executors are distributed evenly… so, as you said,
it is difficult to improve the situation.

However, do you have any idea how to produce a *skewed* data/executor distribution?
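
(Editor's note, not from the original thread: below is a minimal sketch of one way to skew data
across partitions, using a custom Partitioner that deliberately routes most keys to a few "hot"
partitions. The class name, the 80/20 split, and the partition counts are hypothetical.)

    import org.apache.spark.Partitioner

    // Illustrative only: route roughly 80% of keys to a small set of "hot"
    // partitions, so the executors holding those partitions receive more data.
    class SkewedPartitioner(numParts: Int, hotParts: Int) extends Partitioner {
      require(hotParts > 0 && hotParts < numParts)

      override def numPartitions: Int = numParts

      override def getPartition(key: Any): Int = {
        val h = key.hashCode & Int.MaxValue             // non-negative hash
        if (h % 10 < 8) h % hotParts                    // ~80% of keys
        else hotParts + (h % (numParts - hotParts))     // remaining ~20%
      }
    }

    // Hypothetical usage on a pair RDD:
    // val skewed = pairRdd.partitionBy(new SkewedPartitioner(numParts = 16, hotParts = 4))

Note that this only skews data across partitions; which executors end up hosting those
partitions is still up to the scheduler, so it does not by itself keep data away from one
particular node.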



Best,
Yifan LI





> On 06 May 2015, at 15:13, Saisai Shao <sai.sai.shao@gmail.com> wrote:
> 
> I think it depends on your workload and executor distribution. If your workload is evenly
distributed without any big data skew, and the executors are evenly distributed across the
nodes, the storage usage of each node is nearly the same. Spark itself cannot rebalance the
storage overhead, as you mentioned.
> 
> 2015-05-06 21:09 GMT+08:00 Yifan LI <iamyifanli@gmail.com <mailto:iamyifanli@gmail.com>>:
> Thanks, Shao. :-)
> 
> I am wondering whether Spark will rebalance the storage overhead at runtime… since there
is still some available space on other nodes.
> 
> 
> Best,
> Yifan LI
> 
> 
> 
> 
> 
>> On 06 May 2015, at 14:57, Saisai Shao <sai.sai.shao@gmail.com <mailto:sai.sai.shao@gmail.com>>
wrote:
>> 
>> I think you could configure multiple disks through spark.local.dir; the default is /tmp.
However, if your intermediate data is larger than the available disk space, you will still
hit this issue.
>> 
>> spark.local.dir (default: /tmp): Directory to use for "scratch" space in Spark, including
map output files and RDDs that get stored on disk. This should be on a fast, local disk in
your system. It can also be a comma-separated list of multiple directories on different disks.
NOTE: In Spark 1.0 and later this will be overridden by SPARK_LOCAL_DIRS (Standalone, Mesos)
or LOCAL_DIRS (YARN) environment variables set by the cluster manager.
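
(Editor's note: to make that setting concrete, here is a minimal sketch of pointing
spark.local.dir at several disks when building the SparkConf. The application name and paths
are hypothetical, and as the note above says, SPARK_LOCAL_DIRS / LOCAL_DIRS set by the cluster
manager take precedence in Spark 1.0 and later.)

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical paths: spread Spark's scratch space (shuffle files, spilled
    // RDD blocks) across several disks instead of filling a single /tmp volume.
    val conf = new SparkConf()
      .setAppName("graphx-app")
      .set("spark.local.dir", "/mnt/disk1/spark-tmp,/mnt/disk2/spark-tmp")

    val sc = new SparkContext(conf)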
>> 
>> 2015-05-06 20:35 GMT+08:00 Yifan LI <iamyifanli@gmail.com <mailto:iamyifanli@gmail.com>>:
>> Hi,
>> 
>> I am running my GraphX application on Spark, but it failed because one executor node
(on which the available HDFS space is small) reported an error: “no space left on device”.
>> 
>> I can understand why it happened: my vertex(-attribute) RDD was growing bigger and bigger
during the computation…, so at some point the space requested on that node probably exceeded
the available space.
>> 
>> But is there any way to avoid this kind of error? I am sure that the overall disk space
across all the nodes is enough for my application.
>> 
>> Thanks in advance!
>> 
>> 
>> 
>> Best,
>> Yifan LI
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 

