spark-issues mailing list archives

From "Alexander Ulanov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-5386) Reduce fails with vectors of big length
Date Fri, 23 Jan 2015 18:29:34 GMT

    [ https://issues.apache.org/jira/browse/SPARK-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289677#comment-14289677 ]

Alexander Ulanov commented on SPARK-5386:
-----------------------------------------

My spark-env.sh contains:
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=8g
export SPARK_WORKER_INSTANCES=2
I run spark-shell with ./spark-shell --executor-memory 8G --driver-memory 8G. In the Spark UI each worker shows 8GB of memory.
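
As a rough check of that configuration (a back-of-envelope sketch for the shell; the figures are taken from the issue environment below, not measured):

val machineRamGb      = 16L  // per-machine RAM, from the issue environment
val workersPerMachine = 2L   // SPARK_WORKER_INSTANCES=2
val workerMemoryGb    = 8L   // SPARK_WORKER_MEMORY=8g
val heapGb = workersPerMachine * workerMemoryGb
println(s"Executor heaps alone: $heapGb GB of $machineRamGb GB RAM")
// 2 x 8GB of heap fills a 16GB machine completely, leaving no headroom for
// the OS, JVM overhead, or native allocations, which would be consistent
// with the os::commit_memory failure of ~2.8GB in the crash log below.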

By the way, I ran this code once again, and this time it does not crash but keeps trying to schedule the job on the failing node, which tries to allocate memory, fails, and so on. Is this normal behavior?
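
As far as I know, retrying a failed task up to spark.task.maxFailures times (default 4) before aborting the stage is the expected behavior, and 1.2 has no node blacklisting, so retries can land on the same sick worker. A minimal sketch for failing fast in a standalone app (assumption on my side that this standard property governs these retries; in spark-shell the context already exists, so this applies outside the shell):

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: lower the task retry limit so the stage aborts quickly instead
// of rescheduling onto a node that cannot allocate memory (default is 4).
val conf = new SparkConf()
  .setAppName("reduce-big-vectors")  // hypothetical app name
  .set("spark.task.maxFailures", "2")
val sc = new SparkContext(conf)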

> Reduce fails with vectors of big length
> ---------------------------------------
>
>                 Key: SPARK-5386
>                 URL: https://issues.apache.org/jira/browse/SPARK-5386
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.0
>         Environment: Overall:
> 6 machine cluster (Xeon 3.3GHz 4 cores, 16GB RAM, Ubuntu), each runs 2 Workers
> Spark:
> ./spark-shell --executor-memory 8G --driver-memory 8G
> spark.driver.maxResultSize 0
> "java.io.tmpdir" and "spark.local.dir" set to a disk with a lot of free space
>            Reporter: Alexander Ulanov
>             Fix For: 1.3.0
>
>
> Code:
> import org.apache.spark.mllib.rdd.RDDFunctions._
> import breeze.linalg._
> import org.apache.log4j._
> Logger.getRootLogger.setLevel(Level.OFF)
> val n = 60000000
> val p = 12
> val vv = sc.parallelize(0 until p, p).map(i => DenseVector.rand[Double](n))
> vv.reduce(_ + _)
> When executed in the shell, it crashes after some period of time. One of the nodes contains the following in stdout:
> Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000755500000, 2863661056, 0) failed; error='Cannot allocate memory' (errno=12)
> #
> # There is insufficient memory for the Java Runtime Environment to continue.
> # Native memory allocation (malloc) failed to allocate 2863661056 bytes for committing reserved memory.
> # An error report file with more information is saved as:
> # /datac/spark/app-20150123091936-0000/89/hs_err_pid2247.log
> During execution there is a message: Job aborted due to stage failure: Exception while getting task result: java.io.IOException: Connection from server-12.net/10.10.10.10:54701 closed
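
For the crash itself: each DenseVector[Double] of length 60000000 is roughly 60000000 * 8 bytes ≈ 480MB, and reduce(_ + _) ships every partition's partial result back to the driver. A minimal sketch of one mitigation, using the treeReduce that the RDDFunctions import in the snippet should already provide in 1.2 (a sketch, not a confirmed fix; the per-vector footprint is unchanged):

import org.apache.spark.mllib.rdd.RDDFunctions._
import breeze.linalg._

val n = 60000000
val p = 12
val vv = sc.parallelize(0 until p, p).map(i => DenseVector.rand[Double](n))

// treeReduce merges partial sums on the executors in log2(p) rounds, so
// the driver receives one ~480MB vector instead of one per partition.
val sum = vv.treeReduce(_ + _, depth = 2)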



