spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Selvam Raman <sel...@gmail.com>
Subject Re: Spark Mlib - java.lang.OutOfMemoryError: Java heap space
Date Mon, 24 Apr 2017 10:31:04 GMT
This is where job going out of memory

17/04/24 10:09:22 INFO TaskSetManager: Finished task 122.0 in stage 1.0
(TID 356) in 4260 ms on ip-...-45.dev (124/234)
17/04/24 10:09:26 INFO BlockManagerInfo: Removed taskresult_361 on
ip-10...-185.dev:36974 in memory (size: 5.2 MB, free: 8.5 GB)
17/04/24 10:09:26 INFO BlockManagerInfo: Removed taskresult_362 on
ip-...-45.dev:40963 in memory (size: 5.2 MB, free: 8.9 GB)
17/04/24 10:09:26 INFO TaskSetManager: Finished task 125.0 in stage 1.0
(TID 359) in 4383 ms on ip-...-45.dev (125/234)
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
#   Executing /bin/sh -c "kill -9 15090"...
Killed

Node-45.dev contains 8.9GB free while it throws out of memory. Can anyone
please help me to understand the issue?

On Mon, Apr 24, 2017 at 11:22 AM, Selvam Raman <selmna@gmail.com> wrote:

> Hi,
>
> I have 1 master and 4 slave node. Input data size is 14GB.
> Slave Node config : 32GB Ram,16 core
>
>
> I am trying to train word embedding model using spark. It is going out of
> memory. To train 14GB of data how much memory do i require?.
>
>
> I have givem 20gb per executor but below shows it is using 11.8GB out of
> 20 GB.
> BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-.-.-.dev:35035
> (size: 4.6 KB, free: 11.8 GB)
>
>
> This is the code
> if __name__ == "__main__":
>     sc = SparkContext(appName="Word2VecExample")  # SparkContext
>
>     # $example on$
>     inp = sc.textFile("s3://word2vec/data/word2vec_word_data.txt/").map(lambda
> row: row.split(" "))
>
>     word2vec = Word2Vec()
>     model = word2vec.fit(inp)
>
>     model.save(sc, "s3://pysparkml/word2vecresult2/")
>     sc.stop()
>
>
> Spark-submit Command:
> spark-submit --master yarn --conf 'spark.executor.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=/mnt/tmp -XX:+UseG1GC -XX:+UseG1GC -XX:+PrintFlagsFinal
> -XX:+PrintReferenceGC -verbose:gc -XX:+PrintGCDetails
> -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy
> -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark' --num-executors 4
> --executor-cores 2 --executor-memory 20g Word2VecExample.py
>
>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>



-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"

Mime
View raw message