spark-user mailing list archives

From 李明伟 <kramer2...@126.com>
Subject Re:Re: Will the HiveContext cause memory leak ?
Date Wed, 11 May 2016 04:56:34 GMT
Hi Ted,

Spark version: spark-1.6.0-bin-hadoop2.6
I tried increasing the executor memory, but I still have the same problem.
I can capture some output with jmap, but it is too difficult for me to understand.
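
One way to make jmap output easier to read is to take two class histograms a few minutes apart (for example with `jmap -histo:live <executor pid>`) and compare them, so that only the classes whose footprint grew stand out. The following is a minimal sketch of that comparison, not a definitive tool: the function names `parse_histo` and `growth` are made up for illustration, the two snapshots are invented sample data, and `com.example.Session` is a hypothetical class name. It assumes the Java 7/8 histogram layout of rank, instance count, byte count, class name.

```python
import re

# Matches data rows of a `jmap -histo` listing, e.g. "   1:   46608   1111232  [C".
# Header, separator, and "Total" lines do not match and are skipped.
_ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S.*?)\s*$")


def parse_histo(text):
    """Return {class_name: (instances, bytes)} for one histogram listing."""
    stats = {}
    for line in text.splitlines():
        m = _ROW.match(line)
        if m:
            stats[m.group(3)] = (int(m.group(1)), int(m.group(2)))
    return stats


def growth(before, after, top=5):
    """Return the classes whose byte count grew most between two snapshots."""
    deltas = []
    for name, (_, bytes_after) in after.items():
        bytes_before = before.get(name, (0, 0))[1]
        if bytes_after > bytes_before:
            deltas.append((name, bytes_after - bytes_before))
    # Largest growth first; class name breaks ties so the order is stable.
    deltas.sort(key=lambda d: (-d[1], d[0]))
    return deltas[:top]


# Invented sample data, only to show the expected input shape.
SNAPSHOT_1 = """
 num     #instances         #bytes  class name
----------------------------------------------
   1:          1000          80000  [C
   2:           500          12000  java.lang.String
   3:            10            400  com.example.Session
Total          1510          92400
"""

SNAPSHOT_2 = """
 num     #instances         #bytes  class name
----------------------------------------------
   1:          5000         200000  com.example.Session
   2:          1100          88000  [C
   3:           500          12000  java.lang.String
Total          6600         300000
"""

if __name__ == "__main__":
    for name, delta in growth(parse_histo(SNAPSHOT_1), parse_histo(SNAPSHOT_2)):
        print(name, delta)
```

A class that keeps climbing across several snapshots is a reasonable first suspect, although steady growth alone does not prove a leak.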

At 2016-05-11 11:50:14, "Ted Yu" <yuzhihong@gmail.com> wrote:

Which Spark release are you using?

I assume the executor crashed due to an OOME.

Did you have a chance to capture jmap output on the executor before it crashed?

Have you tried giving more memory to the executor?

Thanks


On Tue, May 10, 2016 at 8:25 PM, kramer2009@126.com <kramer2009@126.com> wrote:
I submitted my code to a Spark standalone cluster and found that the memory
usage of the executor process keeps growing, which eventually causes the
program to crash.

I modified the code and resubmitted it several times, and found that the
4 lines below may be causing the issue:

    dataframe = dataframe.groupBy(['router', 'interface']).agg(func.sum('bits').alias('bits'))
    windowSpec = Window.partitionBy(dataframe['router']).orderBy(dataframe['bits'].desc())
    rank = func.dense_rank().over(windowSpec)
    ret = dataframe.select(dataframe['router'], dataframe['interface'], dataframe['bits'], rank.alias('rank')).filter("rank<=2")

It looks a little complicated, but it is just a window function on a
DataFrame. I use HiveContext because SQLContext does not support window
functions yet. Without these 4 lines, my code can run all night. Adding them
causes the memory leak, and the program crashes within a few hours.
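
For readers following the thread, here is a plain-Python sketch of what the four lines quoted above compute: sum `bits` per (router, interface), rank the interfaces within each router by total bits in descending order using dense ranking (ties share a rank and no ranks are skipped), and keep ranks 1 and 2. It is only an illustration of the semantics under those assumptions. It does not use Spark, the function name `top_interfaces` and the sample rows are invented, and Spark itself does not guarantee the output row order shown here.

```python
from collections import defaultdict


def top_interfaces(rows, keep_rank=2):
    """rows: iterable of (router, interface, bits) tuples."""
    # Step 1: equivalent of groupBy(['router', 'interface']).agg(sum('bits')).
    totals = defaultdict(int)
    for router, interface, bits in rows:
        totals[(router, interface)] += bits

    # Step 2: equivalent of Window.partitionBy('router').
    by_router = defaultdict(list)
    for (router, interface), bits in totals.items():
        by_router[router].append((interface, bits))

    # Step 3: dense_rank() ordered by bits descending, then filter rank <= keep_rank.
    result = []
    for router in sorted(by_router):
        rank, prev_bits = 0, None
        ordered = sorted(by_router[router], key=lambda x: (-x[1], x[0]))
        for interface, bits in ordered:
            if bits != prev_bits:
                # A new distinct value gets the next rank; equal values share one.
                rank += 1
                prev_bits = bits
            if rank <= keep_rank:
                result.append((router, interface, bits, rank))
    return result


# Invented sample data.
sample = [
    ("r1", "eth0", 100), ("r1", "eth0", 50),
    ("r1", "eth1", 150), ("r1", "eth2", 70), ("r1", "eth3", 10),
    ("r2", "eth0", 5),
]

if __name__ == "__main__":
    for row in top_interfaces(sample):
        print(row)
```

Because dense ranking lets ties share a rank, the filter `rank<=2` can return more than two interfaces per router, as it does for `r1` in the sample.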

I have provided the whole code (50 lines) here: ForAsk01.py
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n26921/ForAsk01.py>
Please advise whether this is a bug.

Also, here is the submit command:

    nohup ./bin/spark-submit  \
    --master spark://ES01:7077 \
    --executor-memory 4G \
    --num-executors 1 \
    --total-executor-cores 1 \
    --conf "spark.storage.memoryFraction=0.2"  \
    ./ForAsk.py 1>a.log 2>b.log &







