spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Will the HiveContext cause memory leak ?
Date Wed, 11 May 2016 03:50:14 GMT
Which Spark release are you using?

I assume the executor crashed due to an OutOfMemoryError (OOME).

Did you have a chance to capture jmap output on the executor before it crashed?
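
For reference, a minimal sketch of the two jmap invocations that are usually
useful here: a live-object class histogram and a binary heap dump. The helper
name jmap_commands and the default dump file name are made up for
illustration. The executor PID has to be looked up separately (for example
with jps; on a standalone cluster the executor JVM normally shows up as
CoarseGrainedExecutorBackend).

```python
def jmap_commands(pid, dump_path="executor-heap.hprof"):
    """Build jmap command lines for a running JVM.

    `jmap_commands` and `dump_path` are hypothetical names used only for
    this sketch; `pid` is the process id of the executor JVM.
    """
    pid = str(int(pid))  # reject anything that is not a plain integer PID
    # Class histogram of live objects (forces a full GC first).
    histo = ["jmap", "-histo:live", pid]
    # Binary heap dump of live objects, for offline analysis.
    dump = ["jmap", "-dump:live,format=b,file=" + dump_path, pid]
    return histo, dump


# Print the commands so they can be pasted into a shell on the worker host.
for command in jmap_commands(12345):
    print(" ".join(command))
```

Taking the histogram a few times, some minutes apart, and comparing the
top entries usually shows which classes are growing.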

Have you tried giving more memory to the executor?

Thanks

On Tue, May 10, 2016 at 8:25 PM, kramer2009@126.com <kramer2009@126.com>
wrote:

> I submitted my code to a Spark standalone cluster and found that the memory
> usage of the executor process keeps growing, which eventually causes the
> program to crash.
>
> I modified the code and resubmitted it several times, and found that the
> 4 lines below may be causing the issue:
>
>     dataframe = dataframe.groupBy(['router','interface']).agg(func.sum('bits').alias('bits'))
>     windowSpec = Window.partitionBy(dataframe['router']).orderBy(dataframe['bits'].desc())
>     rank = func.dense_rank().over(windowSpec)
>     ret = dataframe.select(dataframe['router'],dataframe['interface'],dataframe['bits'],
>                            rank.alias('rank')).filter("rank<=2")
>
> It looks a little complicated, but it is just a window function on a
> DataFrame. I use HiveContext because SQLContext does not support window
> functions yet. Without these 4 lines, my code can run all night. Adding
> them causes the memory leak, and the program crashes within a few hours.
>
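
To make clear what the four quoted lines compute, here is a plain-Python
sketch of the same logic with no Spark involved: sum bits per
(router, interface), dense-rank the interfaces within each router by total
bits in descending order, and keep ranks up to 2. The function name
top_interfaces and the sample rows are invented for illustration, and the
row order within ties is an arbitrary choice of this sketch.

```python
from collections import defaultdict


def top_interfaces(rows, limit=2):
    """Plain-Python stand-in for the groupBy + dense_rank + filter above.

    `rows` is an iterable of (router, interface, bits) tuples. The name
    `top_interfaces` is hypothetical and exists only for this sketch.
    """
    # Step 1: groupBy(router, interface) and sum the bits.
    totals = defaultdict(int)
    for router, interface, bits in rows:
        totals[(router, interface)] += bits

    # Step 2: partition the aggregated rows by router.
    by_router = defaultdict(list)
    for (router, interface), bits in totals.items():
        by_router[router].append((interface, bits))

    # Step 3: dense rank by bits descending, keep rows with rank <= limit.
    result = []
    for router in sorted(by_router):
        entries = sorted(by_router[router], key=lambda e: (-e[1], e[0]))
        rank = 0
        previous = None
        for interface, bits in entries:
            if bits != previous:  # ties share a rank, and ranks have no gaps
                rank += 1
                previous = bits
            if rank > limit:
                break
            result.append((router, interface, bits, rank))
    return result


sample = [
    ("r1", "a", 10), ("r1", "a", 5), ("r1", "b", 15),
    ("r1", "c", 7), ("r1", "d", 3), ("r2", "x", 1),
]
for row in top_interfaces(sample):
    print(row)
```

Because dense_rank gives tied totals the same rank, a router can return more
than two rows when interfaces tie, as "a" and "b" do in the sample.
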
> I will provide the whole code (50 lines) here: ForAsk01.py
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n26921/ForAsk01.py>
> Please advise me whether this is a bug.
>
> Here is the submit command as well:
>
>     nohup ./bin/spark-submit  \
>     --master spark://ES01:7077 \
>     --executor-memory 4G \
>     --num-executors 1 \
>     --total-executor-cores 1 \
>     --conf "spark.storage.memoryFraction=0.2"  \
>     ./ForAsk.py 1>a.log 2>b.log &
>
