spark-user mailing list archives

From Russell Spitzer <russell.spit...@gmail.com>
Subject Re: Possibly a memory leak issue in Spark
Date Wed, 22 Sep 2021 18:06:30 GMT
As Sean said, I believe you want to set the following properties to lower
values than their defaults:

spark.ui.retainedJobs (default: 1000, since 1.2.0)
    How many jobs the Spark UI and status APIs remember before garbage
    collecting. This is a target maximum, and fewer elements may be retained
    in some circumstances.

spark.ui.retainedStages (default: 1000, since 0.9.0)
    How many stages the Spark UI and status APIs remember before garbage
    collecting. This is a target maximum, and fewer elements may be retained
    in some circumstances.

spark.ui.retainedTasks (default: 100000, since 2.0.1)
    How many tasks in one stage the Spark UI and status APIs remember before
    garbage collecting. This is a target maximum, and fewer elements may be
    retained in some circumstances.

If I remember correctly, these are what control how much metadata remains
in the driver after task/stage/job completion.
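
For illustration, here is a minimal Python sketch of passing lower limits to
spark-submit as --conf flags. The numeric values, the helper names
(to_submit_args, to_command) and the application name my_app.py are
assumptions made for this example; the thread does not recommend specific
values, so pick limits that fit your workload.

```python
import shlex

# Illustrative limits only (assumed values, not a recommendation from the thread).
RETENTION_LIMITS = {
    "spark.ui.retainedJobs": 100,
    "spark.ui.retainedStages": 100,
    "spark.ui.retainedTasks": 10000,
}


def to_submit_args(conf):
    """Turn a {property: value} mapping into spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


def to_command(app, conf):
    """Build a shell-quoted spark-submit command line for a hypothetical app."""
    return shlex.join(["spark-submit", *to_submit_args(conf), app])


if __name__ == "__main__":
    # my_app.py is a placeholder application name.
    print(to_command("my_app.py", RETENTION_LIMITS))
```

The same key=value pairs could also go into spark-defaults.conf or be set on
the application's Spark configuration before the context starts.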

> On Sep 22, 2021, at 12:42 PM, Kohki Nishio <taroplus@gmail.com> wrote:
> 
> I believe I have enough information, raised this
> 
> https://issues.apache.org/jira/browse/SPARK-36827
> 
> thanks
> -Kohki
> 
> 
> On Tue, Sep 21, 2021 at 9:30 PM Sean Owen <srowen@gmail.com> wrote:
> No, that's just info Spark retains about finished jobs and tasks, likely. You can limit
> how much is retained if desired with config.
> 
> On Tue, Sep 21, 2021, 11:29 PM Kohki Nishio <taroplus@gmail.com> wrote:
> Just following up, it looks like task / stage / job data are not cleaned up
> --
>    6:       7835346     2444627952  org.apache.spark.status.TaskDataWrapper
>   25:       3765152      180727296  org.apache.spark.status.StageDataWrapper
>   88:        232255        9290200  org.apache.spark.status.JobDataWrapper
> 
> UI is disabled, not sure why we need to have those data ..
> 
> -Kohki 
> 
> 
> On Fri, Sep 17, 2021 at 8:27 AM Kohki Nishio <taroplus@gmail.com> wrote:
> Hello,
> I'm seeing possible memory leak behavior in my spark application. According to MAT, it
> looks like it's related to ElementTrackingStore ..
> 
> <Eclipse_Memory_Analyzer.png>
> 
> The increase is subtle, so it takes multiple days to actually cause some impact, but
> I'm wondering if anybody has any idea about what this is about ...  Below is the GC graph,
> yellow is the level after any GC kicks in.
> 
> <T2_G1GC_-_Grafana.png>
> 
> Thanks
> -- 
> Kohki Nishio

