Hi Matthias,

This doesn't look possible right now.  It may be worth filing an improvement JIRA for it.

But I'm trying to understand what you're trying to do a little better.  Do you intentionally have each thread create a new, unique pool when it submits a job?  If so, that pool will just get the default pool configuration, and you will see lots of these messages in your logs:

https://github.com/apache/spark/blob/6ade5cbb498f6c6ea38779b97f2325d5cf5013f2/core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala#L196-L200

What is the use case for creating pools this way?
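
Just to make sure I'm picturing it right, here is a minimal sketch of the pattern I mean (the naming scheme is made up, and sc is your existing SparkContext):

    // Sketch only: every thread submits under a pool name that was never
    // configured in fairscheduler.xml, so a default-configured pool is
    // created on the fly and the message linked above is logged.
    val poolName = s"pool-${Thread.currentThread().getId}"  // hypothetical naming
    sc.setLocalProperty("spark.scheduler.pool", poolName)
    sc.parallelize(1 to 100).count()  // job runs in the newly created pool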

Also, if I understand correctly, it doesn't even matter whether the thread dies -- that pool will still stay around, since the rootPool retains a reference to it (the pools aren't actually tied to specific threads).

Imran

On Thu, Apr 5, 2018 at 9:46 PM, Matthias Boehm <mboehm7@gmail.com> wrote:
Hi all,

For concurrent Spark jobs spawned from the driver, we use Spark's fair
scheduler pools, which are set and unset in a thread-local manner by
each worker thread. Typically (for rather long jobs), this works very
well. Unfortunately, in an application with lots of very short
parallel sections, we see thousands of these pools remaining in the
Spark UI, which indicates some kind of leak. Each worker cleans up its
local property by setting it to null, but not all pools are properly
removed. I've checked and reproduced this behavior with Spark 2.1 through 2.3.
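
For concreteness, each worker thread does roughly the following (simplified sketch; the pool name is just illustrative, and sc is the shared SparkContext):

    // Simplified sketch of the per-thread pattern described above.
    sc.setLocalProperty("spark.scheduler.pool", "somePool")  // thread-local setting
    try {
      sc.parallelize(1 to 100).count()  // short parallel section
    } finally {
      sc.setLocalProperty("spark.scheduler.pool", null)  // unset (removes the property)
    }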

Now my question: Is there a way to explicitly remove these pools,
either globally or locally while the thread is still alive?

Regards,
Matthias

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org