spark-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: possible bug
Date Fri, 09 Apr 2021 16:31:43 GMT
OK, so it's a '70000 threads overwhelming off-heap memory in the JVM' kind of
thing. Or running afoul of ulimits in the OS.

On Fri, Apr 9, 2021 at 11:19 AM Attila Zsolt Piros <
piros.attila.zsolt@gmail.com> wrote:

> Hi Sean!
>
> So the "coalesce" without shuffle will create a CoalescedRDD which during
> its computation delegates to the parent RDD partitions.
> As the CoalescedRDD contains only 1 partition so we talk about 1 task and
> 1 task context.
>
> The next stop is PythonRunner.
>
> Here the Python workers at least are reused (when
> "spark.python.worker.reuse" is true, which is the default), but the
> MonitorThreads are not reused and, what is worse, all the MonitorThreads are
> created for the same worker and the same TaskContext.
> This means the CoalescedRDD's 1 task must complete before even the first
> monitor thread stops; relevant code:
>
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala#L570
>
> So this will lead to creating 70000 extra threads when 1 would be enough.
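>
> To see why they pile up, here is a toy model in plain Python (not Spark
> code, just a sketch of the lifecycle under the assumptions above): every
> per-partition monitor waits on the same task-completion signal, so none
> of them can exit before the one and only task is done:
>
>     import threading
>
>     task_done = threading.Event()
>
>     def start_monitor():
>         # Mimics one MonitorThread: its lifetime is tied to the *task*,
>         # not to the parent partition it was created for.
>         t = threading.Thread(target=task_done.wait, daemon=True)
>         t.start()
>         return t
>
>     # One monitor per parent partition (a small count here; with 70000
>     # real OS threads you hit exactly the memory/ulimit problem above).
>     monitors = [start_monitor() for _ in range(1000)]
>     print(threading.active_count())  # roughly 1000 extra threads alive
>
>     task_done.set()   # the single task finally finishes
>     for t in monitors:
>         t.join()      # now they all stop at once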
>
> The jira is: https://issues.apache.org/jira/browse/SPARK-35009
> The PR will come next week, maybe (I am a bit uncertain, as I have many
> other things to do right now).
>
> Best Regards,
> Attila
>
