spark-issues mailing list archives

From "Dongjoon Hyun (Jira)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-29905) ExecutorPodsLifecycleManager has sub-optimal behavior with dynamic allocation
Date Mon, 16 Mar 2020 22:52:06 GMT

     [ https://issues.apache.org/jira/browse/SPARK-29905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-29905:
----------------------------------
    Affects Version/s:     (was: 3.0.0)
                       3.1.0

> ExecutorPodsLifecycleManager has sub-optimal behavior with dynamic allocation
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-29905
>                 URL: https://issues.apache.org/jira/browse/SPARK-29905
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.1.0
>            Reporter: Marcelo Masiero Vanzin
>            Priority: Minor
>
> I've been playing with dynamic allocation on k8s and noticed some weird behavior from ExecutorPodsLifecycleManager when it's on.
> This behavior is mostly caused by the higher rate of pod updates when you have dynamic allocation. Pods being created and going away all the time generate lots of events, which are then translated into "snapshots" internally in Spark and fed to subscribers such as ExecutorPodsLifecycleManager.
> The first effect is a lot of spurious logging. Since snapshots are incremental, you can get lots of snapshots with the same "PodDeleted" information, for example, and ExecutorPodsLifecycleManager will log for all of them. Yes, the log messages are at debug level, but if you're debugging that stuff, it's really noisy and distracting.
> The second effect is that, just as you get multiple log messages, you end up calling into the Spark scheduler and, worse, into the K8S API server multiple times for the same pod update. We can optimize that and reduce the chattiness with the API server.
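
To illustrate the problem described above, here is a minimal sketch of why incremental snapshots repeat the same "PodDeleted" state, and how a subscriber could skip pods it has already handled. It is not Spark's code: `LifecycleManager`, `snapshots_from_events`, the state strings, and the recorded "delete" calls are all hypothetical stand-ins, and the real ExecutorPodsLifecycleManager may solve this differently.

```python
from dataclasses import dataclass, field


@dataclass
class LifecycleManager:
    """Hypothetical subscriber; NOT Spark's ExecutorPodsLifecycleManager."""
    handled: set = field(default_factory=set)      # executor ids already processed
    api_calls: list = field(default_factory=list)  # stand-in for API server calls

    def on_snapshot(self, snapshot: dict) -> None:
        # snapshot maps executor id -> pod state (assumed string labels)
        for exec_id, state in snapshot.items():
            if state != "PodDeleted":
                continue
            if exec_id in self.handled:
                continue  # same deletion already seen in an earlier snapshot
            self.handled.add(exec_id)
            self.api_calls.append(("delete", exec_id))


def snapshots_from_events(events):
    """Fold pod events into incremental snapshots, one snapshot per event."""
    current = {}
    for exec_id, state in events:
        current = {**current, exec_id: state}
        yield current


events = [
    (1, "PodRunning"),
    (2, "PodRunning"),
    (1, "PodDeleted"),
    (3, "PodRunning"),
    (2, "PodDeleted"),
]

manager = LifecycleManager()
naive_calls = 0  # what a subscriber would do if it acted on every snapshot
for snap in snapshots_from_events(events):
    naive_calls += sum(1 for s in snap.values() if s == "PodDeleted")
    manager.on_snapshot(snap)

print(naive_calls, len(manager.api_calls))
```

In this toy run, executor 1's deletion shows up in three consecutive snapshots, so a subscriber that reacts to every snapshot acts on it three times. Tracking handled ids reduces that to one action per pod. A real implementation would also need to bound or clean up the set of handled ids, which this sketch leaves out.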


