spark-user mailing list archives

From Maximiliano Patricio Méndez <mmen...@despegar.com>
Subject Dynamic Allocation not removing executors
Date Wed, 15 Aug 2018 19:38:47 GMT
Hi,

I found an issue while trying to use dynamic allocation in Spark 2.3.1 where
the driver does not remove idle executors under some circumstances.

In the first case I ran into, it seems that a change introduced
in 2.2.1/2.3.0 (SPARK-21656
<https://issues.apache.org/jira/browse/SPARK-21656>) added a check
<https://github.com/apache/spark/pull/18874/files> to the
ExecutorAllocationManager that causes the first remove request to be
ignored if there are no pending tasks and the initialExecutors property is
set to a non-zero value (the initializing flag prevents numExecutorsTarget
from being changed).

My dynamic allocation conf:
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.initialExecutors 4
spark.dynamicAllocation.minExecutors 0
spark.dynamicAllocation.maxExecutors 100

This normalizes after the first submitted job, but may leave up to 4
executors (in our case) idle without being removed if no job is ever
submitted.
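
As a rough illustration of the sequence described above, here is a toy
Python model of the idle-removal path. It is a sketch only, not Spark's
ExecutorAllocationManager code: the class, method, and field names are
hypothetical, executors are removed immediately instead of going through a
pending state, and the assumption that an expired idle timer is dropped even
when the removal is refused is inferred from the logs, where executor 3 is
never retried. The 60 second timeout and the registration order are taken
from the logs.

```python
# Toy model of the idle-executor removal path; NOT Spark's actual code.
# All class, method, and field names here are hypothetical.

class ToyAllocationManager:
    def __init__(self, initial_executors, min_executors, idle_timeout):
        self.target = initial_executors   # stands in for numExecutorsTarget
        self.min_executors = min_executors
        self.idle_timeout = idle_timeout
        self.initializing = True          # cleared when the first idle timer expires
        self.executors = set()
        self.remove_times = {}            # executor id -> idle expiry time
        self.pending_tasks = 0            # no job is ever submitted in this model

    def on_executor_added(self, executor_id, now):
        self.executors.add(executor_id)
        self.remove_times[executor_id] = now + self.idle_timeout

    def _update_target(self):
        # While initializing, the target is left untouched.
        if self.initializing:
            return
        needed = max(self.pending_tasks, self.min_executors)
        if needed < self.target:
            self.target = needed

    def schedule(self, now):
        """Run one scheduling tick; return the executor ids removed on it."""
        self._update_target()
        expired = sorted(e for e, t in self.remove_times.items() if now >= t)
        removed = []
        for executor_id in expired:
            self.initializing = False
            # Assumption: the timer is dropped whether or not removal succeeds.
            del self.remove_times[executor_id]
            if len(self.executors) - 1 < max(self.target, self.min_executors):
                continue                  # "Not removing idle executor ..."
            self.executors.remove(executor_id)
            removed.append(executor_id)
        return removed


def simulate():
    manager = ToyAllocationManager(
        initial_executors=4, min_executors=0, idle_timeout=60)
    # Registration times in seconds, mirroring the order in the logs.
    for executor_id, registered_at in [("3", 0), ("0", 1), ("1", 1), ("2", 2)]:
        manager.on_executor_added(executor_id, registered_at)
    removed_per_tick = {now: manager.schedule(now) for now in (60, 61, 62, 120)}
    return manager, removed_per_tick


if __name__ == "__main__":
    manager, removed_per_tick = simulate()
    print(removed_per_tick)
    print("still registered:", sorted(manager.executors))
```

In this model the first tick refuses to remove executor 3 because the target
is still 4; only on the next tick is the target lowered to 0, after which
executors 0, 1 and 2 are removed and executor 3 stays registered with no
timer left.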

Logs:
18/08/15 13:08:44 DEBUG ExecutorAllocationManager: Starting idle timer for
3 because there are no more tasks scheduled to run on the executor (to
expire in 60 seconds)
18/08/15 13:08:44 INFO ExecutorAllocationManager: New executor 3 has
registered (new total is 1)
18/08/15 13:08:45 DEBUG ExecutorAllocationManager: Starting idle timer for
0 because there are no more tasks scheduled to run on the executor (to
expire in 60 seconds)
18/08/15 13:08:45 INFO ExecutorAllocationManager: New executor 0 has
registered (new total is 2)
18/08/15 13:08:45 DEBUG ExecutorAllocationManager: Starting idle timer for
1 because there are no more tasks scheduled to run on the executor (to
expire in 60 seconds)
18/08/15 13:08:45 INFO ExecutorAllocationManager: New executor 1 has
registered (new total is 3)
18/08/15 13:08:46 DEBUG ExecutorAllocationManager: Starting idle timer for
2 because there are no more tasks scheduled to run on the executor (to
expire in 60 seconds)
18/08/15 13:08:46 INFO ExecutorAllocationManager: New executor 2 has
registered (new total is 4)
18/08/15 13:09:44 INFO ExecutorAllocationManager: Request to remove
executorIds: 3
18/08/15 13:09:44 DEBUG ExecutorAllocationManager: Not removing idle
executor 3 because there are only 4 executor(s) left (number of executor
target 4)
18/08/15 13:09:45 DEBUG ExecutorAllocationManager: Lowering target number
of executors to 0 (previously 4) because not all requested executors are
actually needed
18/08/15 13:09:45 INFO ExecutorAllocationManager: Request to remove
executorIds: 0
18/08/15 13:09:45 INFO ExecutorAllocationManager: Removing executor 0
because it has been idle for 60 seconds (new desired total will be 3)
18/08/15 13:09:45 INFO ExecutorAllocationManager: Request to remove
executorIds: 1
18/08/15 13:09:45 INFO ExecutorAllocationManager: Removing executor 1
because it has been idle for 60 seconds (new desired total will be 2)
18/08/15 13:09:46 INFO ExecutorAllocationManager: Existing executor 0 has
been removed (new total is 3)
18/08/15 13:09:46 DEBUG ExecutorAllocationManager: Executor 0 is no longer
pending to be removed (1 left)
18/08/15 13:09:46 INFO ExecutorAllocationManager: Request to remove
executorIds: 2
18/08/15 13:09:46 INFO ExecutorAllocationManager: Removing executor 2
because it has been idle for 60 seconds (new desired total will be 1)
18/08/15 13:09:46 INFO ExecutorAllocationManager: Existing executor 1 has
been removed (new total is 2)
18/08/15 13:09:46 DEBUG ExecutorAllocationManager: Executor 1 is no longer
pending to be removed (1 left)
18/08/15 13:09:46 INFO ExecutorAllocationManager: Existing executor 2 has
been removed (new total is 1)
18/08/15 13:09:46 DEBUG ExecutorAllocationManager: Executor 2 is no longer
pending to be removed (0 left)
