spark-user mailing list archives

From Ranju Jain <Ranju.J...@ericsson.com.INVALID>
Subject Dynamic Allocation Backlog Property in Spark on Kubernetes
Date Tue, 06 Apr 2021 11:58:30 GMT
Hi All,

I have enabled dynamic allocation while running Spark on Kubernetes, but new executors
are requested only if pending tasks stay backlogged for longer than the duration configured
in the property "spark.dynamicAllocation.schedulerBacklogTimeout".

My Use Case is:

There are a number of parallel jobs which may or may not run together at a particular point
of time, e.g. only one Spark job may run at a given moment, or two Spark jobs may run at the
same time, depending on the need.
I have configured spark.dynamicAllocation.minExecutors as 3 and spark.dynamicAllocation.maxExecutors
as 8 (see the sketch below).
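
Put together, the configuration I am describing looks roughly like this (a sketch in
spark-shell style; the SparkSession builder form and the shuffle-tracking property are my
assumptions, the latter because Kubernetes has no external shuffle service, and "1s" is
simply the documented default for the backlog timeout):

    import org.apache.spark.sql.SparkSession

    // Sketch of the setup described above. shuffleTracking is assumed for
    // Kubernetes (no external shuffle service); 1s is the default timeout.
    val spark = SparkSession.builder()
      .appName("parallel-jobs")
      .config("spark.dynamicAllocation.enabled", "true")
      .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
      .config("spark.dynamicAllocation.minExecutors", "3")
      .config("spark.dynamicAllocation.maxExecutors", "8")
      // Executors are requested once tasks have been pending for this long.
      .config("spark.dynamicAllocation.schedulerBacklogTimeout", "1s")
      .getOrCreate()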

Steps:

  1.  The SparkContext is initialized with 3 executors and the first job is requested.
  2.  If a second job is requested after a few minutes (e.g. 15 minutes), I am hoping to
use the benefit of dynamic allocation, so that executors scale up to handle the second job's tasks.

For this, I think "spark.dynamicAllocation.schedulerBacklogTimeout" needs to be set, after
which new executors would be requested (see the sketch after this paragraph).
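
To make the scenario concrete, a minimal sketch of what I mean by two jobs, assuming both
are actions inside the same application/SparkContext as in step 1 (the dataset, partition
counts, and the 15-minute delay are purely illustrative, and the dynamic allocation
properties are assumed to be passed at submit time as shown above):

    import org.apache.spark.sql.SparkSession
    import scala.concurrent.{Await, Future}
    import scala.concurrent.duration.Duration
    import scala.concurrent.ExecutionContext.Implicits.global

    object TwoJobsDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("two-jobs-demo").getOrCreate()
        val sc = spark.sparkContext

        // First job starts immediately on the 3 minimum executors.
        val job1 = Future { sc.parallelize(1 to 1000000, 24).map(_ * 2).count() }

        // Second job arrives some minutes later (15 min here, illustrative only).
        // Its pending tasks are what should trigger a scale-up beyond 3 executors.
        Thread.sleep(15 * 60 * 1000L)
        val job2 = Future { sc.parallelize(1 to 1000000, 24).map(_ + 1).count() }

        Await.result(job1, Duration.Inf)
        Await.result(job2, Duration.Inf)
        spark.stop()
      }
    }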

Problem: There is a chance that the second job is not requested at all, or is requested only
after 10 or 20 minutes. How can I set a constant value for the
property "spark.dynamicAllocation.schedulerBacklogTimeout" to scale the executors, when the
task backlog depends on the number of jobs requested?


Regards
Ranju
