[ https://issues.apache.org/jira/browse/SPARK-24105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454847#comment-16454847 ]
Anirudh Ramanathan commented on SPARK-24105:
--------------------------------------------
> To avoid this deadlock, it's required to support node selector (in future affinity/anti-affinity)
> configuration by driver & executor.
Would inter-pod anti-affinity be a better bet for this use case?
In the extreme case, this is a gang scheduling issue IMO, where we don't want to schedule
drivers if there are no executors that can be scheduled.
There's some work on gang scheduling ongoing in https://github.com/kubernetes/kubernetes/issues/61012
under sig-scheduling.
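For illustration only, here is a rough sketch of what an inter-pod anti-affinity rule for driver pods could look like, built as a plain Python dict that mirrors the Kubernetes pod spec. It assumes driver pods carry the label spark-role=driver and that spreading drivers one per node (topology key kubernetes.io/hostname) is the desired policy. The helper name driver_anti_affinity is hypothetical, and this does not imply that Spark 2.3.0 exposes a setting to attach such a stanza.

```python
import json


def driver_anti_affinity(topology_key="kubernetes.io/hostname"):
    """Build an `affinity` stanza that keeps driver pods off nodes already running a driver.

    Assumption: driver pods are labelled spark-role=driver.
    """
    return {
        "podAntiAffinity": {
            # Hard requirement: the scheduler will not co-locate two matching pods
            # within the same topology domain (here, the same node).
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {
                        "matchExpressions": [
                            {"key": "spark-role", "operator": "In", "values": ["driver"]}
                        ]
                    },
                    "topologyKey": topology_key,
                }
            ]
        }
    }


if __name__ == "__main__":
    # Print the stanza as it would appear under `spec` in a pod manifest.
    print(json.dumps({"affinity": driver_anti_affinity()}, indent=2))
```

With a rule like this, at most one driver lands on each node, so the remaining capacity on every node stays available to executors. It limits driver density rather than guaranteeing executor capacity, so it is not a substitute for gang scheduling.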
> Spark 2.3.0 on kubernetes
> -------------------------
>
> Key: SPARK-24105
> URL: https://issues.apache.org/jira/browse/SPARK-24105
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 2.3.0
> Reporter: Lenin
> Priority: Major
>
> Right now it's only possible to define node selector configurations through spark.kubernetes.node.selector.[labelKey].
> This gets used for both driver & executor pods. Without the capability to isolate driver
> & executor pods, the cluster can run into a livelock scenario: a large number of
> Spark submissions can cause the driver pods to fill up the cluster capacity, leaving no
> room for executor pods to do any work.
>
> To avoid this deadlock, it's required to support node selector (in future affinity/anti-affinity)
> configuration by driver & executor.
>
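The starvation described in the issue can be illustrated with a toy model. This is a simplified sketch, not Spark or Kubernetes code: it assumes every pod occupies exactly one slot, that all drivers are placed before any executor, and that a job makes progress only when its full set of executors fits. The function name schedule and its parameters are made up for the example.

```python
def schedule(total_slots, jobs, executors_per_job, driver_slots=None):
    """Return (drivers_placed, jobs_that_can_run) for a toy cluster.

    driver_slots=None models a shared pool: drivers may take any slot.
    An integer models a dedicated driver pool of that size, with the
    remaining slots reserved for executors.
    """
    if driver_slots is None:
        # Shared pool: drivers grab slots first and may consume everything.
        drivers = min(jobs, total_slots)
        executor_capacity = total_slots - drivers
    else:
        # Isolated pools: drivers are capped, executors keep the rest.
        drivers = min(jobs, driver_slots)
        executor_capacity = total_slots - driver_slots
    # A driver is useful only if a full set of its executors can be placed.
    runnable = min(drivers, executor_capacity // executors_per_job)
    return drivers, runnable


if __name__ == "__main__":
    print(schedule(10, 10, 2))                  # (10, 0): drivers fill the cluster
    print(schedule(10, 10, 2, driver_slots=2))  # (2, 2): both placed drivers get executors
```

In the shared-pool case ten submissions on a ten-slot cluster place ten drivers and leave nothing for executors, which is the situation the issue describes. Capping drivers to their own pool, as separate driver and executor node selectors would allow, keeps executor capacity available.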