spark-issues mailing list archives

From "Anirudh Ramanathan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-24028) [K8s] Creating secrets and config maps before creating the driver pod has unpredictable behavior
Date Thu, 19 Apr 2018 22:23:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-24028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444902#comment-16444902 ]

Anirudh Ramanathan commented on SPARK-24028:
--------------------------------------------

My suspicion here is that this has to do with timing. An easy way to check may be to add a
sleep() of a few seconds during driver pod startup and see whether the issue resolves itself.
It looks like there may have been a race condition in the storage mounting logic in the past,
but if you're seeing this fresh in 1.9.4, that is something we should file a bug about
upstream.
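
As a rough sketch of that check (not Spark code; the function name and the mount path in the
usage note are hypothetical), the driver's entrypoint could poll the mounted directory for a
few seconds and report how long it took to become non-empty. If the files only show up after
a delay, that would point to a timing problem:

```python
import os
import time


def wait_for_mounted_files(mount_dir, timeout_s=5.0, poll_s=0.5):
    """Poll mount_dir until it contains at least one entry.

    Returns the number of seconds waited, or None if the directory
    is still empty (or missing) when the timeout expires.
    """
    start = time.monotonic()
    while True:
        if os.path.isdir(mount_dir) and os.listdir(mount_dir):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            return None
        time.sleep(poll_s)
```

For example, calling wait_for_mounted_files("/path/to/mounted/secret") before the driver
reads its files and logging the result would show whether the volume was populated late or
never populated at all.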

All the recent runs of https://k8s-testgrid.appspot.com/sig-big-data#spark-periodic-latest-gke
on v1.9.6 have been green. Any ideas on how we can reproduce this?

> [K8s] Creating secrets and config maps before creating the driver pod has unpredictable behavior
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24028
>                 URL: https://issues.apache.org/jira/browse/SPARK-24028
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.3.0
>            Reporter: Matt Cheah
>            Priority: Critical
>
> Currently we create the Kubernetes resources the driver depends on - such as the properties
> config map and secrets to mount into the pod - only after we create the driver pod. This is
> because we want these extra objects to immediately have an owner reference tying them to the
> driver pod.
> On our Kubernetes 1.9.4 cluster, we're seeing that sometimes this works fine, but other
> times the driver ends up being started with empty volumes instead of volumes with the contents
> of the secrets we expect. The result is that sometimes the driver will start without these
> files mounted, which leads to various failures if the driver requires these files to be present
> early on in its code. Missing the properties file config map, for example, would mean spark-submit
> doesn't have a properties file to read at all. See the warning on
> https://kubernetes.io/docs/concepts/storage/volumes/#secret
> Unfortunately we cannot link owner references to non-existent objects, so we have to
> do this instead:
>  1. Create the auxiliary resources without any owner references.
>  2. Create the driver pod mounting these resources into volumes, as before.
>  3. If step 2 fails, clean up the resources created in step 1.
>  4. Edit the auxiliary resources to have an owner reference for the driver pod.
> The multi-step approach leaves a small chance for us to leak resources - for example,
> if we fail to make the resource edits in step 4 for some reason. This also changes the
> permissions required for spark-submit - credentials provided to spark-submit need to be able
> to edit resources in addition to creating them.
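
The four steps in the quoted description can be sketched as follows. This is only an
illustration of the ordering and the cleanup path, assuming an in-memory stand-in for the
cluster: FakeCluster, submit_driver, and all method names are hypothetical and are neither
Spark's submission code nor a real Kubernetes client API.

```python
class FakeCluster:
    """In-memory stand-in for a Kubernetes API client (hypothetical)."""

    def __init__(self, fail_pod_create=False):
        self.resources = {}  # resource name -> {"owner": pod name or None}
        self.pods = set()
        self.fail_pod_create = fail_pod_create

    def create_resource(self, name):
        self.resources[name] = {"owner": None}

    def delete_resource(self, name):
        self.resources.pop(name, None)

    def create_pod(self, name, mounts):
        if self.fail_pod_create:
            raise RuntimeError("pod creation failed")
        missing = [m for m in mounts if m not in self.resources]
        if missing:
            raise RuntimeError("missing resources: %s" % missing)
        self.pods.add(name)

    def set_owner(self, name, pod_name):
        self.resources[name]["owner"] = pod_name


def submit_driver(cluster, pod_name, aux_names):
    # Step 1: create the auxiliary resources without owner references.
    for name in aux_names:
        cluster.create_resource(name)
    # Step 2: create the driver pod that mounts those resources.
    try:
        cluster.create_pod(pod_name, mounts=aux_names)
    except Exception:
        # Step 3: pod creation failed, so remove what step 1 created.
        for name in aux_names:
            cluster.delete_resource(name)
        raise
    # Step 4: attach the owner reference now that the pod exists.
    # A failure here would leave resources without an owner (the leak
    # the description mentions), and this edit needs update permission.
    for name in aux_names:
        cluster.set_owner(name, pod_name)
```

In this sketch, a failure partway through step 1 or step 4 is not handled, which mirrors the
leak risk the description calls out.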



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

