spark-user mailing list archives

From Pat Ferrel <>
Subject Re: k8s orchestrating Spark service
Date Mon, 01 Jul 2019 23:57:20 GMT
k8s as master would be nice but doesn’t solve the problem of running the
full cluster and is an orthogonal issue.

We’d like to deploy Spark Workers/Executors and Master (whatever master is
easiest to talk about since we really don’t care) in pods as we do with the
other services we use. Replace Spark Master with k8s if you insist. How do
the executors get deployed?

We have our own containers that almost work for 2.3.3. We have used this
before with older Spark so we are reasonably sure it makes sense. We just
wonder if our own image builds and charts are the best starting point.

Does anyone have something they like?

From: Matt Cheah <> <>
Reply: Matt Cheah <> <>
Date: July 1, 2019 at 4:45:55 PM
To: Pat Ferrel <> <>, <> <>
Subject:  Re: k8s orchestrating Spark service

Sorry, I don’t quite follow – why use the Spark standalone cluster as an
in-between layer when one can just deploy the Spark application directly
inside the Helm chart? I’m curious as to what the use case is, since I’m
wondering if there’s something we can improve with respect to the native
integration with Kubernetes here. Deploying on Spark standalone mode in
Kubernetes is, to my understanding, meant to be superseded by the native
integration introduced in Spark 2.4.

*From: *Pat Ferrel <>
*Date: *Monday, July 1, 2019 at 4:40 PM
*To: *"" <>, Matt Cheah <>
*Subject: *Re: k8s orchestrating Spark service

Thanks Matt,

Actually I can’t use spark-submit. We submit the Driver programmatically
through the API. But this is not the issue, and using k8s as the master is
also not the issue. You may be right about it being easier, but it
doesn’t quite get to the heart of the problem.

We want to orchestrate a bunch of services including Spark. The rest work;
we are asking if anyone has seen a good starting point for adding Spark as
a k8s-managed service.

From: Matt Cheah <> <>
Reply: Matt Cheah <> <>
Date: July 1, 2019 at 3:26:20 PM
To: Pat Ferrel <> <>, <> <>
Subject:  Re: k8s orchestrating Spark service

I would recommend looking into Spark’s native support for running on
Kubernetes. One can just start the application against Kubernetes directly
using spark-submit in cluster mode or starting the Spark context with the
right parameters in client mode. See
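
A minimal sketch of the cluster-mode invocation described above, written as
a small Python helper that only assembles the command line. The helper name
build_spark_submit_args, the API server address, the image name, the main
class, and the jar path are hypothetical placeholders; the flags and the
spark.kubernetes.* properties are the ones Spark's Kubernetes support
documents.

```python
import shlex


def build_spark_submit_args(api_server, image, app_resource, main_class,
                            executors=2, namespace="default"):
    """Assemble argv for a cluster-mode spark-submit against Kubernetes.

    Nothing is executed here; the caller decides how to run the command.
    """
    conf = {
        "spark.kubernetes.container.image": image,
        "spark.kubernetes.namespace": namespace,
        "spark.executor.instances": str(executors),
    }
    args = [
        "spark-submit",
        # The Kubernetes master URL is the API server address prefixed with k8s://
        "--master", f"k8s://{api_server}",
        "--deploy-mode", "cluster",
        "--class", main_class,
    ]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    # The application jar goes last; local:// refers to a path inside the image.
    args.append(app_resource)
    return args


if __name__ == "__main__":
    # All values below are made-up placeholders for illustration.
    argv = build_spark_submit_args(
        api_server="https://k8s.example.invalid:6443",
        image="example/spark:latest",
        app_resource="local:///opt/app/app.jar",
        main_class="com.example.Main",
        executors=3,
    )
    print(shlex.join(argv))
```

The same properties could be set on a SparkConf when starting the context
in client mode instead of being passed as --conf flags.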

I would think that building Helm around this architecture of running Spark
applications would be easier than running a Spark standalone cluster. But
admittedly I’m not very familiar with the Helm technology – we just use

-Matt Cheah

*From: *Pat Ferrel <>
*Date: *Sunday, June 30, 2019 at 12:55 PM
*To: *"" <>
*Subject: *k8s orchestrating Spark service

We're trying to set up a system that includes Spark. The rest of the
services have good Docker containers and Helm charts to start from.

Spark on the other hand is proving difficult. We forked a container and
have tried to create our own chart but are having several problems with

So back to the community… Can anyone recommend a Docker Container + Helm
Chart for use with Kubernetes to orchestrate:

   - Spark standalone Master
   - several Spark Workers/Executors

This is not a request to use k8s to orchestrate Spark jobs, but the service
cluster itself.
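
To make the request concrete, here is a rough sketch of the Kubernetes
objects such a chart would need to render: a Service and Deployment for the
standalone Master, plus a Deployment for the Workers. It is a starting
point under stated assumptions, not a tested chart. The function name, the
labels, the image name, and the /opt/spark install path are hypothetical;
the Master and Worker class names and the spark://host:7077 URL form are
the standard standalone-mode ones.

```python
import json


def spark_standalone_manifests(image, workers=2, master_name="spark-master"):
    """Return a Service plus Master and Worker Deployments as plain dicts."""
    spark_class = "/opt/spark/bin/spark-class"  # assumed install path in the image
    master_url = f"spark://{master_name}:7077"

    def deployment(name, role, replicas, command):
        labels = {"app": "spark", "role": role}
        return {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": name},
            "spec": {
                "replicas": replicas,
                "selector": {"matchLabels": labels},
                "template": {
                    "metadata": {"labels": labels},
                    "spec": {"containers": [
                        {"name": role, "image": image, "command": command}
                    ]},
                },
            },
        }

    # The Service gives Workers and drivers a stable DNS name for the Master.
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": master_name},
        "spec": {
            "selector": {"app": "spark", "role": "master"},
            "ports": [
                {"name": "rpc", "port": 7077},
                {"name": "webui", "port": 8080},
            ],
        },
    }
    master = deployment(master_name, "master", 1, [
        spark_class, "org.apache.spark.deploy.master.Master",
        "--port", "7077", "--webui-port", "8080",
    ])
    # Each Worker pod registers itself with the Master through the Service name.
    worker = deployment("spark-worker", "worker", workers, [
        spark_class, "org.apache.spark.deploy.worker.Worker", master_url,
    ])
    return [service, master, worker]


if __name__ == "__main__":
    items = spark_standalone_manifests("example/spark:2.3.3", workers=3)
    # A v1 List wrapper lets the whole set be applied as one JSON document.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

In a Helm chart the image and the Worker replica count would become chart
values; scaling the Worker Deployment then adds or removes executor
capacity without touching the Master.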

