spark-dev mailing list archives

From: Sean Owen <sro...@gmail.com>
Subject: Re: [k8s] Spark operator (the Java one)
Date: Thu, 10 Oct 2019 16:42:54 GMT

I'd have the same question on the PR - why does this need to be in the
Apache Spark project vs where it is now? Yes, it's not a Spark package
per se, but it seems like this is a tool for K8S to use Spark rather
than a core Spark tool.

Yes, of course all the packages, licenses, etc. have to be overhauled,
but that kind of underscores that this is a dump of a third-party tool
that works fine on its own?

On Thu, Oct 10, 2019 at 9:30 AM Jiri Kremser <jkremser@redhat.com> wrote:
>
> Hello,
>
>
> Spark Operator is a tool that can deploy and scale Spark clusters on
> Kubernetes and help with monitoring them. It follows the operator pattern [1]
> introduced by CoreOS: it watches for changes in custom resources that
> represent the desired state of the clusters and takes the steps needed to
> reach that state in Kubernetes using the K8s client. It’s written in Java,
> and its dependencies overlap with Spark’s (logging, k8s client,
> apache-commons-*, fasterxml-jackson, etc.). The operator also contains
> metadata that allows it to be deployed smoothly through operatorhub.io [2].
> For very basic info, check the readme on the project page, including the
> gif :) Another unique feature of this operator is the optional ability to
> compile itself to a native image using the GraalVM compiler, so that it
> starts fast and has a very low memory footprint.
>
>
> We would like to contribute this project to Spark’s code base. It can’t be
> distributed as a Spark package, because it’s not a library that can be used
> from a Spark environment. So if you are interested, the directory
> resource-managers/kubernetes/spark-operator/ could be a suitable
> destination.
>
>
> The current repository is radanalyticsio/spark-operator [3] on GitHub. It
> also contains a test suite [4] that verifies that the operator works well on
> K8s (using minikube) and on OpenShift. I am not sure how to transfer those
> tests, in case you would be interested in them as well.
>
>
> I’ve already opened the PR [5], but it got closed, so I am opening the
> discussion here first. The PR contained old package names with our
> organisation, radanalytics.io, but we are willing to change that to
> anything more aligned with the existing Spark conventions. The same
> holds for the license headers in all the source files.
>
>
> jk
>
>
>
> [1]: https://kubernetes.io/docs/concepts/extend-kubernetes/operator/
>
> [2]: https://operatorhub.io/operator/radanalytics-spark
>
> [3]: https://github.com/radanalyticsio/spark-operator
>
> [4]: https://travis-ci.org/radanalyticsio/spark-operator
>
> [5]: https://github.com/apache/spark/pull/26075
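
For readers unfamiliar with the operator pattern mentioned in the quoted message (watch custom resources that describe the desired state, then act to converge the cluster toward it), here is a minimal sketch of the reconcile step. It is not taken from the spark-operator code base: the class name, the `reconcile` helper, and the use of plain maps from cluster name to worker count are all assumptions made for illustration, and a real operator would call the Kubernetes client instead of returning strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the radanalyticsio/spark-operator implementation.
public class ReconcileSketch {

    // Compares the desired worker counts (as declared in custom resources) with
    // the observed ones and returns the actions needed to converge.
    static List<String> reconcile(Map<String, Integer> desired, Map<String, Integer> actual) {
        List<String> actions = new ArrayList<>();
        // TreeMap copies give a deterministic, name-sorted iteration order.
        for (Map.Entry<String, Integer> e : new TreeMap<>(desired).entrySet()) {
            Integer running = actual.get(e.getKey());
            if (running == null) {
                // Declared but not running yet: create it.
                actions.add("create " + e.getKey() + " workers=" + e.getValue());
            } else if (!running.equals(e.getValue())) {
                // Running with the wrong size: scale it.
                actions.add("scale " + e.getKey() + " " + running + "->" + e.getValue());
            }
        }
        for (String name : new TreeMap<>(actual).keySet()) {
            if (!desired.containsKey(name)) {
                // Running but no longer declared: delete it.
                actions.add("delete " + name);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        Map<String, Integer> desired = Map.of("etl", 3, "adhoc", 2);
        Map<String, Integer> actual = Map.of("etl", 1, "stale", 4);
        reconcile(desired, actual).forEach(System.out::println);
    }
}
```

In an actual operator this comparison would run every time a watch event arrives for a custom resource, so the cluster keeps converging on whatever the resources declare.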
