spark-dev mailing list archives

From Marcelo Vanzin <van...@cloudera.com>
Subject Re: Kubernetes: why use init containers?
Date Wed, 10 Jan 2018 21:23:00 GMT
On Wed, Jan 10, 2018 at 1:10 PM, Matt Cheah <mcheah@palantir.com> wrote:
> I’d imagine this is a reason why YARN hasn’t gone with using spark-submit from the
> application master...

I wouldn't use YARN as a template to follow when writing a new
backend. The YARN backend works the way it does largely because of
backwards compatibility. IMO it would be much
better to change the YARN backend to use spark-submit, because it
would immensely simplify the code there. It was a nightmare to get
YARN to reach feature parity with other backends because it has to
pretty much reimplement everything.

But doing that would break pretty much every Spark-on-YARN deployment,
so it's not something we can do right now.

For the other backends the situation is sort of similar; it probably
wouldn't be hard to change standalone's DriverWrapper to also use
spark-submit. But that brings potential side effects for existing
users that don't exist with spark-on-k8s, because spark-on-k8s is new
(the current fork aside).

> But using init-containers makes it such that we don’t need to use spark-submit at all

Those are actually separate concerns. There are a whole bunch of
things that spark-submit provides you that you'd have to replicate in
the k8s backend if not using it. Things like properly handling special
characters in arguments, native library paths, "userClassPathFirst",
etc. You get them almost for free with spark-submit, and using an init
container does not solve any of those for you.
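
To make the "special characters in arguments" point concrete, here is a
minimal Python sketch of the kind of quoting a backend has to get right
once it builds the driver command line itself. It is only an illustration,
not how Spark or the k8s backend does it: the helper names, the class name,
the jar path and the native library directory are all made up. The
spark-submit options used (--deploy-mode, --class, --driver-library-path,
--conf spark.driver.userClassPathFirst) are real ones.

```python
import shlex


def build_driver_argv(main_class, app_jar, app_args, native_lib_dir,
                      user_classpath_first=True):
    """Build a spark-submit argv list (hypothetical helper, for illustration)."""
    argv = [
        "spark-submit",
        "--deploy-mode", "client",
        "--class", main_class,
        # Native library path for the driver JVM.
        "--driver-library-path", native_lib_dir,
        # Prefer user jars over Spark's own on the driver classpath.
        "--conf",
        "spark.driver.userClassPathFirst=" + str(user_classpath_first).lower(),
        app_jar,
    ]
    # Application arguments are passed through untouched, one list entry each.
    argv.extend(app_args)
    return argv


def to_shell_line(argv):
    """Render argv as a single POSIX shell command line with safe quoting."""
    return shlex.join(argv)


# Arguments that break naive string concatenation: quotes, spaces,
# glob characters, a shell variable reference, and an empty string.
tricky_args = ['--filter=name = "O\'Brien"', "path with spaces/*.csv", "$HOME", ""]

argv = build_driver_argv(
    "com.example.Main",      # made-up class name
    "/opt/app/app.jar",      # made-up jar path
    tricky_args,
    "/opt/native",           # made-up native library directory
)
line = to_shell_line(argv)

# Splitting the rendered line the way a POSIX shell would must give back
# exactly the original arguments.
assert shlex.split(line) == argv
print(line)
```

Every hop that turns an argument list into a string and back (a container
command, an environment variable, a launcher script) needs this kind of
round-trip guarantee, and that is only one of the items listed above.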

I'd say that using spark-submit is really not up for discussion here;
it saves you from re-implementing a whole bunch of code that you
shouldn't even be trying to re-implement.

Separately, if there is a legitimate need for an init container, then
it can be added. But I don't see that legitimate need right now, so I
don't see what it's bringing other than complexity.

(And no, "the k8s documentation mentions that init containers are
sometimes used to download dependencies" is not a legitimate need.)

-- 
Marcelo
