spark-dev mailing list archives

From Anirudh Ramanathan <ramanath...@google.com.INVALID>
Subject Re: Kubernetes backend and docker images
Date Mon, 08 Jan 2018 17:47:22 GMT
+matt +tim
For reference - here's our previous thread on this dockerfile unification
problem - https://github.com/apache-spark-on-k8s/spark/pull/60
I think this approach should be acceptable from both the customization and
visibility perspectives.
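
To make the two options from the quoted discussion below concrete, here is a
minimal sketch, not the code from Marcelo's branch. It assumes the backend
passes a role name to one shared image through the container args, and that
the image's single entrypoint branches on that role while reading its
remaining inputs from environment variables. Every environment variable name,
the container naming scheme, and the default classpath are hypothetical and
chosen only for illustration.

```python
# Illustrative sketch only: all EXAMPLE_* variable names, the container
# naming scheme and the default classpath are hypothetical, not Spark's.

ROLES = ("driver", "executor", "init")


def container_spec(image, role):
    """Container entry a backend could emit: one image, role passed via args.

    In Kubernetes, `args` replaces the image's CMD while the image's
    ENTRYPOINT is kept, so the entrypoint receives the role as an argument.
    """
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return {"name": f"spark-{role}", "image": image, "args": [role]}


def build_command(role, env):
    """Return the argv a single branching entrypoint could exec for a role."""
    java = ["java", "-cp", env.get("EXAMPLE_CLASSPATH", "/opt/spark/jars/*")]
    if role == "driver":
        # Arguments are propagated through an environment variable.
        extra = env.get("EXAMPLE_DRIVER_ARGS", "").split()
        return java + [env["EXAMPLE_DRIVER_CLASS"]] + extra
    if role == "executor":
        return java + [env["EXAMPLE_EXECUTOR_CLASS"], env["EXAMPLE_DRIVER_URL"]]
    if role == "init":
        return java + [env["EXAMPLE_INIT_CLASS"], env["EXAMPLE_INIT_CONFIG"]]
    raise ValueError(f"unknown role: {role!r}")
```

The branching stays visible in one place (the entrypoint) instead of being
spread across three Dockerfiles or hidden in the backend code, which is the
transparency point raised in the thread.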


On Mon, Jan 8, 2018 at 9:40 AM, Anirudh Ramanathan <ramanathana@google.com>
wrote:

> +1
>
> We discussed some alternatives early on - including using a single
> dockerfile and different spec.container.command and spec.container.args
> from the Kubernetes driver/executor specification (which override
> entrypoint in docker). No reason that won't work also - except that it
> reduced the transparency of what was being invoked in the
> driver/executor/init by hiding it in the actual backend code.
>
> Putting it into a single entrypoint file and branching lets us realize
> the best of both worlds I think. This is an elegant solution, thanks
> Marcelo.
>
> On Jan 6, 2018 10:01 AM, "Felix Cheung" <felixcheung_m@hotmail.com> wrote:
>
>> +1
>>
>> Thanks for taking on this.
>> That was my feedback on one of the long comment threads as well. I think
>> we should have one docker image instead of 3 (also pending in the fork are
>> python and R variants; we should consider having one that we officially
>> release instead of 9, for example)
>>
>>
>> ------------------------------
>> *From:* 蒋星博 <jiangxb1987@gmail.com>
>> *Sent:* Friday, January 5, 2018 10:57:53 PM
>> *To:* Marcelo Vanzin
>> *Cc:* dev
>> *Subject:* Re: Kubernetes backend and docker images
>>
>> Agreed, it would be nice to have this simplification, and users can still
>> create their custom images by copying/modifying the default one.
>> Thanks for bringing this up, Marcelo!
>>
>> 2018-01-05 17:06 GMT-08:00 Marcelo Vanzin <vanzin@cloudera.com>:
>>
>>> Hey all, especially those working on the k8s stuff.
>>>
>>> Currently we have 3 docker images that need to be built and provided
>>> by the user when starting a Spark app: driver, executor, and init
>>> container.
>>>
>>> When the initial review went by, I asked why we need 3, and I was
>>> told that's because they have different entry points. That never
>>> really convinced me, but well, everybody wanted to get things in to
>>> get the ball rolling.
>>>
>>> But I still think that's not the best way to go. I did some pretty
>>> simple hacking and got things to work with a single image:
>>>
>>> https://github.com/vanzin/spark/commit/k8s-img
>>>
>>> Is there a reason why that approach would not work? You could still
>>> create separate images for driver and executor if wanted, but there's
>>> no reason I can see why we should need 3 images for the simple case.
>>>
>>> Note that the code there can be cleaned up still, and I don't love the
>>> idea of using env variables to propagate arguments to the container,
>>> but that works for now.
>>>
>>> --
>>> Marcelo
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>
>>>
>>


-- 
Anirudh Ramanathan
