airflow-dev mailing list archives

From Daniel Imberman <daniel.imber...@gmail.com>
Subject Re: [Discuss] Airflow Kubernetes worker configuration should be parsed from YAML
Date Tue, 12 Mar 2019 03:50:53 GMT
I agree there should be an option with more flexibility than the current
k8s pod operator. The design was a mixture of trying to introduce k8s to a
larger community by abstracting away details and, frankly, my own
inexperience with the technology (I had only been using it for 3 months
when this effort started).

I see two possibilities for a more flexible k8s operator: either allowing
users to provide links to YAML files or allowing users to create pod
objects using the k8s Python client. Both have their pluses and minuses,
and I would love to have this be a larger discussion among the k8s+airflow
community.

As for a helm operator: we avoided helm at the time because it was
pretty clunky with the tiller architecture, but with helm 3 I'd be very
open to discussing further. I agree one big issue is that helm is designed
with deployments in mind rather than single tasks, but that could also be
worth a discussion with the helm team.

Also hey @Grant Nicholas <grant.nicholas@nielsen.com> glad to see you
involved with the project again :).

On Wed, Mar 6, 2019 at 4:11 PM Grant Nicholas <
grantnicholas2015@u.northwestern.edu> wrote:

> The challenge with using yaml to define the pod spec is we need to inject
> values into the yaml in order for the pod to work properly.
>
> For example, if you try setting the command property, then the pod will not
> actually run the airflow command to start the task. The same applies to
> required environment variables, volume mounts, etc.
>
> This can be worked around by validating the yaml upfront against a
> black/white list of properties, but it would require some work.
>
> On Wed, Mar 6, 2019, 4:11 PM Kyle Hamlin <hamlin.kn@gmail.com> wrote:
>
> > Would be great if this also worked for KubernetesExecutor config. I made
> > a PR: https://github.com/apache/airflow/pull/4456 to add a
> > default_executor_config because it doesn't make much sense configuring
> > every operator with the same config. I think it would be much more
> > preferable to use YAML, still use the merge functionality, and be able to
> > have much more control over worker pod configuration, for example adding
> > additional labels and the like.
> >
> > On Wed, Mar 6, 2019 at 1:14 PM Marwan Nabil <mrwanbaghdad76@gmail.com>
> > wrote:
> >
> > > Thanks for starting the discussion.
> > >
> > > I think it would be great. And a great first step would be to supply
> > > the filepath.
> > >
> > > A decorator approach would be suitable I think and would allow for
> > > extensibility and would allow for much more than yaml. I'm thinking
> > > helm charts :D
> > >
> > > On 2019/03/06 14:58:03, da...@ssense.com <d...@ssense.com> wrote:
> > > > Hi,
> > > >
> > > > I would like to discuss parsing YAML for the Kubernetes worker
> > > > configuration instead of the current process of programmatically
> > > > generating the YAML from the Pod and PodRequest Factory.
> > > >
> > > > *Motivation:*
> > > >
> > > > Kubernetes configuration is quite complex. Instead of using the
> > > > configuration system that Kubernetes offers natively (YAML), the
> > > > current method involves programmatically recreating this YAML file.
> > > > Fully re-implementing the configuration in Airflow is taking a lot of
> > > > time, and at the moment many features available through YAML
> > > > configuration are not available in Airflow. Furthermore, as the
> > > > Kubernetes API evolves, the Airflow codebase will have to change with
> > > > it, and Airflow will be in a constant state of catching up on missing
> > > > features. This can all be solved by simply parsing the YAML file.
> > > >
> > > > *Idea:*
> > > >
> > > > Either pass in the YAML as a string or provide a path to the YAML file.
> > > >
> > >
> >
> >
> > --
> > Kyle Hamlin
> >
>
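
Grant's point in the quoted thread (user-supplied YAML must not override the values Airflow needs, and a property list could enforce that up front) can be illustrated with a rough sketch. Nothing below is Airflow's actual implementation: the function names, the PROTECTED_CONTAINER_FIELDS set, the image name, and the injected command and environment variable are all assumptions chosen for illustration. The pod spec is shown as the plain dict a YAML parser would produce, so the sketch needs only the standard library.

```python
import copy

# Hypothetical list of container fields that only Airflow may set.
PROTECTED_CONTAINER_FIELDS = {"command", "args"}


def validate_user_pod_spec(pod):
    """Return a list of violations; an empty list means the spec is acceptable."""
    problems = []
    containers = pod.get("spec", {}).get("containers", [])
    if not containers:
        problems.append("spec.containers must define at least one container")
    for i, container in enumerate(containers):
        # Report every protected field the user tried to set.
        for field in sorted(PROTECTED_CONTAINER_FIELDS.intersection(container)):
            problems.append(f"spec.containers[{i}].{field} is reserved for Airflow")
    return problems


def inject_airflow_values(pod, command, env):
    """Return a copy of the pod with the task command and required env added."""
    result = copy.deepcopy(pod)  # never mutate the user's parsed YAML
    base = result["spec"]["containers"][0]
    base["command"] = list(command)
    # Keep the user's env entries unless Airflow needs the same variable name.
    merged = {entry["name"]: entry for entry in base.get("env", [])}
    for name, value in env.items():
        merged[name] = {"name": name, "value": value}
    base["env"] = list(merged.values())
    return result


# The dict a YAML parser would return for a small user-provided pod file.
user_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"labels": {"team": "data"}},
    "spec": {
        "containers": [
            {
                "name": "base",
                "image": "example/airflow-worker:latest",  # placeholder image
                "env": [{"name": "TZ", "value": "UTC"}],
            }
        ]
    },
}

problems = validate_user_pod_spec(user_pod)
if not problems:
    final_pod = inject_airflow_values(
        user_pod,
        # Illustrative task command and setting, not taken from the thread.
        command=["airflow", "run", "example_dag", "example_task", "2019-03-06"],
        env={"AIRFLOW__CORE__EXECUTOR": "LocalExecutor"},
    )
```

Rejecting protected fields up front, rather than silently overwriting them, keeps the failure visible to whoever wrote the YAML. User-only settings such as extra labels pass through untouched, which is the kind of control Kyle asks for above.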
