airflow-dev mailing list archives

From Kamil Breguła <kamil.breg...@polidea.com>
Subject Re: [2.0 spring cleaning] Remove remote mode in Airflow CLI and in-core API Client
Date Wed, 12 Aug 2020 16:37:25 GMT
I started this thread mainly to discuss what we want to do with this remote
mode prior to the Airflow 2.0 release, because it relies on the
experimental API, which will be deprecated.

In my opinion, we have several options:
a) Delete this mode as unused and unsupported.
b) Rewrite the in-core API client to support the stable API.
c) Prepare an OpenAPI-based client and rewrite the CLI to use it.
d) Leave it as is.

There were various expectations on the mailing list about this mode, but I
haven't seen anyone actively contribute to it. This raises further
questions for me. Does anyone use this mode in its current form? If no one
is using it, I think we can take more radical steps and start with a blank
slate. That would make it easier to begin the work and to iterate on it
faster in the future. This looks like a simple task, but if we want to be
sure that no breaking changes are made, we should pay off some technical
debt and increase test coverage before we think about making more
changes. That may not be necessary if we choose a different path of
development.

I think now is a good time to make some decisions, if someone is actually
interested in developing these features. I do not think we need a precise
vision for its development if this feature is currently not used by
anyone and no one is really interested in developing it.
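
To make option (c) a bit more concrete, here is a minimal sketch of a
remote-only command that never imports the airflow package. Everything in
it is an assumption for illustration: the `airflow-remote` program name,
the `build_pools_request` helper, the `/api/v1/pools` path, and HTTP Basic
auth (API auth has not been decided yet). A generated OpenAPI client would
replace the hand-built request.

```python
import argparse
import base64
import json
import urllib.request


def build_pools_request(base_url, user, password):
    """Build (but do not send) a GET request that lists pools."""
    # ASSUMPTION: the stable API exposes pools under /api/v1/pools.
    url = base_url.rstrip("/") + "/api/v1/pools"
    # ASSUMPTION: HTTP Basic auth; the real auth scheme is still undecided.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        headers={"Authorization": "Basic " + token, "Accept": "application/json"},
        method="GET",
    )


def main(argv=None):
    # Hypothetical program name; only the standard library is imported,
    # so no Airflow installation is needed on the client side.
    parser = argparse.ArgumentParser(prog="airflow-remote")
    parser.add_argument("--base-url", required=True)
    parser.add_argument("--user", required=True)
    parser.add_argument("--password", required=True)
    args = parser.parse_args(argv)

    request = build_pools_request(args.base_url, args.user, args.password)
    # Send the request and pretty-print the JSON response body.
    with urllib.request.urlopen(request) as response:
        print(json.dumps(json.load(response), indent=2))
```

Calling `main(["--base-url", "http://localhost:8080", "--user", "admin",
"--password", "secret"])` would query a running instance. Because the
module imports only the standard library, it avoids both the heavy
dependency set and the logger reconfiguration that importing airflow
causes.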

On Wed, Aug 12, 2020 at 7:16 AM QP Hou <qph@scribd.com> wrote:

> I think it's best to divide the discussion into two separate topics.
>
> The first is to replace the existing json_client with the new,
> to-be-created official Airflow Python Client backed by the new RESTful
> API. This IMHO is a must-have, considering we are going to deprecate the
> experimental API going forward.
>
> The second topic is to create a better CLI experience leveraging the new
> APIs. This is much more controversial. I remember us having a
> similar discussion on the dev list a year ago, which didn't get much
> traction. It's not possible to fit all existing CLI functionality into
> REST APIs. The DB utils use case that Ash mentioned is a great example. So
> I think one potential solution is to split the CLI into two. Keep the
> existing CLI as the admin/management CLI, which can communicate directly
> with the DB and tap into the Airflow core code base. On the other hand, we
> can create a separate user-facing CLI that's lightweight, fast, and
> remote-only. It doesn't even need to be written in Python, which would
> make it easier to distribute as a single binary.
>
> Thanks,
> QP Hou
>
>
> On Tue, Aug 11, 2020 at 12:44 PM Ash Berlin-Taylor <ash@apache.org> wrote:
>
> > -1 from me without a firm plan for how we will replace it.
> >
> > I see keeping it and extending to use the new API would ensure that
> > everything the CLI can do locally (i.e. when airflow webserver isn't up
> > yet, with the ) also works over the API with the exception of db
> > utilities.
> >
> > -ash
> >
> > On 11 August 2020 20:05:56 BST, QP Hou <qph@scribd.com> wrote:
> > >+1 for replacing the existing remote mode client with the OpenAPI-based
> > >client. IMO, we don't really have other options here because the
> > >experimental API will be deprecated in the future.
> > >
> > >For OpenAPI-based Airflow REST clients, the current plan is to maintain
> > >all the code gen automation within the main source tree [1], then use it
> > >to populate each individual language-specific client repo, like the Go
> > >client mentioned earlier. So far, we have the Go client completed and
> > >validated to make sure this development flow will meet our needs. The
> > >next step I think the community should focus on is getting API auth
> > >implemented [2] before we move on to generating the Python client. How we
> > >do API auth could have a big impact on client code gen automation, so it
> > >is worth waiting for.
> > >
> > >Once we have authentication implemented in both Airflow core and the
> > >clients, we should be all set to start doing version releases for our
> > >API clients.
> > >
> > >That said, adopting OpenAPI-based clients in the CLI alone won't address
> > >the issue of the CLI depending on a full Airflow installation. Some of
> > >the CLI commands, like `dags test`, depend on a full Airflow installation
> > >by design. We will have to either develop a separate CLI intended for
> > >remote-only use or add a flag to the existing CLI so it can run in a pure
> > >remote mode, where it would disable loading of code that requires an
> > >Airflow installation entirely.
> > >
> > >[1]: https://github.com/apache/airflow/tree/master/clients
> > >[2]: https://github.com/apache/airflow/issues/8112
> > >
> > >On Tue, Aug 11, 2020 at 11:05 AM Kamil Breguła <kamil.bregula@polidea.com> wrote:
> > >
> > >> Hello,
> > >>
> > >> I think we should remove remote mode in CLI and in-core API Client
> > >> (airflow.api.client package).
> > >> Here are the docs about remote mode:
> > >> https://airflow.readthedocs.io/en/latest/usage-cli.html#set-up-connection-to-a-remote-airflow-instance
> > >>
> > >> Since these features were introduced, they have never been actively
> > >> developed, and I don't think they are widely used. At the same time,
> > >> Apache Airflow is evolving, and this code stands out more and more from
> > >> the rest.
> > >>
> > >> My main reservations about these features:
> > >> - Remote mode/in-core API Client is rarely used. I asked a few people
> > >>   and none of them used it in production. Does anyone use it?
> > >> - A very small number of commands are available (only 7 pools commands
> > >>   and 2 dags commands).
> > >> - Remote mode/API Client depends on the experimental REST API.
> > >> - Remote mode/API Client is handwritten code that is difficult to
> > >>   maintain.
> > >> - There is no documentation for the API client.
> > >> - Remote mode/API Client has low test coverage.
> > >> - Remote mode does not provide a good level of security, because it
> > >>   depends on the experimental API. There is only authentication, and an
> > >>   authenticated user can perform any operation.
> > >> - It requires full Airflow to be installed along with a large number of
> > >>   unnecessary dependencies. Some of them are difficult to install in
> > >>   some environments, e.g. setproctitle on Windows.
> > >> - Using this client API changes the logger configuration because it
> > >>   requires importing the airflow package.
> > >>
> > >> I think this remote mode in the CLI is something valuable, but I think
> > >> we can do it in a different way in the future, e.g. generate a CLI/API
> > >> client based on the OpenAPI specification.
> > >>
> > >> Generated API clients can be installed independently of Airflow and
> > >> will be easier to maintain. We already have one API client, for Go,
> > >> implemented in this way, so clients for new languages will only build
> > >> on this idea.
> > >> - https://github.com/apache/airflow-client-go
> > >>
> > >> I will be happy to discuss the vision of the development of these two
> > >> things. How do we want to develop these two things?
> > >>
> > >> Best regards,
> > >> Kamil Bregula
> > >>
> >
>
