airflow-dev mailing list archives

From Ry Walker ...@rywalker.com>
Subject Re: [2.0 spring cleaning] Remove remote mode in Airflow CLI and in-core API Client
Date Thu, 13 Aug 2020 19:17:39 GMT
Yes I agree the new API would be made default, and experimental would be
disabled by default.

Before the new API goes live, we should allow minor improvements to the
Experimental API (for example https://github.com/apache/airflow/pull/10308,
which addresses to some degree the security hole in the experimental API
for users of Airflow RBAC) which could change your last sentence a little,
Kamil.


On Thu, Aug 13, 2020 at 1:54 PM Kamil Breguła <kamil.bregula@polidea.com>
wrote:

> I agree. We do not have to completely delete the experimental API in
> Airflow 2.0, but I think it is worth turning off by default so that the
> user has to make a conscious decision that they want to use the API, which
> provides a limited level of security: no permission control, and all
> authorized users have full power.
>
> On Thu, Aug 13, 2020 at 7:42 PM QP Hou <qph@scribd.com> wrote:
>
> > Yeah, I am also curious to know more about the reason why we want to nuke
> > the experimental api code soon instead of just marking it as deprecated.
> >
> > As for getting more insights into remote mode cli usage, would it make
> > sense to make it part of the airflow user survey?
> >
> > Thanks,
> > QP Hou
> >
> >
> > On Thu, Aug 13, 2020 at 7:18 AM Ry Walker <ry@rywalker.com> wrote:
> >
> > > I would think we would deprecate the old API once we say the new API is
> > > “ready to go” - and leave it in place a while as users transition to the
> > > new API. Why is there an urgency to remove it from the codebase?
> > >
> > > On Thu, Aug 13, 2020 at 5:46 AM Ash Berlin-Taylor <ash@apache.org> wrote:
> > >
> > > > Removing the experimental API is a fundamental breaking change to users'
> > > > workflows, and so we should remove it before 2.0.
> > > >
> > > > -ash
> > > >
> > > > On 13 August 2020 10:14:02 BST, Jarek Potiuk <Jarek.Potiuk@polidea.com>
> > > > wrote:
> > > > >And I think we should possibly make the whole experimental API
> > > > >deprecated in 1.10.12?
> > > > >
> > > > >On Thu, Aug 13, 2020 at 11:12 AM Jarek Potiuk
> > > > ><Jarek.Potiuk@polidea.com>
> > > > >wrote:
> > > > >
> > > > >> I think, no matter what, maybe we should simply make it deprecated in
> > > > >> the upcoming (today?) release of 1.10.12? Then we can decide whether,
> > > > >> in a potential 1.10.13, we remove it or leave it as it is.
> > > > >>
> > > > >> On Wed, Aug 12, 2020 at 6:37 PM Kamil Breguła
> > > > ><kamil.bregula@polidea.com>
> > > > >> wrote:
> > > > >>
> > > > >>> I started this thread mainly to discuss what we want to do with this
> > > > >>> remote mode prior to the Airflow 2.0 release. This is mainly due to the
> > > > >>> fact that it uses an experimental API which will be deprecated.
> > > > >>>
> > > > >>> In my opinion, we have several solutions.
> > > > >>> a) Delete this mode as unused and not supported.
> > > > >>> b) Rewrite in-core API client to support stable API
> > > > >>> c) Prepare OpenAPI based client and rewrite CLI to use it
> > > > >>> d) Leave as-is
> > > > >>>
> > > > >>> There were various expectations on the mailing list about this mode,
> > > > >>> but I haven't seen anyone actively contribute to it. This brings me to
> > > > >>> further questions. Does anyone use this mode in its current form? If
> > > > >>> no one is using it, I think we can take more radical steps to start
> > > > >>> with a blank page. This will make it easier to start work and be able
> > > > >>> to iterate over it faster in the future. This looks like a simple
> > > > >>> task, but if we want to be sure that no breaking changes are made, we
> > > > >>> should pay off some technical debt and increase test coverage before
> > > > >>> we can think about making more changes. It may not be necessary if we
> > > > >>> choose a different path of development.
> > > > >>>
> > > > >>> I think now is a good time to try to make some decisions if someone is
> > > > >>> actually interested in developing these features. I do not think we
> > > > >>> need a precise vision of the development, if currently this feature is
> > > > >>> not used by anyone and no one is really interested in its development.
> > > > >>>
> > > > >>> On Wed, Aug 12, 2020 at 7:16 AM QP Hou <qph@scribd.com> wrote:
> > > > >>>
> > > > >>> > I think it's best to divide the discussion into two separate topics.
> > > > >>> >
> > > > >>> > The first one is to replace the existing json_client with the new,
> > > > >>> > yet-to-be-created official Airflow Python Client backed by the new
> > > > >>> > RESTful API. This IMHO is a must-have, considering we are to
> > > > >>> > deprecate the experimental API going forward.
> > > > >>> >
> > > > >>> > The second topic is to create a better CLI experience leveraging the
> > > > >>> > new APIs. This is much more controversial. I remember us having a
> > > > >>> > similar discussion on the dev list a year ago, which didn't get much
> > > > >>> > traction. It's not possible to fit all existing CLI functionalities
> > > > >>> > into REST APIs. The DB utils use case that Ash mentioned is a great
> > > > >>> > example. So I think one potential solution is to split the CLI into
> > > > >>> > two. Keep the existing CLI as the admin/management CLI that can
> > > > >>> > communicate directly with the DB and tap into the airflow core code
> > > > >>> > base. On the other hand, we can create a separate user-facing CLI
> > > > >>> > that's lightweight, fast and remote only. It doesn't even need to be
> > > > >>> > written in Python, which would make it easier to distribute as a
> > > > >>> > single binary.
> > > > >>> >
> > > > >>> > Thanks,
> > > > >>> > QP Hou
> > > > >>> >
> > > > >>> >
> > > > >>> > On Tue, Aug 11, 2020 at 12:44 PM Ash Berlin-Taylor <ash@apache.org>
> > > > >>> > wrote:
> > > > >>> >
> > > > >>> > > -1 from me without a firm plan how we will replace it.
> > > > >>> > >
> > > > >>> > > I see keeping it and extending to use the new API would ensure
> > > > >>> > > that everything the CLI can do locally (i.e. when airflow
> > > > >>> > > webserver isn't up yet, with the ) also works over the API with
> > > > >>> > > the exception of db utilities.
> > > > >>> > >
> > > > >>> > > -ash
> > > > >>> > >
> > > > >>> > > On 11 August 2020 20:05:56 BST, QP Hou <qph@scribd.com> wrote:
> > > > >>> > > >+1 for replacing the existing remote mode client with the
> > > > >>> > > >OpenAPI-based client. IMO, we don't really have other options
> > > > >>> > > >here because the experimental API will be deprecated in the
> > > > >>> > > >future.
> > > > >>> > > >
> > > > >>> > > >For OpenAPI based Airflow REST clients, the current plan is to
> > > > >>> > > >maintain all the code gen automation within the main source tree
> > > > >>> > > >[1], then use it to populate each individual language specific
> > > > >>> > > >client repo like the go client mentioned earlier. So far, we have
> > > > >>> > > >the go client completed and validated to make sure this
> > > > >>> > > >development flow will meet our needs. The next step I think the
> > > > >>> > > >community should focus on is getting API auth implemented [2]
> > > > >>> > > >before we move on to generate the python client. How we do API
> > > > >>> > > >auth could have a big impact on client code gen automation, so it
> > > > >>> > > >is worth waiting for.
> > > > >for.
> > > > >>> > > >
> > > > >>> > > >Once we have authentication implemented in both Airflow core and
> > > > >>> > > >clients, we should be all good to start doing version releases
> > > > >>> > > >for our API clients.
> > > > >>> > > >
> > > > >>> > > >That said, adopting open api based clients in the CLI alone won't
> > > > >>> > > >address the issue of CLI depending on full airflow installation.
> > > > >>> > > >Some of the cli commands like `dags test` depend on a full
> > > > >>> > > >airflow installation by design. We will have to either develop a
> > > > >>> > > >separate CLI intended for remote only use or add a flag in the
> > > > >>> > > >existing cli so it can run in a pure remote mode where it would
> > > > >>> > > >disable loading of code that requires airflow installation
> > > > >>> > > >entirely.
> > > > >>> > > >
> > > > >>> > > >[1]: https://github.com/apache/airflow/tree/master/clients
> > > > >>> > > >[2]: https://github.com/apache/airflow/issues/8112
> > > > >>> > > >
> > > > >>> > > >On Tue, Aug 11, 2020 at 11:05 AM Kamil Breguła
> > > > >>> > > ><kamil.bregula@polidea.com>
> > > > >>> > > >wrote:
> > > > >>> > > >
> > > > >>> > > >> Hello,
> > > > >>> > > >>
> > > > >>> > > >> I think we should remove remote mode in CLI and in-core API
> > > > >>> > > >> Client (airflow.api.client package).
> > > > >>> > > >> Here are the docs about remote mode:
> > > > >>> > > >>
> > > > >>> > > >> https://airflow.readthedocs.io/en/latest/usage-cli.html#set-up-connection-to-a-remote-airflow-instance
> > > > >>> > > >>
> > > > >>> > > >> Since these features were introduced, they have never been
> > > > >>> > > >> actively developed and I don't think they are widely used. At
> > > > >>> > > >> the same time, Apache Airflow is evolving, and this code stands
> > > > >>> > > >> out more and more from the rest.
> > > > >>> > > >>
> > > > >>> > > >> My main reservations about these features:
> > > > >>> > > >> - Remote mode/in-core API Client is rarely used. I asked a few
> > > > >>> > > >>   people and none of them used it in production. Does anyone
> > > > >>> > > >>   use it?
> > > > >>> > > >> - A very small number of commands are available (7 pools
> > > > >>> > > >>   commands and 2 dags commands only)
> > > > >>> > > >> - Remote mode/API Client depends on experimental REST API.
> > > > >>> > > >> - Remote mode/API Client is handwritten code that is difficult
> > > > >>> > > >>   to maintain.
> > > > >>> > > >> - No documentation for API client
> > > > >>> > > >> - Remote mode/API Client has low test coverage.
> > > > >>> > > >> - Remote mode does not provide a good level of security,
> > > > >>> > > >>   because it depends on the experimental API. There is only
> > > > >>> > > >>   authentication, and the authenticated user can perform any
> > > > >>> > > >>   operation.
> > > > >>> > > >> - Requires full Airflow to be installed along with a large
> > > > >>> > > >>   number of unnecessary dependencies. Some of them are
> > > > >>> > > >>   difficult to install in some environments, e.g. setproctitle
> > > > >>> > > >>   on Windows
> > > > >>> > > >> - Using this client API changes the logger configuration
> > > > >>> > > >>   because it requires importing the airflow package.
> > > > >>> > > >>
> > > > >>> > > >> I think this remote mode in CLI is something valuable, but I
> > > > >>> > > >> think we can do it in a different way in the future, e.g.
> > > > >>> > > >> generate a CLI/API Client based on the OpenAPI specification.
> > > > >>> > > >>
> > > > >>> > > >> Generated API clients can be installed independently of airflow
> > > > >>> > > >> and will be easier to maintain. We already have one API client
> > > > >>> > > >> for golang implemented in this way, so new languages will only
> > > > >>> > > >> be developing this idea.
> > > > >>> > > >> - https://github.com/apache/airflow-client-go
> > > > >>> > > >>
> > > > >>> > > >> I will be happy to discuss the vision of the development of
> > > > >>> > > >> these two things. How do we want to develop these two things?
> > > > >>> > > >>
> > > > >>> > > >> Best regards,
> > > > >>> > > >> Kamil Bregula
> > > > >>> > > >>
> > > > >>> > >
> > > > >>> >
> > > > >>>
> > > > >>
> > > > >>
> > > > >> --
> > > > >>
> > > > >> Jarek Potiuk
> > > > >> Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > > >>
> > > > >> M: +48 660 796 129 <+48660796129>
> > > > >> [image: Polidea] <https://www.polidea.com/>
> > > > >>
> > > > >>
> > > > >
> > > > >--
> > > > >
> > > > >Jarek Potiuk
> > > > >Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > > >
> > > > >M: +48 660 796 129 <+48660796129>
> > > > >[image: Polidea] <https://www.polidea.com/>
> > > >
> > > --
> > > Sent from Gmail Mobile
> > >
> >
>
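
The "disabled by default, enabled only by a conscious decision" idea
discussed in this thread could be sketched roughly as follows. This is only
an illustration, not Airflow's implementation: the [api] section, the
enable_experimental_api option name, and both helper functions are
assumptions made for the example.

```python
import configparser

# Assumed shipped defaults: the experimental API is off unless the user opts in.
# Section and option names are illustrative, not confirmed Airflow settings.
DEFAULT_CONFIG = """
[api]
enable_experimental_api = False
"""


def experimental_api_enabled(user_config: str = "") -> bool:
    """Return True only when the user's config explicitly enables the API."""
    parser = configparser.ConfigParser()
    parser.read_string(DEFAULT_CONFIG)  # load the defaults first
    if user_config:
        parser.read_string(user_config)  # user values override the defaults
    return parser.getboolean("api", "enable_experimental_api")


def handle_experimental_request(user_config: str = "") -> int:
    """Hypothetical gate: return an HTTP-style status for an experimental call."""
    return 200 if experimental_api_enabled(user_config) else 403
```

With no user configuration the gate rejects the request; a user has to add
the option to their own config to turn the experimental endpoints back on.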
