www-announce mailing list archives

From Sally Khudairi ...@apache.org>
Subject The Apache Software Foundation Announces Apache® Airflow™ as a Top-Level Project
Date Tue, 08 Jan 2019 11:00:52 GMT
[this announcement is available online at https://s.apache.org/LeeE ]

Open Source Big Data workflow management system in use at Adobe, Airbnb, Etsy, Google, ING,
Lyft, PayPal, Reddit, Square, Twitter, and United Airlines, among others.

Wakefield, MA —8 January 2019— The Apache Software Foundation (ASF), the all-volunteer
developers, stewards, and incubators of more than 350 Open Source projects and initiatives,
announced today Apache® Airflow™ as a Top-Level Project (TLP).

Apache Airflow is a flexible, scalable workflow automation and scheduling system for authoring
and managing Big Data processing pipelines of hundreds of petabytes. Graduation from the Apache
Incubator as a Top-Level Project signifies that the Apache Airflow community and products
have been well-governed under the ASF's meritocratic process and principles.

"Since its inception, Apache Airflow has quickly become the de-facto standard for workflow
orchestration," said Bolke de Bruin, Vice President of Apache Airflow. "Airflow has gained
adoption among developers and data scientists alike thanks to its focus on configuration-as-code.
That has gained us a community during incubation at the ASF that not only uses Apache Airflow
but also contributes back. This reflects Airflow’s ease of use, scalability, and power of
our diverse community; that it is embraced by enterprises and start-ups alike, allows us to
now graduate to a Top-Level Project."

Apache Airflow is used to easily orchestrate complex computational workflows. Through smart
scheduling, database and dependency management, error handling and logging, Airflow automates
resource management, from single servers to large-scale clusters. Written in Python, the project
is highly extensible and able to run tasks written in other languages, allowing integration
with commonly used architectures and projects such as AWS S3, Docker, Apache Hadoop HDFS,
Apache Hive, Kubernetes, MySQL, Postgres, Apache Zeppelin, and more. Airflow originated at
Airbnb in 2014 and was submitted to the Apache Incubator in March 2016.

Apache Airflow is in use at more than 200 organizations, including Adobe, Airbnb, Astronomer,
Etsy, Google, ING, Lyft, NYC City Planning, PayPal, Polidea, Qubole, Quizlet, Reddit, Reply,
Solita, Square, Twitter, and United Airlines, among others. A list of known users can be found
at https://github.com/apache/incubator-airflow#who-uses-apache-airflow

"Adobe Experience Platform is built on cloud infrastructure leveraging open source technologies
such as Apache Spark, Kafka, Hadoop, Storm, and more," said Hitesh Shah, Principal Architect
of Adobe Experience Platform. "Apache Airflow is a great new addition to the ecosystem of
orchestration engines for Big Data processing pipelines. We have been leveraging Airflow for
various use cases in Adobe Experience Cloud and will soon be looking to share the results
of our experiments of running Airflow on Kubernetes." 

"Our clients just love Apache Airflow. Airflow has been a part of all our Data pipelines created
in the past 2 years, acting as the ring-master and taming our Machine Learning and ETL Pipelines,"
said Kaxil Naik, Data Engineer at Data Reply. "It has helped us create a Single View for our
client's entire data ecosystem. Airflow's Data-aware scheduling and error-handling helped
automate the entire report generation process reliably without any human intervention. It easily
integrates with Google Cloud (and other major cloud providers) as well and allows non-technical
personnel to use it without a steep learning curve because of Airflow’s configuration-as-a-code
paradigm."

"With over 250 PB of data under management, PayPal relies on workflow schedulers such as Apache
Airflow to manage its data movement needs reliably," said Sid Anand, Chief Data Engineer at
PayPal. "Additionally, Airflow is used for a range of system orchestration needs across many
of our distributed systems: needs include self-healing, autoscaling, and reliable [re-]provisioning."

"Since our offering of Apache Airflow as a service in Sept 2016, a lot of big and small enterprises
have successfully shifted all of their workflow needs to Airflow," said Sumit Maheshwari,
Engineering Manager at Qubole. "At Qubole, not only are we a provider, but also a big consumer
of Airflow as well. For example, our whole Insight and Recommendations platform is built around
Airflow only, where we process billions of events every month from hundreds of enterprises
and generate insights for them on big data solutions like Apache Hadoop, Apache Spark, and
Presto. We are very impressed by the simplicity of Airflow and the ease with which it can be integrated
with other solutions like clouds, monitoring systems or various data sources."

"At ING, we use Apache Airflow to orchestrate our core processes, transforming billions of
records from across the globe each day," said Rob Keevil, Data Analytics Platform Lead at
ING WB Advanced Analytics. "Its feature set, Open Source heritage and extensibility make it
well suited to coordinate the wide variety of batch processes we operate, including ETL workflows,
model training, integration scripting, data integrity testing, and alerting. We have played
an active role in Airflow development from the onset, having submitted hundreds of pull requests
to ensure that the community benefits from the Airflow improvements created at ING. We are
delighted to see Airflow graduate from the Apache Incubator, and look forward to seeing where
this exciting project will be taken in future!"

"We saw immediately the value of Apache Airflow as an orchestrator when we started contributing
and using it," said Jarek Potiuk, Principal Software Engineer at Polidea. "Being able to develop
and maintain the whole workflow by engineers is usually a challenge when you have a huge configuration
to maintain. Airflow allows your DevOps to have a lot of fun and still use the standard coding
tools to evolve your infrastructure. This is 'infrastructure as a code' at its best."

"Workflow orchestration is essential to the (big) data era that we live in," added de Bruin.
"The field is evolving quite fast and the new data thinking is just starting to make an impact.
Apache Airflow is a child of the data era and therefore very well positioned, and is also
young so a lot of development can still happen. Airflow can use bright minds from scientific
computing, enterprises, and start-ups to further improve it. Join the community, it is easy
to hop on!"

Availability and Oversight
Apache Airflow software is released under the Apache License v2.0 and is overseen by a self-selected
team of active contributors to the project. A Project Management Committee (PMC) guides the
Project's day-to-day operations, including community development and product releases. For
downloads, documentation, and ways to become involved with Apache Airflow, visit http://airflow.apache.org/
and https://twitter.com/ApacheAirflow

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source
projects, including Apache HTTP Server --the world's most popular Web server software. Through
the ASF's meritocratic process known as "The Apache Way," more than 730 individual Members
and 7,000 Committers across six continents successfully collaborate to develop freely available
enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions
are distributed under the Apache License; and the community actively participates in ASF mailing
lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings,
and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations
and corporate sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, Bloomberg,
Budget Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Hortonworks,
Huawei, IBM, Indeed, Inspur, LeaseWeb, Microsoft, Oath, ODPi, Pineapple Fund, Pivotal, Private
Internet Access, Red Hat, Target, Tencent, and Union Investment. For more information, visit
http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Airflow", "Apache Airflow", and "ApacheCon"
are registered trademarks or trademarks of the Apache Software Foundation in the United States
and/or other countries. All other brands and trademarks are the property of their respective
owners.

# # #

