www-announce mailing list archives

From Sally Khudairi ...@apache.org>
Subject The Apache Software Foundation Announces Apache® Beam™ as a Top-Level Project
Date Tue, 10 Jan 2017 10:53:54 GMT
[this announcement is available online at https://s.apache.org/u67z ]

Unified programming model for batch and streaming Big Data processing, handling data of any
scale, and providing portability across multiple execution engines and environments.

Forest Hill, MD —10 January 2017— The Apache Software Foundation (ASF), the all-volunteer
developers, stewards, and incubators of more than 350 Open Source projects and initiatives,
announced today that Apache® Beam™ has graduated from the Apache Incubator to become a
Top-Level Project (TLP), signifying that the project's community and products have been well-governed
under the ASF's meritocratic process and principles.

Apache Beam is a unified programming model for both batch and streaming data processing. It
includes software development kits in Java and Python for defining the data processing pipelines,
as well as runners to execute them on several execution engines, including Apache Apex, Apache
Flink, Apache Spark, and Google Cloud Dataflow.

"Graduation is an exciting milestone for Apache Beam," said Davor Bonaci, Vice President of
Apache Beam. "Becoming a top-level project is a recognition of the amazing growth of the Apache
Beam community, both in terms of size and diversity. Together we are pushing forward the state
of the art in distributed data processing and, at the same time, enhancing the ability to
interconnect additional storage/messaging systems and execution engines."

The technology behind Apache Beam evolved in large part from Google's internal work on data
processing, tracing its roots all the way back to Google's initial MapReduce system and
its fundamental changes to the science of distributed data processing. It also reflects modern
advances in data processing, embodied in Google's FlumeJava and MillWheel systems, and culminating
with the unified programming model of Google Cloud Dataflow, which became the heart of Apache Beam.

This unified programming model can easily and intuitively express data processing pipelines
for everything from simple batch-based data ingestion to complex event-time-based stream processing.
The abstractions in the model are designed to support efficient parallel execution, while
also cleanly separating the user's processing logic from details of the underlying engine.

Raising the level of abstraction allows a single Apache Beam pipeline to run, without modification,
on multiple execution engines. This portability across diverse execution engines is just one
of many extensibility points that let Apache Beam integrate with the broader Apache and Big
Data ecosystems. Besides runners, developers can already easily add support for additional
IO connectors, libraries of transformations, SDKs, and even domain-specific extensions.

"Apache Beam helps us make stream processing accessible to a broad audience of data engineers,
by offering an API which is comprehensive, easy to reason about and at the same time fully
decoupled from the underlying execution engine," said Assaf Pinhasi, Director of Big Data
Platform at PayPal. "Our data engineers can now focus on what they do best – i.e. express
their processing pipelines easily, and not have to worry about how these get translated to
the complex underlying engine they run on."

"The graduation of Apache Beam as a top-level project is a great achievement and, in the fast-paced
Big Data world we live in, recognition of the importance of a unified, portable, and extensible
abstraction framework to build complex batch and streaming data processing pipelines," said
Laurent Bride, Chief Technology Officer at Talend. "Customers don't like to be locked-in,
so they will appreciate the runtime flexibility Apache Beam provides. With four mature runners
already available and I'm sure more to come, Beam represents the future and will be a key
element of Talend's strategic technology stack moving forward."

"We applaud the Apache Beam working group for its success in creating a unified and consistent
platform for building portable data processing pipelines," said Fausto Ibarra, Director of
Product Management, Google Cloud Platform. "We believe that we all have a responsibility to
share what we're learning, and we are proud and delighted to witness the successful collaboration
to build not only a powerful programming model for processing data from bounded and unbounded
sources, but also a portability layer for running pipelines on many processing engines, including
Apache Spark, Apache Flink, Apache Apex, and Google Cloud Dataflow. Apache Beam's graduation
to Top Level Project is a well-deserved recognition for the individuals and companies who
contributed to the project."

"Apache Beam represents a principled approach for analyzing data streams, simplifying a range
of complex data processing concepts and providing developers with a flexible, straightforward
model," said Kostas Tzoumas, Co-founder and Chief Executive Officer at data Artisans. "The
Apache Flink community wrote one of the first Beam runners, and those of us at data Artisans
have been contributing to the Beam project since its inception."

"The Apache Beam community has quickly adopted the Apache Way and been very welcoming to new
contributors and ideas. It also encourages communication across other projects that collaborate
under the Beam umbrella," said Thomas Weise, Vice President of Apache Apex, and Chief Technology
Officer/Co-Founder of Atrato. "Beam helps the wider ecosystem by establishing common terminology
and well thought through concepts that reflect in multiple runners and even the native API
of the underlying engines."

"In my work at Apache, I have rarely seen an incubating project build a community as well
as the Apache Beam project has done," said Ted Dunning, Vice President of Apache Incubator,
and Chief Application Architect at MapR Technologies. "The way that they have been able to
complement and enhance other streaming data projects is really a credit to everyone involved."

"We'd like to invite you to consider joining us on this exciting ride, whether as a user or
a contributor, as we work towards our first release with API stability," added Bonaci. "If
you'd like to try out Apache Beam today, check out the latest 0.4.0 release. We welcome contribution
and participation from anyone through our mailing lists, issue tracker, pull requests, and

Catch Apache Beam in action at numerous face-to-face meetups and conferences, including Apache:
Big Data North America 2017, DataWorks Summit and Hadoop Summit Munich 2017, Strata + Hadoop
World San Jose and London 2017.

Availability and Oversight
Apache Beam software is released under the Apache License v2.0 and is overseen by a self-selected
team of active contributors to the project. A Project Management Committee (PMC) guides the
Project's day-to-day operations, including community development and product releases. For
project updates, downloads, documentation, and ways to become involved with Apache Beam, visit
https://beam.apache.org/ and @ApacheBeam.

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of
the efforts at The Apache Software Foundation. All code donations from external organizations
and existing external projects wishing to join the ASF enter through the Incubator to: 1)
ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities
that adhere to our guiding principles. Incubation is required of all newly accepted projects
until a further review indicates that the infrastructure, communications, and decision making
process have stabilized in a manner consistent with other successful ASF projects. While incubation
status is not necessarily a reflection of the completeness or stability of the code, it does
indicate that the project has yet to be fully endorsed by the ASF. For more information, visit

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source
projects, including Apache HTTP Server, the world's most popular Web server software. Through
the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members
and 5,900 Committers successfully collaborate to develop freely available enterprise-grade
software, benefiting millions of users worldwide: thousands of software solutions are distributed
under the Apache License; and the community actively participates in ASF mailing lists, mentoring
initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo.
The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate
sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash
Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM,
InMotion Hosting, iSigma, LeaseWeb, Microsoft, OPDi, PhoenixNAP, Pivotal, Private Internet
Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information,
visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Beam", "Apache Beam", "Apache Apex", "Apex",
"Apache Flink", "Flink", "Apache Spark", "Spark", and "ApacheCon" are registered trademarks
or trademarks of the Apache Software Foundation in the United States and/or other countries.
All other brands and trademarks are the property of their respective owners.

# # #

