www-announce mailing list archives

From Sally Khudairi ...@apache.org>
Subject The Apache Software Foundation Announces Apache® Hadoop® v3.2.0
Date Wed, 23 Jan 2019 11:00:38 GMT
[this announcement is available online at https://s.apache.org/HK21 ]

Pioneering Open Source distributed enterprise framework powers US$166B Big Data ecosystem

Wakefield, MA —23 January 2019— The Apache Software Foundation (ASF), the all-volunteer
developers, stewards, and incubators of more than 350 Open Source projects and initiatives,
today announced Apache® Hadoop® v3.2.0, the latest version of the Open Source software framework
for reliable, scalable, distributed computing.

Now in its 11th year, Apache Hadoop is the foundation of the US$166B Big Data ecosystem (source:
IDC), enabling data applications to run and be managed on large hardware clusters in a distributed
computing environment. "Apache Hadoop has been at the center of this big data transformation,
providing an ecosystem with tools for businesses to store and process data on a scale that
was unheard of several years ago," according to Accenture Technology Labs.

"This latest release unlocks the powerful feature set the Apache Hadoop community has been
working on for more than nine months," said Vinod Kumar Vavilapalli, Vice President of Apache
Hadoop. "It further diversifies the platform by building on the cloud connector enhancements
from Apache Hadoop 3.0.0 and opening it up for deep learning use-cases and long-running apps."

Apache Hadoop 3.2.0 highlights include:

 - ABFS Filesystem connector: supports the latest Azure Data Lake Storage Gen2;
 - Enhanced S3A connector: includes better resilience to throttled AWS S3 and DynamoDB IO;
 - Node Attributes Support in YARN: allows tagging nodes with multiple labels based on their
attributes and placing containers according to expressions over those labels;
 - Storage Policy Satisfier: allows HDFS (Hadoop Distributed File System) applications to move
blocks between storage types as they set storage policies on files and directories;
 - Hadoop Submarine: enables data engineers to easily develop, train, and deploy deep learning
models (in TensorFlow) on the very same Hadoop YARN cluster;
 - C++ HDFS client: provides async IO to HDFS, which helps downstream projects such as Apache
ORC;
 - Upgrades for long running services: supports seamless in-place upgrades of long-running
containers via the YARN Native Service API (application program interface) and CLI (command-line
interface).
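
As a rough illustration of the node-attribute idea in the list above, the following Python sketch
models nodes as attribute dictionaries and selects the nodes that satisfy a set of constraints.
It is not the YARN API: the names NodeAttrs, Constraint, matches, and eligible_nodes, the sample
cluster, and the choice of "=" and "!=" operators are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical model: each node carries a dict of attribute name -> value.
NodeAttrs = Dict[str, str]
# A constraint is (attribute, operator, value); only "=" and "!=" are modelled.
Constraint = Tuple[str, str, str]


def matches(attrs: NodeAttrs, constraints: List[Constraint]) -> bool:
    """Return True when the node's attributes satisfy every constraint."""
    for name, op, value in constraints:
        actual = attrs.get(name)  # None when the node lacks the attribute
        if op == "=":
            if actual != value:
                return False
        elif op == "!=":
            if actual == value:
                return False
        else:
            raise ValueError(f"unsupported operator: {op}")
    return True


def eligible_nodes(nodes: Dict[str, NodeAttrs],
                   constraints: List[Constraint]) -> List[str]:
    """List, in sorted order, the nodes a container could be placed on."""
    return sorted(n for n, attrs in nodes.items() if matches(attrs, constraints))


# Made-up cluster used only to exercise the sketch.
cluster = {
    "node-1": {"os": "centos7", "gpu": "true"},
    "node-2": {"os": "centos7", "gpu": "false"},
    "node-3": {"os": "ubuntu18", "gpu": "true"},
}

# Ask for a GPU node that is not running ubuntu18.
print(eligible_nodes(cluster, [("gpu", "=", "true"), ("os", "!=", "ubuntu18")]))
```

In this simplified model a node that lacks an attribute fails an "=" constraint and passes a "!="
constraint; a real scheduler may treat missing attributes differently.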

"This is one of the biggest releases in Apache Hadoop 3.x line which brings many new features
and over 1,000 changes," said Sunil Govindan, Apache Hadoop 3.2.0 release manager. "We are
pleased to announce that Apache Hadoop 3.2.0 is available to take your data management requirements
to the next level. Thanks to all our contributors who helped to make this release happen."

Apache Hadoop is widely deployed at numerous enterprises and institutions worldwide, such
as Adobe, Alibaba, Amazon Web Services, AOL, Apple, Capital One, Cloudera, Cornell University,
eBay, ESA Calvalus satellite mission, Facebook, foursquare, Google, Hortonworks, HP, Huawei,
Hulu, IBM, Intel, LinkedIn, Microsoft, Netflix, The New York Times, Rackspace, Rakuten, SAP,
Tencent, Teradata, Tesla Motors, Twitter, Uber, and Yahoo. The project maintains a list of
educational and production users, as well as companies that offer Hadoop-related services
at https://wiki.apache.org/hadoop/PoweredBy

Global Knowledge hails, "...the open-source Apache Hadoop platform changes the economics and
dynamics of large-scale data analytics due to its scalability, cost effectiveness, flexibility,
and built-in fault tolerance. It makes possible the massive parallel computing that today's
data analysis requires."

Hadoop is proven at scale: Netflix captures 500B+ daily events using Apache Hadoop. Twitter
uses Apache Hadoop to handle 5B+ sessions a day in real time. Twitter’s 10,000+ node cluster
processes and analyzes more than a zettabyte of raw data through 200B+ tweets per year. Facebook’s
cluster of 4,000+ machines that store 300+ petabytes is augmented by 4 new petabytes of data
generated each day. Microsoft uses Apache Hadoop YARN to run the internal Cosmos data lake,
which operates over hundreds of thousands of nodes and manages billions of containers per
day.

Transparency Market Research recently reported that the global Hadoop market is anticipated
to rise at a staggering 29% CAGR with a market valuation of US$37.7B by the end of 2023.

Apache Hadoop remains one of the most active projects at the ASF: it ranks #1 for Apache project
repositories by code commits, and is the #5 repository by size (3,881,797 lines of code).

"The Apache Hadoop community continues to go from strength to strength in further driving
innovation in Big Data," added Vavilapalli. "We hope that developers, operators and users
leverage our latest release in fulfilling their data management needs."

Catch Apache Hadoop in action at the Strata conference, 25-28 March 2019 in San Francisco,
and dozens of Hadoop MeetUps held around the world, including on 30 January 2019 at LinkedIn
in Sunnyvale, California.

Availability and Oversight
Apache Hadoop software is released under the Apache License v2.0 and is overseen by a self-selected
team of active contributors to the project. A Project Management Committee (PMC) guides the
Project's day-to-day operations, including community development and product releases. For
downloads, documentation, and ways to become involved with Apache Hadoop, visit http://hadoop.apache.org/
and https://twitter.com/hadoop

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source
projects, including Apache HTTP Server --the world's most popular Web server software. Through
the ASF's meritocratic process known as "The Apache Way," more than 730 individual Members
and 7,000 Committers across six continents successfully collaborate to develop freely available
enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions
are distributed under the Apache License; and the community actively participates in ASF mailing
lists, mentoring initiatives, and ApacheCon, the Foundation's official global conference series.
The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate
sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, Bloomberg, Budget
Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Hortonworks,
Huawei, IBM, Indeed, Inspur, LeaseWeb, Microsoft, Oath, ODPi, Pineapple Fund, Pivotal, Private
Internet Access, Red Hat, Target, Tencent, and Union Investment. For more information, visit
http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Hadoop", "Apache Hadoop", and "ApacheCon" are
registered trademarks or trademarks of the Apache Software Foundation in the United States
and/or other countries. All other brands and trademarks are the property of their respective
owners.

# # #

NOTE: you are receiving this message because you are subscribed to the announce@apache.org
distribution list. To unsubscribe, send email from the recipient account to announce-unsubscribe@apache.org
with the word "Unsubscribe" in the subject line.
