spark-dev mailing list archives

From Matei Zaharia
Subject ASF board report for February 2021
Date Tue, 09 Feb 2021 03:54:32 GMT
It’s time to prepare our quarterly ASF board report, which we need to submit on Feb 10th.
The last one was in November. I’ve written a draft here, but let me know if you want to
add any more content that I’ve missed.


Apache Spark is a fast and general engine for large-scale data processing. It offers high-level
APIs in Java, Scala, Python, R and SQL as well as a rich set of libraries including stream
processing, machine learning, and graph analytics.

Project status:

- The community is close to finalizing the first Spark 3.1.x release, which will be Spark
3.1.1. There was a problem with our release candidate packaging scripts that caused us to
accidentally publish a 3.1.0 version to Maven Central before it was ready, so we’ve deleted
that and will not use that version number. Several release candidates for 3.1.1 have gone
out to the dev mailing list and we’re tracking the last remaining issues.

- Several proposals for significant new features are being discussed on the dev mailing list,
including a function catalog for Spark SQL, a RocksDB-based state store for streaming applications,
and public APIs for creating user-defined types (UDTs) in Spark SQL. We would welcome feedback
on these from interested community members.


- No changes since the last report.

Latest releases:

- Spark 2.4.7 was released on September 12th, 2020.
- Spark 3.0.1 was released on September 8th, 2020.
- Spark 3.0.0 was released on June 18th, 2020.

Committers and PMC:

- The latest committers were added on July 14th, 2020 (Huaxin Gao, Jungtaek
 Lim and Dilip Biswal).
- The latest PMC member was added on Sept 4th, 2019 (Dongjoon Hyun). The PMC
 has been discussing some new PMC candidates.
