spark-dev mailing list archives

From Tom Graves
Subject Re: Spark 3.0 preview release 2?
Date Tue, 10 Dec 2019 14:24:27 GMT
 +1 for another preview
    On Monday, December 9, 2019, 12:32:29 AM CST, Xiao Li wrote:
I received a lot of great feedback from the community about the recent 3.0 preview release.
Since that release, 353 commits have already been merged.
There are various important features and behavior changes we want the community to try before
we enter the official release candidates of Spark 3.0.

Below are some selected items that are not part of the last 3.0 preview but are already
available in the upstream master branch:

   - Support JDK 11 with Hadoop 2.7
   - Spark SQL will respect its own default format (i.e., parquet) when users do CREATE TABLE
without USING or STORED AS clauses
   - Enable Parquet nested schema pruning and nested pruning on expressions by default
   - Add observable Metrics for Streaming queries
   - Column pruning through nondeterministic expressions
   - RecordBinaryComparator should check endianness when compared by long 
   - Improve parallelism for local shuffle reader in adaptive query execution
   - Upgrade Apache Arrow to version 0.15.1
   - Various interval-related SQL support
   - Add a mode to pin each Python thread to a corresponding JVM thread
   - Provide option to clean up completed files in streaming query

I am wondering if we can have another preview release for Spark 3.0. This would help us find
design/API defects as early as possible and avoid a significant delay of the upcoming
Spark 3.0 release.

Also, is any committer willing to volunteer as the release manager of the next preview release
of Spark 3.0, if we have such a release?

