spark-dev mailing list archives

From Dongjoon Hyun <dongjoon.h...@gmail.com>
Subject Re: Spark 3.0 preview release 2?
Date Mon, 09 Dec 2019 18:14:24 GMT
Thank you, All.

+1 for another `3.0-preview`.

Also, thank you Yuming for volunteering for that!

Bests,
Dongjoon.


On Mon, Dec 9, 2019 at 9:39 AM Xiao Li <lixiao@databricks.com> wrote:

> Once we enter the official release candidates, new features whose fixes are
> not trivial have to be disabled, or even reverted if no conf is available to
> turn them off; otherwise, we might need 10+ RCs to make the final release.
> Based on the previous discussions, new features should not block the
> release.
>
> I agree we should have a code freeze at the beginning of 2020. The preview
> releases should not block the official releases. The preview is just to
> collect more feedback about these new features or behavior changes.
>
> Also, for the release of Spark 3.0, we still need the Hive community to do
> us the favor of releasing 2.3.7 so that we can pick up HIVE-22190
> <https://issues.apache.org/jira/browse/HIVE-22190>. Before asking the Hive
> community for a 2.3.7 release, we would like, if possible, the Spark
> community to try things out more, especially JDK 11 support on Hadoop 2.7
> and 3.2, which is based on the Hive 2.3 execution JAR. During the preview
> stage, we might find more issues that are not covered by our test cases.
>
>
>
> On Mon, Dec 9, 2019 at 4:55 AM Sean Owen <srowen@gmail.com> wrote:
>
>> Seems fine to me of course. Honestly that wouldn't be a bad result for
>> a release candidate, though we would probably roll another one now.
>> How about simply moving to a release candidate? If not now then at
>> least move to code freeze from the start of 2020. There is also some
>> downside in pushing out the 3.0 release further with previews.
>>
>> On Mon, Dec 9, 2019 at 12:32 AM Xiao Li <gatorsmile@gmail.com> wrote:
>> >
>> > I received a lot of great feedback from the community about the recent
>> > 3.0 preview release. Since that release, we already have 353 commits
>> > [https://github.com/apache/spark/compare/v3.0.0-preview...master].
>> > There are various important features and behavior changes we want the
>> > community to try before entering the official release candidates of
>> > Spark 3.0.
>> >
>> >
>> > Below are some selected items that are not part of the last 3.0 preview
>> > but are already available in the upstream master branch:
>> >
>> > Support JDK 11 with Hadoop 2.7
>> > Spark SQL will respect its own default format (i.e., parquet) when
>> >   users do CREATE TABLE without USING or STORED AS clauses
>> > Enable Parquet nested schema pruning and nested pruning on expressions
>> >   by default
>> > Add observable Metrics for Streaming queries
>> > Column pruning through nondeterministic expressions
>> > RecordBinaryComparator should check endianness when compared by long
>> > Improve parallelism for local shuffle reader in adaptive query execution
>> > Upgrade Apache Arrow to version 0.15.1
>> > Various interval-related SQL support
>> > Add a mode to pin Python thread into JVM's
>> > Provide option to clean up completed files in streaming query
>> >
>> > I am wondering if we can have another preview release for Spark 3.0?
>> > This can help us find design/API defects as early as possible and avoid
>> > a significant delay of the upcoming Spark 3.0 release.
>> >
>> >
>> > Also, is any committer willing to volunteer as the release manager of
>> > the next preview release of Spark 3.0, if we have such a release?
>> >
>> >
>> > Cheers,
>> >
>> >
>> > Xiao
>>
>
