spark-dev mailing list archives

From Michael Heuer <heue...@gmail.com>
Subject Re: Spark 2.4.5 release for Parquet and Avro dependency updates?
Date Fri, 22 Nov 2019 18:45:16 GMT
Hello,

I am sorry for asking a somewhat inappropriate question.

For context, our projects depend on a fix that is in Parquet master but not yet released.  Parquet
1.11.0 is in the release-candidate phase.  It looks like we can't build against the Parquet 1.11.0
RC to include the fix and then run successfully on Spark 2.4.x, which includes 1.10.1, without
various classpath workarounds.

I see now that Spark policy requires the Avro upgrade to wait until Spark 3.0, and since Parquet
1.11.0 RC currently depends on Avro 1.9.1, it may also have to wait.  I'll continue to think
on this in the scope of the Parquet community.

Thank you for the clarification,

   michael


> On Nov 22, 2019, at 12:07 PM, Dongjoon Hyun <dongjoon.hyun@gmail.com> wrote:
> 
> Hi, Michael.
> 
> I'm not sure Apache Spark is in a state close to what you want.
> 
> First, both Apache Spark 3.0.0-preview and Apache Spark 2.4 are using Avro 1.8.2. Also,
> the `master` and `branch-2.4` branches do. Cutting a new release does not give you what you want.
> 
> Do we have a PR on the master branch? Otherwise, before starting to discuss the releases,
> could you make a PR first on the master branch? For Parquet, it's the same.
> 
> Second, we want to make Apache Spark 3.0.0 as compatible as possible. An incompatible
> change could be a reason for rejection even in the `master` branch for Apache Spark 3.0.0.
> 
> Lastly, we may consider backporting if it lands on the `master` branch for 3.0.
> However, as Nan Zhu said, a dependency-upgrade backporting PR is -1 by default. Usually,
> it's allowed only for serious cases like security issues or production outages.
> 
> Bests,
> Dongjoon.
> 
> 
> On Fri, Nov 22, 2019 at 9:00 AM Ryan Blue <rblue@netflix.com.invalid> wrote:
> Just to clarify, I don't think that Parquet 1.10.1 to 1.11.0 is a runtime-incompatible
> change. The example mixed 1.11.0 and 1.10.1 in the same execution.
> 
> Michael, please be more careful about announcing compatibility problems in other communities.
> If you've observed problems, let's find out the root cause first.
> 
> rb
> 
> On Fri, Nov 22, 2019 at 8:56 AM Michael Heuer <heuermh@gmail.com> wrote:
> Hello,
> 
> Avro 1.8.2 to 1.9.1 is a binary-incompatible update, and it appears that Parquet 1.10.1
> to 1.11 will be a runtime-incompatible update (see the thread on dev@parquet:
> https://mail-archives.apache.org/mod_mbox/parquet-dev/201911.mbox/%3C8357699C-9295-4EB0-A39E-B3538D71795B@gmail.com%3E).
> 
> Might there be any desire to cut a Spark 2.4.5 release so that users can pick up these
> changes independently of all the other changes in Spark 3.0?
> 
> Thank you in advance,
> 
>    michael
> 
> 
> -- 
> Ryan Blue
> Software Engineer
> Netflix

