spark-dev mailing list archives

From Dongjoon Hyun <dongjoon.h...@gmail.com>
Subject Re: [DISCUSS] Spark 2.5 release
Date Tue, 24 Sep 2019 01:47:31 GMT
Hi, Ryan.

This thread has many replies, as you can see. That is evidence that the
community is very interested in your suggestion.

> I'm offering to help build a stable release without breaking changes. But
> if there is no community interest in it, I'm happy to drop this.

In this thread, the root cause of the disagreement is the lack of
supporting evidence for your claims.

1. Is DSv2 stable in `master`?
2. If so, what subset of DSv2 patches is Ryan suggesting we backport?
3. How different will those backported DSv2 patches look in
`branch-2.4`?
4. What does he mean by `without breaking changes`? Is it technically
feasible?
    Apache Spark 2.4.x and 2.5.x DSv2 should be compatible. (Not between
2.5.x DSv2 and 3.0.0 DSv2)
5. How long will it take? Is it possible before 3.0.0-preview? Who will
work on the backporting?
6. Is this meaningful if 2.5 and 3.1 diverge again too soon (in
summer 2020)?

We are software engineers.
If you have a working branch, please share it with us.
It will help us understand your suggestion and this discussion.
We can help you verify that the branch achieves your goal.
The branch is tested already, isn't it?

Bests,
Dongjoon.




On Mon, Sep 23, 2019 at 10:44 AM Holden Karau <holden@pigscanfly.ca> wrote:

> I would personally love to see us provide a gentle migration path to Spark
> 3 especially if much of the work is already going to happen anyways.
>
> Maybe giving it a different name (eg something like
> Spark-2-to-3-transitional) would make it more clear about its intended
> purpose and encourage folks to move to 3 when they can?
>
> On Mon, Sep 23, 2019 at 9:17 AM Ryan Blue <rblue@netflix.com.invalid>
> wrote:
>
>> My understanding is that 3.0-preview is not going to be a
>> production-ready release. For those of us that have been using backports of
>> DSv2 in production, that doesn't help.
>>
>> It also doesn't help as a stepping stone because users would need to
>> handle all of the incompatible changes in 3.0. Using 3.0-preview would be
>> an unstable release with breaking changes instead of a stable release
>> without the breaking changes.
>>
>> I'm offering to help build a stable release without breaking changes. But
>> if there is no community interest in it, I'm happy to drop this.
>>
>> On Sun, Sep 22, 2019 at 6:39 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>>
>>> +1 for Matei's as well.
>>>
>>> On Sun, 22 Sep 2019, 14:59 Marco Gaido, <marcogaido91@gmail.com> wrote:
>>>
>>>> I agree with Matei too.
>>>>
>>>> Thanks,
>>>> Marco
>>>>
>>>> Il giorno dom 22 set 2019 alle ore 03:44 Dongjoon Hyun <
>>>> dongjoon.hyun@gmail.com> ha scritto:
>>>>
>>>>> +1 for Matei's suggestion!
>>>>>
>>>>> Bests,
>>>>> Dongjoon.
>>>>>
>>>>> On Sat, Sep 21, 2019 at 5:44 PM Matei Zaharia <matei.zaharia@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> If the goal is to get people to try the DSv2 API and build DSv2 data
>>>>>> sources, can we recommend the 3.0-preview release for this? That would get
>>>>>> people shifting to 3.0 faster, which is probably better overall compared to
>>>>>> maintaining two major versions. There’s not that much else changing in 3.0
>>>>>> if you already want to update your Java version.
>>>>>>
>>>>>> On Sep 21, 2019, at 2:45 PM, Ryan Blue <rblue@netflix.com.INVALID>
>>>>>> wrote:
>>>>>>
>>>>>> > If you insist we shouldn't change the unstable temporary API in 3.x
>>>>>> . . .
>>>>>>
>>>>>> Not what I'm saying at all. I said we should carefully
>>>>>> consider whether a breaking change is the right decision in the 3.x line.
>>>>>>
>>>>>> All I'm suggesting is that we can make a 2.5 release with the feature
>>>>>> and an API that is the same as the one in 3.0.
>>>>>>
>>>>>> > I also don't get this backporting a giant feature to 2.x line
>>>>>>
>>>>>> I am planning to do this so we can use DSv2 before 3.0 is released.
>>>>>> Then we can have a source implementation that works in both 2.x and 3.0 to
>>>>>> make the transition easier. Since I'm already doing the work, I'm offering
>>>>>> to share it with the community.
>>>>>>
>>>>>>
>>>>>> On Sat, Sep 21, 2019 at 2:36 PM Reynold Xin <rxin@databricks.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Because for example we'd need to move the location of InternalRow,
>>>>>>> breaking the package name. If you insist we shouldn't change the unstable
>>>>>>> temporary API in 3.x to maintain compatibility with 3.0, which is totally
>>>>>>> different from my understanding of the situation when you exposed it, then
>>>>>>> I'd say we should gate 3.0 on having a stable row interface.
>>>>>>>
>>>>>>> I also don't get this backporting a giant feature to 2.x line ... as
>>>>>>> suggested by others in the thread, DSv2 would be one of the main reasons
>>>>>>> people upgrade to 3.0. What's so special about DSv2 that we are doing this?
>>>>>>> Why not abandoning 3.0 entirely and backport all the features to 2.x?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Sep 21, 2019 at 2:31 PM, Ryan Blue <rblue@netflix.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Why would that require an incompatible change?
>>>>>>>>
>>>>>>>> We *could* make an incompatible change and remove support for
>>>>>>>> InternalRow, but I think we would want to carefully consider whether that
>>>>>>>> is the right decision. And in any case, we would be able to keep 2.5 and
>>>>>>>> 3.0 compatible, which is the main goal.
>>>>>>>>
>>>>>>>> On Sat, Sep 21, 2019 at 2:28 PM Reynold Xin <rxin@databricks.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> How would you not make incompatible changes in 3.x? As discussed
>>>>>>>>> the InternalRow API is not stable and needs to change.
>>>>>>>>>
>>>>>>>>> On Sat, Sep 21, 2019 at 2:27 PM Ryan Blue <rblue@netflix.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> > Making downstream to diverge their implementation heavily
>>>>>>>>>> between minor versions (say, 2.4 vs 2.5) wouldn't be a good experience
>>>>>>>>>>
>>>>>>>>>> You're right that the API has been evolving in the 2.x line. But,
>>>>>>>>>> it is now reasonably stable with respect to the current feature set and we
>>>>>>>>>> should not need to break compatibility in the 3.x line. Because we have
>>>>>>>>>> reached our goals for the 3.0 release, we can backport at least those
>>>>>>>>>> features to 2.x and confidently have an API that works in both a 2.x
>>>>>>>>>> release and is compatible with 3.0, if not 3.1 and later releases as well.
>>>>>>>>>>
>>>>>>>>>> > I'd rather say preparation of Spark 2.5 should be started after
>>>>>>>>>> Spark 3.0 is officially released
>>>>>>>>>>
>>>>>>>>>> The reason I'm suggesting this is that I'm already going to do
>>>>>>>>>> the work to backport the 3.0 release features to 2.4. I've been asked by
>>>>>>>>>> several people when DSv2 will be released, so I know there is a lot of
>>>>>>>>>> interest in making this available sooner than 3.0. If I'm already doing the
>>>>>>>>>> work, then I'd be happy to share that with the community.
>>>>>>>>>>
>>>>>>>>>> I don't see why 2.5 and 3.0 are mutually exclusive. We can work
>>>>>>>>>> on 2.5 while preparing the 3.0 preview and fixing bugs. For DSv2, the work
>>>>>>>>>> is about complete so we can easily release the same set of features and API
>>>>>>>>>> in 2.5 and 3.0.
>>>>>>>>>>
>>>>>>>>>> If we decide for some reason to wait until after 3.0 is released,
>>>>>>>>>> I don't know that there is much value in a 2.5. The purpose is to be a step
>>>>>>>>>> toward 3.0, and releasing that step after 3.0 doesn't seem helpful to me.
>>>>>>>>>> It also wouldn't get these features out any sooner than 3.0, as a 2.5
>>>>>>>>>> release probably would, given the work needed to validate the incompatible
>>>>>>>>>> changes in 3.0.
>>>>>>>>>>
>>>>>>>>>> > DSv2 change would be the major backward incompatibility which
>>>>>>>>>> Spark 2.x users may hesitate to upgrade
>>>>>>>>>>
>>>>>>>>>> As I pointed out, DSv2 has been changing in the 2.x line, so this
>>>>>>>>>> is expected. I don't think it will need incompatible changes in the 3.x
>>>>>>>>>> line.
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 20, 2019 at 9:25 PM Jungtaek Lim <kabhwan@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Just 2 cents, I haven't tracked the change of DSv2 (though I
>>>>>>>>>>> needed to deal with this as the change made confusion on my PRs...), but my
>>>>>>>>>>> bet is that DSv2 would be already changed in incompatible way, at least who
>>>>>>>>>>> works for custom DataSource. Making downstream to diverge their
>>>>>>>>>>> implementation heavily between minor versions (say, 2.4 vs 2.5) wouldn't be
>>>>>>>>>>> a good experience - especially we are not completely closed the chance
>>>>>>>>>>> to further modify DSv2, and the change could be backward incompatible.
>>>>>>>>>>>
>>>>>>>>>>> If we really want to bring the DSv2 change to 2.x version line
>>>>>>>>>>> to let end users avoid forcing to upgrade Spark 3.x to enjoy new DSv2, I'd
>>>>>>>>>>> rather say preparation of Spark 2.5 should be started after Spark 3.0 is
>>>>>>>>>>> officially released, honestly even later than that, say, getting some
>>>>>>>>>>> reports from Spark 3.0 about DSv2 so that we feel DSv2 is OK. I hope we
>>>>>>>>>>> don't make Spark 2.5 be a kind of "tech-preview" which Spark 2.4 users may
>>>>>>>>>>> be frustrated to upgrade to next minor version.
>>>>>>>>>>>
>>>>>>>>>>> Btw, do we have any specific target users for this? Personally
>>>>>>>>>>> DSv2 change would be the major backward incompatibility which Spark 2.x
>>>>>>>>>>> users may hesitate to upgrade, so they might be already prepared to migrate
>>>>>>>>>>> to Spark 3.0 if they are prepared to migrate to new DSv2.
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Sep 21, 2019 at 12:46 PM Dongjoon Hyun <
>>>>>>>>>>> dongjoon.hyun@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Do you mean you want to have a breaking API change between 3.0
>>>>>>>>>>>> and 3.1?
>>>>>>>>>>>> I believe we follow Semantic Versioning (
>>>>>>>>>>>> https://spark.apache.org/versioning-policy.html ).
>>>>>>>>>>>>
>>>>>>>>>>>> > We just won’t add any breaking changes before 3.1.
>>>>>>>>>>>>
>>>>>>>>>>>> Bests,
>>>>>>>>>>>> Dongjoon.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Sep 20, 2019 at 11:48 AM Ryan Blue <
>>>>>>>>>>>> rblue@netflix.com.invalid> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> > I don’t think we need to gate a 3.0 release on making a more
>>>>>>>>>>>>> stable version of InternalRow
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sounds like we agree, then. We will use it for 3.0, but there
>>>>>>>>>>>>> are known problems with it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> > Thinking we’d have dsv2 working in both 3.x (which will change
>>>>>>>>>>>>> and progress towards more stable, but will have to break certain APIs) and
>>>>>>>>>>>>> 2.x seems like a false premise.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Why do you think we will need to break certain APIs before 3.0?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I’m only suggesting that we release the same support in a 2.5
>>>>>>>>>>>>> release that we do in 3.0. Since we are nearly finished with the 3.0 goals,
>>>>>>>>>>>>> it seems like we can certainly do that. We just won’t add any breaking
>>>>>>>>>>>>> changes before 3.1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Sep 20, 2019 at 11:39 AM Reynold Xin <
>>>>>>>>>>>>> rxin@databricks.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't think we need to gate a 3.0 release on making a more
>>>>>>>>>>>>>> stable version of InternalRow, but thinking we'd have dsv2 working in both
>>>>>>>>>>>>>> 3.x (which will change and progress towards more stable, but will have to
>>>>>>>>>>>>>> break certain APIs) and 2.x seems like a false premise.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To point out some problems with InternalRow that you think
>>>>>>>>>>>>>> are already pragmatic and stable:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The class is in catalyst, which states:
>>>>>>>>>>>>>> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/package.scala
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /**
>>>>>>>>>>>>>> * Catalyst is a library for manipulating relational query plans.  All classes in catalyst are
>>>>>>>>>>>>>> * considered an internal API to Spark SQL and are subject to change between minor releases.
>>>>>>>>>>>>>> */
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is not even any annotation on the interface.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The entire dependency chain was created to be private, and
>>>>>>>>>>>>>> tightly coupled with internal implementations. For example,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/apache/spark/blob/master/common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /**
>>>>>>>>>>>>>> * A UTF-8 String for internal Spark use.
>>>>>>>>>>>>>> * <p>
>>>>>>>>>>>>>> * A String encoded in UTF-8 as an Array[Byte], which can be used for comparison,
>>>>>>>>>>>>>> * search, see http://en.wikipedia.org/wiki/UTF-8 for details.
>>>>>>>>>>>>>> * <p>
>>>>>>>>>>>>>> * Note: This is not designed for general use cases, should not be used outside SQL.
>>>>>>>>>>>>>> */
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayData.scala
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (which again is in catalyst package)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you want to argue this way, you might as well argue we
>>>>>>>>>>>>>> should make the entire catalyst package public to be pragmatic and not
>>>>>>>>>>>>>> allow any changes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Sep 20, 2019 at 11:32 AM, Ryan Blue <
>>>>>>>>>>>>>> rblue@netflix.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> > When you created the PR to make InternalRow public
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This isn’t quite accurate. The change I made was to use
>>>>>>>>>>>>>>> InternalRow instead of UnsafeRow, which is a specific
>>>>>>>>>>>>>>> implementation of InternalRow. Exposing this API has always
>>>>>>>>>>>>>>> been a part of DSv2 and while both you and I did some work to avoid this,
>>>>>>>>>>>>>>> we are still in the phase of starting with that API.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Note that any change to InternalRow would be very costly to
>>>>>>>>>>>>>>> implement because this interface is widely used. That is why I think we can
>>>>>>>>>>>>>>> certainly consider it stable enough to use here, and that’s probably why
>>>>>>>>>>>>>>> UnsafeRow was part of the original proposal.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In any case, the goal for 3.0 was not to replace the use of
>>>>>>>>>>>>>>> InternalRow, it was to get the majority of SQL working on
>>>>>>>>>>>>>>> top of the interface added after 2.4. That’s done and stable, so I think a
>>>>>>>>>>>>>>> 2.5 release with it is also reasonable.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Sep 20, 2019 at 11:23 AM Reynold Xin <
>>>>>>>>>>>>>>> rxin@databricks.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To push back, while I agree we should not drastically
>>>>>>>>>>>>>>>> change "InternalRow", there are a lot of changes that need to happen to
>>>>>>>>>>>>>>>> make it stable. For example, none of the publicly exposed interfaces should
>>>>>>>>>>>>>>>> be in the Catalyst package or the unsafe package. External implementations
>>>>>>>>>>>>>>>> should be decoupled from the internal implementations, with cheap ways to
>>>>>>>>>>>>>>>> convert back and forth.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When you created the PR to make InternalRow public, the
>>>>>>>>>>>>>>>> understanding was to work towards making it stable in the future, assuming
>>>>>>>>>>>>>>>> we will start with an unstable API temporarily. You can't just make a bunch
>>>>>>>>>>>>>>>> of internal APIs tightly coupled with other internal pieces public and stable
>>>>>>>>>>>>>>>> and call it a day, just because it happens to satisfy some use cases
>>>>>>>>>>>>>>>> temporarily assuming the rest of Spark doesn't change.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Sep 20, 2019 at 11:19 AM, Ryan Blue <
>>>>>>>>>>>>>>>> rblue@netflix.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> > DSv2 is far from stable right?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> No, I think it is reasonably stable and very close to
>>>>>>>>>>>>>>>>> being ready for a release.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> > All the actual data types are unstable and you guys have
>>>>>>>>>>>>>>>>> completely ignored that.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think what you're referring to is the use of
>>>>>>>>>>>>>>>>> `InternalRow`. That's a stable API and there has been no work to avoid
>>>>>>>>>>>>>>>>> using it. In any case, I don't think that anyone is suggesting that we
>>>>>>>>>>>>>>>>> delay 3.0 until a replacement for `InternalRow` is added, right?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> While I understand the motivation for a better solution
>>>>>>>>>>>>>>>>> here, I think the pragmatic solution is to continue using `InternalRow`.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> > If the goal is to make DSv2 work across 3.x and 2.x,
>>>>>>>>>>>>>>>>> that seems too invasive of a change to backport once you consider the parts
>>>>>>>>>>>>>>>>> needed to make dsv2 stable.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I believe that those of us working on DSv2 are confident
>>>>>>>>>>>>>>>>> about the current stability. We set goals for what to get into the 3.0
>>>>>>>>>>>>>>>>> release months ago and have very nearly reached the point where we are
>>>>>>>>>>>>>>>>> ready for that release.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I don't think instability would be a problem in
>>>>>>>>>>>>>>>>> maintaining compatibility between the 2.5 version and the 3.0 version. If
>>>>>>>>>>>>>>>>> we find that we need to make API changes (other than additions) then we can
>>>>>>>>>>>>>>>>> make those in the 3.1 release. Because the goals we set for the 3.0 release
>>>>>>>>>>>>>>>>> have been reached with the current API and if we are ready to release 3.0,
>>>>>>>>>>>>>>>>> we can release a 2.5 with the same API.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Sep 20, 2019 at 11:05 AM Reynold Xin <
>>>>>>>>>>>>>>>>> rxin@databricks.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> DSv2 is far from stable right? All the actual data types
>>>>>>>>>>>>>>>>>> are unstable and you guys have completely ignored that. We'd need to work
>>>>>>>>>>>>>>>>>> on that and that will be a breaking change. If the goal is to make DSv2
>>>>>>>>>>>>>>>>>> work across 3.x and 2.x, that seems too invasive of a change to backport
>>>>>>>>>>>>>>>>>> once you consider the parts needed to make dsv2 stable.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Sep 20, 2019 at 10:47 AM, Ryan Blue <
>>>>>>>>>>>>>>>>>> rblue@netflix.com.invalid> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In the DSv2 sync this week, we talked about a possible
>>>>>>>>>>>>>>>>>>> Spark 2.5 release based on the latest Spark 2.4, but with DSv2 and Java 11
>>>>>>>>>>>>>>>>>>> support added.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> A Spark 2.5 release with these two additions will help
>>>>>>>>>>>>>>>>>>> people migrate to Spark 3.0 when it is released because they will be able
>>>>>>>>>>>>>>>>>>> to use a single implementation for DSv2 sources that works in both 2.5 and
>>>>>>>>>>>>>>>>>>> 3.0. Similarly, upgrading to 3.0 won't also require updating to Java
>>>>>>>>>>>>>>>>>>> 11 because users could update to Java 11 with the 2.5 release and have
>>>>>>>>>>>>>>>>>>> fewer major changes.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Another reason to consider a 2.5 release is that many
>>>>>>>>>>>>>>>>>>> people are interested in a release with the latest DSv2 API and support for
>>>>>>>>>>>>>>>>>>> DSv2 SQL. I'm already going to be backporting DSv2 support to the Spark 2.4
>>>>>>>>>>>>>>>>>>> line, so it makes sense to share this work with the community.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This release line would just consist of backports like
>>>>>>>>>>>>>>>>>>> DSv2 and Java 11 that assist compatibility, to keep the scope of the
>>>>>>>>>>>>>>>>>>> release small. The purpose is to assist people moving to 3.0 and not
>>>>>>>>>>>>>>>>>>> distract from the 3.0 release.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Would a Spark 2.5 release help anyone else? Are there
>>>>>>>>>>>>>>>>>>> any concerns about this plan?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> rb
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> Ryan Blue
>>>>>>>>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>>>>>>>>> Netflix
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Ryan Blue
>>>>>>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>>>>>>> Netflix
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Ryan Blue
>>>>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>>>>> Netflix
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Ryan Blue
>>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>>> Netflix
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Name : Jungtaek Lim
>>>>>>>>>>> Blog : http://medium.com/@heartsavior
>>>>>>>>>>> Twitter : http://twitter.com/heartsavior
>>>>>>>>>>> LinkedIn : http://www.linkedin.com/in/heartsavior
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Ryan Blue
>>>>>>>>>> Software Engineer
>>>>>>>>>> Netflix
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ryan Blue
>>>>>>>>
>>>>>>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
