spark-dev mailing list archives

From Erik Erlandson <eerla...@redhat.com>
Subject Re: Python friendly API for Spark 3.0
Date Sat, 15 Sep 2018 17:32:56 GMT
I am probably splitting hairs too finely, but I was considering the
difference between improvements to the jvm-side (py4j and the scala/java
code) that would make it easier to write the python layer ("python-friendly
api"), and actual improvements to the python layers ("friendly python api").

They're not mutually exclusive of course, and both are worth working on. But
it's *possible* to improve either without the other.

Stub files look like a great solution for type annotations, maybe even if
only python 3 is supported.
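
As a rough illustration of the stub-file idea, here is a minimal sketch. The
module name `wordstats` and the function `word_lengths` are hypothetical and
are not part of PySpark. The implementation carries no annotations, so it runs
on both python 2 and 3, while the types live in a separate `.pyi` file (shown
here as a comment) that only type checkers read.

```python
# wordstats.py: hypothetical module with no annotations, so it runs on
# both Python 2.7 and Python 3.
def word_lengths(words):
    """Map each word to its length."""
    return {w: len(w) for w in words}

# wordstats.pyi: the matching stub, read only by type checkers such as mypy.
# It is never imported at runtime, so the Python 3 annotation syntax in it
# cannot break a Python 2 interpreter.
#
#     from typing import Dict, Iterable
#
#     def word_lengths(words: Iterable[str]) -> Dict[str, int]: ...
```

With this layout the annotations can be added or dropped without touching the
runtime code, which is why stubs could be useful whether or not python 2
support stays.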

I definitely agree that any decision to drop python 2 should not be taken
lightly. Anecdotally, I'm seeing an increase in python developers
announcing that they are dropping support for python 2 (and loving it). As
people have already pointed out, if we don't drop python 2 for spark 3.0,
we're stuck with it until 4.0, which would place spark in a
possibly-awkward position of supporting python 2 for some time after it
goes EOL.

Under the current release cadence, spark 3.0 will land some time in early
2019, at which point py2 will be mere months from EOL.

On Fri, Sep 14, 2018 at 5:01 PM, Holden Karau <holden@pigscanfly.ca> wrote:

>
>
> On Fri, Sep 14, 2018, 3:26 PM Erik Erlandson <eerlands@redhat.com> wrote:
>
>> To be clear, is this about "python-friendly API" or "friendly python API"
>> ?
>>
> Well what would you consider to be different between those two statements?
> I think it would be good to be a bit more explicit, but I don't think we
> should necessarily limit ourselves.
>
>>
>> On the python side, it might be nice to take advantage of static typing.
>> Requires python 3.6 but with python 2 going EOL, a spark-3.0 might be a
>> good opportunity to jump on the python-3-only train.
>>
> I think we can make types sort of work without ditching 2 (the types only
> would work in 3 but it would still function in 2). Ditching 2 entirely
> would be a big thing to consider, I honestly hadn't been considering that
> but it could be from just spending so much time maintaining a 2/3 code
> base. I'd suggest reaching out to user@ before making that kind of
> change.
>
>>
>> On Fri, Sep 14, 2018 at 12:15 PM, Holden Karau <holden@pigscanfly.ca>
>> wrote:
>>
>>> Since we're talking about Spark 3.0 in the near future (and since some
>>> recent conversation on a proposed change reminded me) I wanted to open up
>>> the floor and see if folks have any ideas on how we could make a more
>>> Python friendly API for 3.0? I'm planning on taking some time to look at
>>> other systems in the solution space and see what we might want to learn
>>> from them but I'd love to hear what other folks are thinking too.
>>>
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>
>>
