spark-dev mailing list archives

From Dongjoon Hyun <dongjoon.h...@gmail.com>
Subject Re: [DISCUSS] Increasing minimum supported version of Pandas
Date Fri, 14 Jun 2019 16:15:48 GMT
+1

Thank you for this effort, Bryan!

Bests,
Dongjoon.

On Fri, Jun 14, 2019 at 4:24 AM Holden Karau <holden@pigscanfly.ca> wrote:

> I’m +1 for upgrading, although since this is probably the last easy chance
> we’ll have to bump version numbers, I’d suggest 0.24.2
>
>
> On Fri, Jun 14, 2019 at 4:38 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>
>> I am +1 to go for 0.23.2 - it brings some overhead to test PyArrow and
>> pandas combinations. Spark 3 should be a good time to increase it.
>>
>> On Fri, Jun 14, 2019 at 9:46 AM Bryan Cutler <cutlerb@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> We would like to discuss increasing the minimum supported version of
>>> Pandas in Spark, which is currently 0.19.2.
>>>
>>> Pandas 0.19.2 was released nearly 3 years ago and there are some
>>> workarounds in PySpark that could be removed if such an old version is not
>>> required. This will help to keep code clean and reduce maintenance effort.
>>>
>>> The change is targeted for Spark 3.0.0 release, see
>>> https://issues.apache.org/jira/browse/SPARK-28041. The current thought
>>> is to bump the version to 0.23.2, but we would like to discuss before
>>> making a change. Does anyone else have thoughts on this?
>>>
>>> Regards,
>>> Bryan
>>>
>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
