spark-dev mailing list archives

From Holden Karau <hol...@pigscanfly.ca>
Subject Re: [DISCUSS] Increasing minimum supported version of Pandas
Date Fri, 14 Jun 2019 11:23:30 GMT
I’m +1 for upgrading, although since this is probably the last easy chance
we’ll have to bump version numbers, I’d suggest 0.24.2.


On Fri, Jun 14, 2019 at 4:38 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:

> I am +1 to go for 0.23.2. Testing PyArrow and pandas combinations brings
> some overhead, and Spark 3 should be a good time to increase the minimum.
>
> On Fri, Jun 14, 2019 at 9:46 AM, Bryan Cutler <cutlerb@gmail.com> wrote:
>
>> Hi All,
>>
>> We would like to discuss increasing the minimum supported version of
>> Pandas in Spark, which is currently 0.19.2.
>>
>> Pandas 0.19.2 was released nearly 3 years ago and there are some
>> workarounds in PySpark that could be removed if such an old version is not
>> required. This will help to keep code clean and reduce maintenance effort.
>>
>> The change is targeted for the Spark 3.0.0 release, see
>> https://issues.apache.org/jira/browse/SPARK-28041. The current thought
>> is to bump the version to 0.23.2, but we would like to discuss before
>> making a change. Does anyone else have thoughts on this?
>>
>> Regards,
>> Bryan
>>
--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
