spark-dev mailing list archives

From shane knapp <skn...@berkeley.edu>
Subject Re: [DISCUSS] Increasing minimum supported version of Pandas
Date Fri, 14 Jun 2019 17:10:38 GMT
ah, ok...  should we downgrade the testing env on jenkins then?  any
specific version?

shane, who is loath (and i mean LOATH) to touch python envs ;)

On Fri, Jun 14, 2019 at 10:08 AM Bryan Cutler <cutlerb@gmail.com> wrote:

> I should have stated this earlier, but when the user does something that
> requires Pandas, the minimum version is checked against what was imported,
> and an exception is raised if the imported version is lower. So I'm concerned
> that 0.24.2 might be a little too new for users running older clusters. To
> give some release dates: 0.23.2 was released about a year ago, 0.24.0 in
> January, and 0.24.2 in March.
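
A minimal sketch of the kind of import-time check described above; the names
and the exact error type are illustrative, not necessarily what PySpark
itself uses:

    from distutils.version import LooseVersion

    MINIMUM_PANDAS_VERSION = "0.23.2"  # proposed minimum; currently 0.19.2

    def require_minimum_pandas_version():
        # Imported lazily, so only code paths that actually need Pandas
        # trigger the check.
        import pandas
        if LooseVersion(pandas.__version__) < LooseVersion(MINIMUM_PANDAS_VERSION):
            raise ImportError(
                "Pandas >= %s must be installed; found %s."
                % (MINIMUM_PANDAS_VERSION, pandas.__version__))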
>
> On Fri, Jun 14, 2019 at 9:27 AM shane knapp <sknapp@berkeley.edu> wrote:
>
>> just so everyone knows, our python 3.6 testing infra is currently on
>> 0.24.2...
>>
>> On Fri, Jun 14, 2019 at 9:16 AM Dongjoon Hyun <dongjoon.hyun@gmail.com>
>> wrote:
>>
>>> +1
>>>
>>> Thank you for this effort, Bryan!
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Fri, Jun 14, 2019 at 4:24 AM Holden Karau <holden@pigscanfly.ca>
>>> wrote:
>>>
>>>> I’m +1 for upgrading, although since this is probably the last easy
>>>> chance we’ll have to bump version numbers, I’d suggest 0.24.2
>>>>
>>>>
>>>> On Fri, Jun 14, 2019 at 4:38 AM Hyukjin Kwon <gurwls223@gmail.com>
>>>> wrote:
>>>>
>>>>> I am +1 for going with 0.23.2 - supporting such an old version brings some
>>>>> overhead to testing PyArrow and pandas combinations. Spark 3 should be a
>>>>> good time to increase it.
>>>>>
>>>>> On Fri, Jun 14, 2019 at 9:46 AM, Bryan Cutler <cutlerb@gmail.com> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> We would like to discuss increasing the minimum supported version of
>>>>>> Pandas in Spark, which is currently 0.19.2.
>>>>>>
>>>>>> Pandas 0.19.2 was released nearly 3 years ago and there are some
>>>>>> workarounds in PySpark that could be removed if such an old version is not
>>>>>> required. This will help to keep code clean and reduce maintenance effort.
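
As an illustration (purely hypothetical, not code taken from PySpark) of the
kind of version-gated workaround that raising the minimum would allow removing:

    from distutils.version import LooseVersion
    import pandas as pd

    def make_frame(columns):
        # Hypothetical workaround pattern: keep a fallback path alive only
        # for very old Pandas releases. With a 0.23.2 minimum, the first
        # branch could simply be deleted.
        if LooseVersion(pd.__version__) < LooseVersion("0.23.0"):
            frame = pd.DataFrame()
            for name, values in columns.items():
                frame[name] = values   # build column-by-column (fallback path)
            return frame
        return pd.DataFrame(columns)   # straightforward path on newer Pandas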
>>>>>>
>>>>>> The change is targeted for the Spark 3.0.0 release; see
>>>>>> https://issues.apache.org/jira/browse/SPARK-28041. The current
>>>>>> thought is to bump the version to 0.23.2, but we would like to discuss
>>>>>> before making a change. Does anyone else have thoughts on this?
>>>>>>
>>>>>> Regards,
>>>>>> Bryan
>>>>>>
>>>>> --
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>
>>>
>>
>> --
>> Shane Knapp
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>>
>

-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
