spark-dev mailing list archives

From Shivaram Venkataraman <shiva...@eecs.berkeley.edu>
Subject Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame
Date Fri, 09 Nov 2018 17:19:19 GMT
Thanks Hyukjin! Very cool results.

Shivaram
On Fri, Nov 9, 2018 at 10:58 AM Felix Cheung <felixcheung_m@hotmail.com> wrote:
>
> Very cool!
>
>
> ________________________________
> From: Hyukjin Kwon <gurwls223@gmail.com>
> Sent: Thursday, November 8, 2018 10:29 AM
> To: dev
> Subject: Arrow optimization in conversion from R DataFrame to Spark DataFrame
>
> Hi all,
>
> I am trying to introduce R Arrow optimization by reusing PySpark Arrow optimization.
>
> It speeds up the R DataFrame -> Spark DataFrame conversion by up to roughly 900% ~ 1200%.
>
> It looks to be working fine so far; however, I would appreciate it if you could take some
> time to look at the PR (https://github.com/apache/spark/pull/22954) so that we can go ahead
> as soon as the R API of Arrow is released.
>
> More importantly, I would like to involve more people who are familiar with the Arrow R API
> side and also interested in the Spark side. I have already cc'ed some people I know, but
> please come, review, and discuss both the Spark side and the Arrow side.
>
> Thanks.
>
