spark-dev mailing list archives

From Felix Cheung <felixcheun...@hotmail.com>
Subject Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame
Date Fri, 09 Nov 2018 16:58:13 GMT
Very cool!


________________________________
From: Hyukjin Kwon <gurwls223@gmail.com>
Sent: Thursday, November 8, 2018 10:29 AM
To: dev
Subject: Arrow optimization in conversion from R DataFrame to Spark DataFrame

Hi all,

I am trying to introduce an Arrow optimization for R by reusing the PySpark Arrow optimization.

It speeds up R DataFrame to Spark DataFrame conversion by roughly 900% to 1200%.

It looks to be working fine so far; however, I would appreciate it if you could take some time
to look at the PR (https://github.com/apache/spark/pull/22954) so that we can go ahead as soon
as the R API of Arrow is released.

More importantly, I would like more people who are familiar with the Arrow R API and also
interested in the Spark side to get involved. I have already cc'ed some people I know, but
please come, review, and discuss both the Spark side and the Arrow side.

Thanks.

