spark-dev mailing list archives

From Maciej Bryński <mac...@brynski.pl>
Subject Re: Performance of loading parquet files into case classes in Spark
Date Sat, 27 Aug 2016 20:32:40 GMT
2016-08-27 15:27 GMT+02:00 Julien Dumazert <julien.dumazert@gmail.com>:

> df.map(row => row.getAs[Long]("fieldToSum")).reduce(_ + _)


I think reduce and sum have very different performance.
Did you try sql.functions.sum?
Or, if you want to benchmark access to the Row object, then the count() function
would be a better idea.
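
For reference, here is a rough sketch of the variants I mean, side by side. It assumes the Spark 2.x API (SparkSession, spark.implicits._), a local master, and a synthetic DataFrame built with spark.range; the object name SumVsReduce and the helper method names are made up for illustration and are not from your code.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.sum

// Hypothetical helper object; only the column name "fieldToSum" comes from the thread.
object SumVsReduce {

  // The approach from the quoted message: deserialize each Row, then reduce on the JVM side.
  def sumViaReduce(spark: SparkSession, df: DataFrame): Long = {
    import spark.implicits._ // provides the Encoder[Long] that map needs
    df.map(row => row.getAs[Long]("fieldToSum")).reduce(_ + _)
  }

  // The built-in aggregate: the sum is planned and executed by Spark SQL itself.
  def sumViaAggregate(df: DataFrame): Long =
    df.agg(sum("fieldToSum")).first().getLong(0)

  def main(args: Array[String]): Unit = {
    // Assumption: local mode and synthetic data, purely for illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sum-vs-reduce")
      .getOrCreate()

    val df = spark.range(0L, 1000000L).toDF("fieldToSum")

    println(s"reduce:    ${sumViaReduce(spark, df)}")
    println(s"sql sum:   ${sumViaAggregate(df)}")
    println(s"row count: ${df.count()}") // count() as the third variant

    spark.stop()
  }
}
```

Timing each of the three calls separately on the real parquet data should show how much of the cost comes from the map/reduce path rather than from reading the files.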

Regards,
-- 
Maciek Bryński
