spark-user mailing list archives

From Jianshi Huang <jianshi.hu...@gmail.com>
Subject High GC time during shuffle read?
Date Tue, 17 Feb 2015 04:35:02 GMT
Hi,

My input is about 72 GB. I do a group-by first and then a repartition,
but the repartition step is very slow.

[image: Inline image 1]

Any ideas why GC takes 80% of the time?


Cheers,
-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
