spark-user mailing list archives

From Jianshi Huang <jianshi.hu...@gmail.com>
Subject Re: High GC time during shuffle read?
Date Tue, 17 Feb 2015 05:03:14 GMT
Here's the final summary for the repartition step.

[image: Inline image 1]



On Tue, Feb 17, 2015 at 12:38 PM, Jianshi Huang <jianshi.huang@gmail.com>
wrote:

> BTW, I'm using Spark 1.2.2 built from branch-1.2, and my settings are:
>
> - executor mem: 4G
> - num of executors: 100
> - num of executor cores: 2
> - mode: yarn-client
>
> Jianshi
>
> On Tue, Feb 17, 2015 at 12:35 PM, Jianshi Huang <jianshi.huang@gmail.com>
> wrote:
>
>> Hi,
>>
>> My input is about 72 GB. I did a group-by first, then a repartition.
>> However, the repartition step is very slow.
>>
>> [image: Inline image 1]
>>
>>
>> Any ideas why GC takes 80% of the time?
>>
>>
>> Cheers,
>> --
>> Jianshi Huang
>>
>> LinkedIn: jianshi
>> Twitter: @jshuang
>> Github & Blog: http://huangjs.github.com/
>>
>
>
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
>



-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
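
For reference, here is a rough back-of-envelope sketch (plain Python, no Spark needed) of how the settings quoted above divide up per task. The 0.2 shuffle fraction is an assumption: it is the Spark 1.x default for spark.shuffle.memoryFraction, and the thread does not say whether it was changed. The function name and the returned keys are only illustrative.

```python
# Back-of-envelope sizing for the settings quoted in this thread.
# Assumption: spark.shuffle.memoryFraction is at its Spark 1.x default (0.2);
# the thread does not confirm this.

def sizing(input_gb, executor_mem_gb, num_executors, cores_per_executor,
           shuffle_memory_fraction=0.2):
    """Return rough per-executor and per-task figures, all in GB."""
    total_cores = num_executors * cores_per_executor
    return {
        # Total heap requested across the cluster.
        "total_heap_gb": executor_mem_gb * num_executors,
        # Heap available to each concurrently running task.
        "heap_per_task_gb": executor_mem_gb / cores_per_executor,
        # Approximate shuffle budget each concurrent task can use.
        "shuffle_per_task_gb": executor_mem_gb * shuffle_memory_fraction
                               / cores_per_executor,
        # On-disk input share per core, if the data were spread evenly.
        "input_per_core_gb": input_gb / total_cores,
    }

if __name__ == "__main__":
    figures = sizing(input_gb=72, executor_mem_gb=4,
                     num_executors=100, cores_per_executor=2)
    for name, value in figures.items():
        print(f"{name}: {value:.2f}")
```

These are on-disk sizes spread evenly; deserialized JVM objects usually take more room than the input on disk, and a group-by can skew data toward a few tasks, so the real per-task footprint may be considerably higher than this estimate.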
