spark-user mailing list archives

From Raghavendra Pandey <>
Subject Re: Spark app performance
Date Fri, 02 Jan 2015 02:13:54 GMT
I have seen that link. I am using an RDD of byte arrays with Kryo serialization.
Inside mapPartition, the time I measure is never more than 1 ms, whereas
the total time taken by the application is about 30 min. The codebase has a lot of
dependencies. I am trying to come up with a simple version where I can
reproduce this problem.
Also, the GC time reported by the Spark UI is always in the range of 3~4% of total
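
One possible reason for the gap between the 1 ms measured inside the block and the 30 min total, not confirmed anywhere in this thread: the function passed to mapPartition receives an iterator and returns an iterator, so if the block only chains lazy transformations, a timer around it measures setup and not the per-record work. The sketch below imitates that with plain Python iterators and no Spark dependency. The names slow_transform, partition_func and timed_iterator are hypothetical, and the sleep is a stand-in for real work.

```python
import time


def slow_transform(record: bytes) -> bytes:
    """Hypothetical per-record work; the sleep stands in for real processing."""
    time.sleep(0.01)
    return record[::-1]


def partition_func(records, stats):
    """Shaped like a mapPartition function: iterator in, iterator out."""
    start = time.perf_counter()
    result = map(slow_transform, records)  # lazy: no record is processed here
    stats["block_seconds"] = time.perf_counter() - start
    return result


def timed_iterator(it, stats):
    """Wrap an iterator and accumulate the time spent producing its elements."""
    it = iter(it)
    stats.setdefault("consume_seconds", 0.0)
    while True:
        start = time.perf_counter()
        try:
            item = next(it)  # the deferred work actually runs here
        except StopIteration:
            return
        finally:
            stats["consume_seconds"] += time.perf_counter() - start
        yield item


stats = {}
partition = [bytes([i]) * 4 for i in range(20)]
lazy_result = partition_func(iter(partition), stats)
output = list(timed_iterator(lazy_result, stats))

print(f"inside block: {stats['block_seconds']:.6f} s")
print(f"while consumed: {stats['consume_seconds']:.6f} s")
```

If the real job behaves the same way, wrapping the returned iterator as above (or checking per-stage and shuffle times in the Spark UI, since keyBy is often followed by a shuffle) should show where the time is actually spent.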

On Thu, Jan 1, 2015, 14:05 Akhil Das <> wrote:

> It would be great if you could share the piece of code running inside your
> mapPartition. I'm assuming you are creating/handling a lot of complex
> objects, and hence it slows down the performance. Here's a link
> <> to performance tuning
> if you haven't seen it already.
> Thanks
> Best Regards
> On Wed, Dec 31, 2014 at 8:45 AM, Raghavendra Pandey <> wrote:
>> I have a Spark app that involves a series of mapPartition operations and
>> then a keyBy operation. I have measured the time inside the mapPartition
>> function blocks. These blocks take trivial time. Still, the application takes
>> way too much time, and even the Spark UI shows that much time.
>> So I was wondering where the time goes and how I can reduce it.
>> Thanks
>> Raghavendra
