spark-user mailing list archives

From Stephen Haberman
Subject Re: oome from blockmanager
Date Mon, 28 Oct 2013 21:31:40 GMT
Hey guys,

As a follow-up, I raised our target partition size to 600 MB (up from
64 MB), which split this report's 500 GB of tiny S3 files into ~700
partitions, and everything ran much more smoothly.
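
For anyone wanting to redo the arithmetic, here is a rough sketch in plain
Python (no Spark needed). The helper name estimate_partitions is made up for
illustration, and it assumes binary units and a simple "total size divided by
target size" rule. Our real job landed on ~700 rather than the ballpark this
gives, since the actual input size and file grouping are not this tidy.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def estimate_partitions(total_bytes, target_partition_bytes):
    """Hypothetical helper: ballpark partition count for a target size.

    Uses integer ceiling division so a partial final partition still counts.
    """
    if target_partition_bytes <= 0:
        raise ValueError("target_partition_bytes must be positive")
    count = (total_bytes + target_partition_bytes - 1) // target_partition_bytes
    return max(1, count)


# Assumed inputs, taken from the rounded figures in this message.
total = 500 * GB
old_count = estimate_partitions(total, 64 * MB)
new_count = estimate_partitions(total, 600 * MB)

print("64 MB target: ", old_count, "partitions")
print("600 MB target:", new_count, "partitions")
```

The point is the order of magnitude: the larger target cuts the partition
count by roughly 10x, which is the kind of number one would then hand to
coalesce instead of guessing.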

In retrospect, this was the same issue we'd run into before (having too
many partitions), which we had previously solved by throwing some guesses
at coalesce to make it magically go away.

But now I feel like we have a much better understanding of why the
numbers need to be what they are, which is great.

So, thanks for all the input and for helping me understand what's going on.

It'd be great to see some of the optimizations to BlockManager happen,
but I understand in the end why it needs to track what it does. And I
was also admittedly using a small cluster anyway.

- Stephen
