spark-user mailing list archives

From Mohit Jaggi
Subject Re: ExternalAppendOnlyMap: Spilling in-memory map
Date Thu, 22 May 2014 21:02:02 GMT
I did not register anything explicitly based on the belief that the class
name is written out in full only once. I also wondered why that problem
would be specific to JodaTime and not show up with java.util.Date. I guess
it is possible based on the internals of Joda time.
If I remove DateTime from my RDD, the problem goes away.
I will try explicit registration (and add DateTime back to my RDD) and see
if that makes things better.
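
To put rough numbers on the class-name overhead discussed in this thread, here is a minimal sketch in plain Java. It is a toy encoding, not Kryo's actual wire format: the class `ClassNameOverhead`, its methods, and the one-byte length prefix and one-byte class id are all assumptions made up for illustration. It only compares repeating the full class name in front of every 8-byte value against writing a small id instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy illustration only: this is NOT Kryo's actual wire format.
final class ClassNameOverhead {

    // Encode one long as 8 big-endian bytes.
    private static byte[] longBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Total bytes when every record repeats the full class name
    // (1 length byte + name bytes) in front of its 8-byte payload.
    static int sizeWithClassName(String className, int records) {
        byte[] name = className.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < records; i++) {
            out.write(name.length);                 // length prefix (assumes name < 256 bytes)
            out.write(name, 0, name.length);        // full class name, repeated per record
            byte[] payload = longBytes(i);
            out.write(payload, 0, payload.length);  // the value itself, roughly "1 long"
        }
        return out.size();
    }

    // Total bytes when every record carries a 1-byte class id
    // (standing in for a registered class) plus its 8-byte payload.
    static int sizeWithId(int classId, int records) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < records; i++) {
            out.write(classId);                     // hypothetical 1-byte id
            byte[] payload = longBytes(i);
            out.write(payload, 0, payload.length);
        }
        return out.size();
    }

    public static void main(String[] args) {
        String name = "org.joda.time.DateTime";
        int n = 1000;
        System.out.println("class name per record: " + sizeWithClassName(name, n) + " bytes");
        System.out.println("small id per record:   " + sizeWithId(1, n) + " bytes");
    }
}
```

Under these assumptions each record costs 31 bytes with the name repeated (1 + 22 + 8) versus 9 bytes with an id, so the name alone more than triples the size. In Spark, explicit registration is normally done by implementing `KryoRegistrator`, calling `kryo.register(...)` for each class in `registerClasses`, and pointing `spark.kryo.registrator` at that implementation; the exact setup depends on the Spark version in use.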


On Wed, May 21, 2014 at 8:36 PM, Andrew Ash wrote:

> Hi Mohit,
> The log line about the ExternalAppendOnlyMap is more of a symptom of
> slowness than causing slowness itself.  The ExternalAppendOnlyMap is used
> when a shuffle is causing too much data to be held in memory.  Rather than
> OOM'ing, Spark writes the data out to disk in a sorted order and reads it
> back from disk later on when it's needed.  That's the job of the
> ExternalAppendOnlyMap.
> I wouldn't normally expect a conversion from Date to a Joda DateTime to
> take significantly more memory.  But since you're using Kryo and classes
> should be registered with it, you may have forgotten to register DateTime
> with Kryo.  If you don't register a class, it writes the class name at the
> beginning of every serialized instance, which, for DateTime objects of size
> roughly 1 long, is a ton of extra space and very inefficient.
> Can you confirm that DateTime is registered with Kryo?
> On Wed, May 21, 2014 at 2:35 PM, Mohit Jaggi wrote:
>> Hi,
>> I changed my application to use Joda time instead of java.util.Date and I
>> started getting this:
>> WARN ExternalAppendOnlyMap: Spilling in-memory map of 484 MB to disk (1
>> time so far)
>> What does this mean? How can I fix this? Due to this a small job takes
>> forever.
>> Mohit.
>> P.S.: I am using Kryo serialization, have played around with several
>> values of sparkRddMemFraction
