spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Patrick Wendell <>
Subject Re: What should happen if we try to cache more data than the cluster can hold in memory?
Date Mon, 04 Aug 2014 08:10:11 GMT
It seems possible that you are running out of memory unrolling a single
partition of the RDD. This is something that can cause your executor to
OOM, especially if the cache is close to being full so the executor doesn't
have much free memory left. How large are your executors? At the time of
failure, is the cache already nearly full?

I also believe the Snappy compression codec in Hadoop is not splittable.
This means that each of your JSON files is read in its entirety as one
spark partition. If you have files that are larger than the standard block
size (128MB), it will exacerbate this shortcoming of Spark. Incidentally,
this means minPartitions won't help you at all here.

This is fixed in the master branch and will be fixed in Spark 1.1. As a
debugging step (if this is doable), it's worth running this job on the
master branch and seeing if it succeeds.

A (potential) workaround would be to first persist your data to disk, then
re-partition it, then cache it. I'm not 100% sure whether that will work

 val a = sc.textFile("s3n://some-path/*.json").persist(DISK_ONLY).repartition(larger
nr of partitions).cache()

- Patrick

On Fri, Aug 1, 2014 at 10:17 AM, Nicholas Chammas <> wrote:

> On Fri, Aug 1, 2014 at 12:39 PM, Sean Owen <> wrote:
> Isn't this your worker running out of its memory for computations,
>> rather than for caching RDDs?
> I'm not sure how to interpret the stack trace, but let's say that's true.
> I'm even seeing this with a simple a = sc.textFile().cache() and then
> a.count(). Spark shouldn't need that much memory for this kind of work,
> no?
> then the answer is that you should tell
>> it to use less memory for caching.
> I can try that. That's done by changing,
> right?
> This still seems strange though. The default fraction of the JVM left for
> non-cache activity (1 - 0.6 = 40%
> <>)
> should be plenty for just counting elements. I'm using m1.xlarge nodes
> that have 15GB of memory apiece.
> Nick

View raw message