spark-user mailing list archives

From Sai Prasanna <ansaiprasa...@gmail.com>
Subject Re: GC overhead limit exceeded
Date Thu, 27 Mar 2014 18:57:18 GMT
No, I am running on 0.8.1.
Yes, I am caching a lot. I am benchmarking a simple Spark program that
reads 512 MB, 1 GB and 2 GB text files, applies some basic intermediate
operations, and caches the intermediate results that are used in
subsequent operations.

I thought we did not need to unpersist manually: if I cache something
and the cache is full, space would be freed automatically by evicting
earlier entries. Do I need to unpersist?
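
To make the eviction behaviour described above concrete, here is a toy
model in plain Python. It is not Spark code and makes no claim about how
Spark's block manager is actually implemented; it only assumes a
size-bounded cache that drops the least recently used block when a new one
does not fit. The class name LruBlockCache, its methods, and the sizes in
MB are all hypothetical names chosen for illustration.

```python
from collections import OrderedDict


class LruBlockCache:
    """Toy size-bounded cache that evicts least recently used blocks.

    Hypothetical illustration only; this is not Spark's implementation.
    """

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.used_mb = 0
        self.blocks = OrderedDict()  # block name -> size in MB, oldest first

    def put(self, name, size_mb):
        """Cache a block, evicting older ones if needed. Returns evicted names."""
        evicted = []
        if size_mb > self.capacity_mb:
            return evicted  # too large to cache at all; nothing is evicted
        if name in self.blocks:
            # Re-caching an existing block: release its old size first.
            self.used_mb -= self.blocks.pop(name)
        while self.used_mb + size_mb > self.capacity_mb:
            old_name, old_size = self.blocks.popitem(last=False)  # oldest entry
            self.used_mb -= old_size
            evicted.append(old_name)
        self.blocks[name] = size_mb
        self.used_mb += size_mb
        return evicted

    def get(self, name):
        """Mark a block as recently used; return True on a cache hit."""
        if name not in self.blocks:
            return False
        self.blocks.move_to_end(name)
        return True

    def unpersist(self, name):
        """Free a block explicitly instead of waiting for eviction."""
        size = self.blocks.pop(name, None)
        if size is not None:
            self.used_mb -= size
```

In this model, eviction happens only when a new block does not fit, so the
cache tends to sit near its limit between insertions. Calling unpersist()
on data that is no longer needed frees that space right away instead.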

Moreover, if I run the job several times, will the previously cached RDDs
still remain in the cache? If so, can I flush them out manually before the
next run? [something like a complete cache flush]


On Thu, Mar 27, 2014 at 11:16 PM, Andrew Or <andrew@databricks.com> wrote:

> Are you caching a lot of RDDs? If so, maybe you should unpersist() the
> ones that you're not using. Also, if you're on 0.9, make sure
> spark.shuffle.spill is enabled (which it is by default). This allows your
> application to spill in-memory content to disk if necessary.
>
> How much memory are you giving to your executors? The default value of
> spark.executor.memory is 512m, which is quite low. Consider raising this.
> Checking the web UI is a good way to figure out your runtime memory usage.
>
>
> On Thu, Mar 27, 2014 at 9:22 AM, Ognen Duzlevski <
> ognen@plainvanillagames.com> wrote:
>
>> Look at the tuning guide on Spark's webpage for strategies to cope with
>> this.
>> I have run into quite a few memory issues like these. Some are resolved
>> by changing the StorageLevel strategy and employing things like Kryo; some
>> are solved by specifying the number of tasks to break a given operation
>> down into, etc.
>>
>> Ognen
>>
>>
>> On 3/27/14, 10:21 AM, Sai Prasanna wrote:
>>
>> "java.lang.OutOfMemoryError: GC overhead limit exceeded"
>>
>> What is the problem? The same code runs in 8 seconds on one run, and on
>> the next it takes a really long time, say 300-500 seconds.
>> In the logs I see a lot of "GC overhead limit exceeded" errors. What
>> should be done?
>>
>> Could someone please shed some light on this?
>>
>>
>>
>>  --
>>  *Sai Prasanna. AN*
>> *II M.Tech (CS), SSSIHL*
>>
>>
>> * Entire water in the ocean can never sink a ship, Unless it gets inside.
>> All the pressures of life can never hurt you, Unless you let them in.*
>>
>>
>>
>


-- 
*Sai Prasanna. AN*
*II M.Tech (CS), SSSIHL*


*Entire water in the ocean can never sink a ship, Unless it gets inside.
All the pressures of life can never hurt you, Unless you let them in.*
