spark-user mailing list archives

From "Xu (Simon) Chen" <xche...@gmail.com>
Subject compress in-memory cache?
Date Thu, 05 Jun 2014 14:41:34 GMT
I have a working set larger than available memory, so I was hoping to turn
on rdd compression to store more data in memory. Strangely, it made no
difference: the number of cached partitions, fraction cached, and size in
memory all remain the same. Any ideas?
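
For context, a minimal sketch of the kind of setup I mean (the app name and
input path are made up; the RDD is just cached with the default storage level):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("rdd-compress-test")      // hypothetical app name
  .set("spark.rdd.compress", "true")    // turn on RDD compression

val sc = new SparkContext(conf)

val rdd = sc.textFile("hdfs:///path/to/working-set")  // hypothetical input

rdd.cache()   // default storage level
rdd.count()   // materialize the cache, then check the Storage tab in the UI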

I confirmed that rdd compression was off in the first test and on in the
second:

scala> sc.getConf.getAll foreach println
...
(spark.rdd.compress,true)
...

I haven't tried lzo vs. snappy, but my guess is that either one should
provide at least some benefit.
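
For reference, a sketch of how I'd switch codecs (assuming the
spark.io.compression.codec setting, with the full codec class names):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.rdd.compress", "true")
  .set("spark.io.compression.codec", "org.apache.spark.io.SnappyCompressionCodec")
  // or "org.apache.spark.io.LZFCompressionCodec"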

Thanks.
-Simon
