spark-user mailing list archives

From "Haopu Wang"
Subject number of "Cached Partitions" vs. "Total Partitions"
Date Tue, 22 Jul 2014 07:09:18 GMT
Hi, I'm running Spark in local mode and reading a text file into an RDD
with the JavaSparkContext.textFile() API.

I then call the cache() method on the resulting RDD.


Looking at the Storage information, I see that the RDD has 3 partitions,
but only 2 of them have been cached.

Is this normal behavior? I assumed that either all of the partitions
would be cached or none of them.

If I'm wrong, in what cases is the number of "cached" partitions
less than the total number of partitions?
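
For reference, the setup described above can be sketched roughly as follows. This is a minimal illustration and not the code from the original program. The class name, application name, and input path are placeholders. The count() call is an assumption, since the message does not say which action was run. The check uses SparkContext.getRDDStorageInfo(), a developer API, to read the same cached and total partition counts that the Storage page reports.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.storage.RDDInfo;

// Hypothetical class name, used only for illustration.
public class CachedPartitionsCheck {

    /** Returns {cached partitions, total partitions} for the RDD read from path. */
    static int[] cachedVersusTotal(JavaSparkContext sc, String path) {
        // Read the text file and mark the resulting RDD for caching.
        JavaRDD<String> lines = sc.textFile(path).cache();

        // Assumption: some action is run so that partitions get computed.
        // The original message does not say which action was used.
        lines.count();

        // Look up this RDD among the RDDs that Spark reports as persisted.
        for (RDDInfo info : sc.sc().getRDDStorageInfo()) {
            if (info.id() == lines.id()) {
                return new int[] { info.numCachedPartitions(), info.numPartitions() };
            }
        }
        // The RDD is not listed, so report zero cached partitions.
        return new int[] { 0, lines.rdd().partitions().length };
    }

    public static void main(String[] args) {
        // "local" master as in the message; the app name is a placeholder.
        SparkConf conf = new SparkConf().setMaster("local").setAppName("cached-partitions-check");
        JavaSparkContext sc = new JavaSparkContext(conf);
        try {
            // args[0] is the path to the input text file (placeholder).
            int[] counts = cachedVersusTotal(sc, args[0]);
            System.out.println("cached=" + counts[0] + " total=" + counts[1]);
        } finally {
            sc.stop();
        }
    }
}
```

Running this against the same input file prints the two counts next to each other, which makes it easy to compare them with what the Storage page shows.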


