spark-user mailing list archives

From Jaonary Rabarisoa <>
Subject N-Fold validation and RDD partitions
Date Fri, 21 Mar 2014 13:32:34 GMT

I need to split my data, represented as an RDD, into n folds, run a metrics
computation on each fold, and finally compute the mean of my metrics
over all the folds.
Can Spark do this data partitioning out of the box, or do I need to
implement it myself? I know that RDD has a partitions method and
mapPartitions, but I really don't understand the purpose and the meaning of
"partition" here.
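
To make concrete what I mean by "n folds", here is a rough sketch in plain
Python, with an ordinary list standing in for the RDD. The names
k_fold_splits and cross_validate are made up for illustration and are not
Spark APIs, and the metric is just a placeholder callable. I assume the same
per-fold selection could be written against an RDD with filter, but I have
not tried that.

```python
import random
from statistics import mean


def k_fold_splits(records, n_folds, seed=42):
    """Yield one (training, validation) pair per fold.

    Hypothetical helper: `records` is a plain list standing in for an RDD.
    """
    # Shuffle the positions once so the fold assignment is random but reproducible.
    order = list(range(len(records)))
    random.Random(seed).shuffle(order)
    # Record at shuffled position p goes to fold p % n_folds, so fold sizes
    # differ by at most one.
    fold_of = {idx: pos % n_folds for pos, idx in enumerate(order)}
    for fold in range(n_folds):
        training = [r for i, r in enumerate(records) if fold_of[i] != fold]
        validation = [r for i, r in enumerate(records) if fold_of[i] == fold]
        yield training, validation


def cross_validate(records, n_folds, metric, seed=42):
    """Average metric(training, validation) over all folds."""
    scores = [metric(train, valid)
              for train, valid in k_fold_splits(records, n_folds, seed)]
    return mean(scores)


if __name__ == "__main__":
    data = [float(x) for x in range(20)]
    # Placeholder metric: gap between the training mean and the validation mean.
    gap = lambda train, valid: abs(mean(train) - mean(valid))
    print(cross_validate(data, n_folds=5, metric=gap))
```

Each record lands in exactly one validation set, and the final number is the
mean of the per-fold metrics, which is the behaviour I am after.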


