spark-user mailing list archives

From Yadid Ayzenberg <ya...@media.mit.edu>
Subject JavaPairRDD mapPartitions side effects
Date Mon, 09 Dec 2013 19:06:30 GMT

Hi all,

I'm noticing some strange behavior when running mapPartitions. Pseudocode:

JavaPairRDD<Object, Tuple2<Object, BSONObject>> myRDD =
    myRDD.mapPartitions(func)

myRDD.count()

ArrayList<Tuple2<Integer, Tuple2<List<Tuple2<Double, Double>>,
    List<Tuple2<Double, Double>>>>> tempRDD = myRDD.mapPartitions(func2)

tempRDD.count()

JavaPairRDD<Object, Tuple2<Object, BSONObject>> myRDD =
    myRDD.mapPartitions(func)


It seems that mapPartitions has side effects. When I run the last
line, the contents of myRDD appear to have been changed by the
previous mapPartitions call. I thought RDDs were immutable and that it
was only possible to generate new RDDs using map. Is this incorrect?
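
For illustration, here is a minimal plain-Java sketch (no Spark dependency) of one way such a change could become visible. It rests on assumptions that are not confirmed anywhere in this message: that the function passed to mapPartitions modifies the objects it reads from the partition iterator, and that the partition's records are held in memory as live objects (for example, a cached RDD). The class Record and the methods mutatingFunc, copyingFunc and samplePartition are hypothetical names made up for this sketch; they are not part of any Spark API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class PartitionSideEffectSketch {

    // Hypothetical stand-in for a mutable record such as a BSONObject.
    static final class Record {
        double value;
        Record(double value) { this.value = value; }
    }

    // Mimics a partition function that modifies its input objects in place.
    static List<Record> mutatingFunc(Iterator<Record> partition) {
        List<Record> out = new ArrayList<>();
        while (partition.hasNext()) {
            Record r = partition.next();
            r.value *= 2;      // side effect: the source object itself changes
            out.add(r);
        }
        return out;
    }

    // Same transformation, but every output is a newly allocated object.
    static List<Record> copyingFunc(Iterator<Record> partition) {
        List<Record> out = new ArrayList<>();
        while (partition.hasNext()) {
            Record r = partition.next();
            out.add(new Record(r.value * 2));  // source object left untouched
        }
        return out;
    }

    // Assumed stand-in for one partition whose records are kept in memory.
    static List<Record> samplePartition() {
        List<Record> p = new ArrayList<>();
        p.add(new Record(1.0));
        p.add(new Record(2.0));
        return p;
    }

    public static void main(String[] args) {
        List<Record> inMemory = samplePartition();

        copyingFunc(inMemory.iterator());
        System.out.println("after copyingFunc: " + inMemory.get(0).value);   // 1.0

        mutatingFunc(inMemory.iterator());
        System.out.println("after mutatingFunc: " + inMemory.get(0).value);  // 2.0
    }
}
```

In this sketch the list of records is never replaced, yet its elements change once mutatingFunc runs. If something similar applies here, having func and func2 build new output objects instead of editing their inputs would be one thing to try.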


Thanks,
Yadid

