spark-user mailing list archives

From Michal Klos <michal.klo...@gmail.com>
Subject RDD resiliency -- does it keep state?
Date Fri, 27 Mar 2015 19:36:37 GMT
Hi Spark group,

We haven't been able to find a clear description of how Spark's RDD
resiliency interacts with actions that have side effects.
If you do an `rdd.foreach(someSideEffect)`, then you are performing a
side effect for each element in the RDD. If a partition goes down, the
resiliency mechanism rebuilds the data, but does it keep track of how far
it got in the partition's set of data, or will it start from the beginning
again? In other words, will it do at-least-once execution of foreach
closures, or at-most-once?
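
To make the question concrete, here is a toy model in plain Python of the two behaviors we are asking about. It does not use or describe the Spark API: `run_partition`, `fail_at`, and `resume` are hypothetical names made up for illustration, and the single simulated failure stands in for a lost partition.

```python
def run_partition(elements, side_effect, fail_at=None, resume=False):
    """Apply side_effect to each element, failing once just before index fail_at.

    resume=False: the retry starts again from the first element.
    resume=True:  the retry continues from the element where the failure hit.
    """
    start = 0
    failed_once = False
    while True:
        try:
            for i in range(start, len(elements)):
                if i == fail_at and not failed_once:
                    failed_once = True
                    if resume:
                        start = i  # remember progress (hypothetical behavior)
                    raise RuntimeError("simulated partition loss")
                side_effect(elements[i])
            return
        except RuntimeError:
            continue  # retry the partition


restarted = []
run_partition([1, 2, 3, 4], restarted.append, fail_at=2, resume=False)
print(restarted)  # [1, 2, 1, 2, 3, 4]

resumed = []
run_partition([1, 2, 3, 4], resumed.append, fail_at=2, resume=True)
print(resumed)  # [1, 2, 3, 4]
```

In the first run the side effect fires twice for the elements handled before the failure; in the second it fires once per element. Which of these (if either) matches what Spark does for `foreach` is what we would like to know.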

thanks,
Michal
