spark-user mailing list archives

From YANG Fan <>
Subject spark fault tolerance mechanism
Date Thu, 15 Jan 2015 10:42:38 GMT

I'm quite interested in how Spark's fault tolerance works and I'd like to
ask a question here.

According to the paper, there are two kinds of dependencies: wide dependencies
and narrow dependencies. My understanding is that if the operations I use are
all "narrow", then when one machine crashes, the system only needs to recover
the lost RDD partitions from the most recent checkpoint. However, if the
transformations are "wide" (e.g. in calculating PageRank), then when one node
crashes, all other nodes need to roll back to the most recent checkpoint. Is
my understanding correct?
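For what it's worth, the distinction can be illustrated with a small sketch
(plain Python, not Spark's actual implementation; the partition data, the
`map_partition`/`reduce_partition` helpers, and the first-character
partitioner are all made up for illustration): with a narrow dependency a
lost child partition depends on exactly one parent partition, while with a
wide (shuffle) dependency it depends on records drawn from every parent
partition.

```python
# Illustrative sketch (NOT Spark code): narrow vs. wide dependencies
# and what each implies for recomputing a single lost partition.

# Hypothetical parent RDD: 3 partitions of (key, value) pairs.
parent = [
    [("a", 1), ("b", 2)],   # partition 0
    [("a", 3), ("c", 4)],   # partition 1
    [("b", 5), ("c", 6)],   # partition 2
]

def map_partition(i):
    """Narrow dependency (map-like): child partition i is computed
    from parent partition i alone, so recovering one lost child
    partition reads exactly one parent partition."""
    return [(k, v * 10) for k, v in parent[i]]

def reduce_partition(i, num_partitions=2):
    """Wide dependency (reduceByKey-like shuffle): child partition i
    gathers matching keys from EVERY parent partition, so recomputing
    one lost child partition must read all parent partitions.
    (Deterministic toy partitioner: first character of the key.)"""
    totals = {}
    for part in parent:                       # touches every parent partition
        for k, v in part:
            if ord(k[0]) % num_partitions == i:
                totals[k] = totals.get(k, 0) + v
    return totals

# Narrow: losing child partition 1 only requires parent partition 1.
recovered_narrow = map_partition(1)

# Wide: losing shuffle output partition 0 requires scanning all parents.
recovered_wide = reduce_partition(0)
```

The sketch is why narrow-dependency recovery is cheap (recompute only the
lost partitions from their direct parents, via lineage) whereas a lost
partition behind a shuffle pulls work from every upstream partition, which
is where checkpointing the shuffled RDD pays off.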


Best Regards,
