spark-user mailing list archives

From Fengdong Yu <fengdo...@everstring.com>
Subject Re: Accumulators internals and reliability
Date Mon, 26 Oct 2015 10:55:36 GMT

Hi Sela,

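Yes, accumulators are reliable. You use one much like a local variable: add to it inside your tasks
(for example, in the function passed to RDD.map) and read its value in the driver after the job completes.
One caveat: Spark guarantees that each task's update is applied exactly once only for updates made inside
actions. Updates made inside transformations such as map can be applied more than once if a task or stage is re-executed.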
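
To make the flow concrete, here is a small toy model in plain Python. It is not Spark code, and every name in it (run_attempt, run_job, attempt_plan) is made up for illustration. It only sketches the behaviour Spark documents for accumulator updates made inside actions: each task attempt works on its own local copy starting from zero, the update travels back with the task result, and the driver merges one successful attempt per partition, so failed or duplicate (speculative) attempts do not count twice. In PySpark the real counterparts would be sc.accumulator(0), acc.add(1) inside the task, and acc.value in the driver.

```python
# Toy model of accumulator semantics for actions. NOT Spark code.
# run_attempt, run_job and attempt_plan are hypothetical names.

def run_attempt(partition, fail=False):
    """Run one task attempt over a partition.

    Returns the attempt's local accumulator delta, or None if the
    attempt failed (a failed attempt reports nothing to the driver).
    """
    local = 0                      # each attempt starts from the zero value
    for _record in partition:
        local += 1                 # plays the role of acc.add(1) in a task
    return None if fail else local


def run_job(partitions, attempt_plan):
    """Driver side: merge exactly one successful attempt per partition."""
    total = 0
    for idx, partition in enumerate(partitions):
        merged = False
        for fails in attempt_plan[idx]:
            delta = run_attempt(partition, fail=fails)
            if delta is not None and not merged:
                total += delta     # first successful attempt is merged
                merged = True      # later duplicates are ignored
    return total


partitions = [[1, 2, 3], [4, 5], [6]]
attempt_plan = [
    [True, False],   # partition 0: first attempt fails, the retry succeeds
    [False, False],  # partition 1: two successful attempts (speculative copy)
    [False],         # partition 2: a single successful attempt
]
record_count = run_job(partitions, attempt_plan)
print(record_count)  # 6
```

The value is only meaningful in the driver once the job has finished. Inside a running task, an attempt sees just its own local delta, which is why reading an accumulator during task execution tells you little.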





> On Oct 26, 2015, at 5:13 PM, Sela, Amit <ANSELA@paypal.com.INVALID> wrote:
> 
> It seems like there is not much literature about Spark's Accumulators, so I thought I'd ask here:
> 
> Do Accumulators reside in a Task? Are they serialized with the task? Are they sent back on task completion as part of the ResultTask?
> 
> Are they reliable? If so, when? Can I rely on an accumulator's value only after the task has completed successfully (meaning in the driver), or also during task execution (what about speculative execution)?
> 
> What are the limitations on the number (or size) of Accumulators?
> 
> Thanks,
> Amit

