spark-user mailing list archives

From Daniel Siegmann <dsiegm...@securityscorecard.io>
Subject Re: [Spark] Accumulators or count()
Date Wed, 01 Mar 2017 15:30:06 GMT
As you noted, accumulators do not guarantee accurate results except in
specific situations: Spark applies each task's update exactly once only
when the update happens inside an action. I recommend never using them.

This article goes into some detail on the problems with accumulators:
http://imranrashid.com/posts/Spark-Accumulators/
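
To make the failure mode concrete, here is a minimal sketch in plain Python.
It is a toy model, not Spark code: ToyAccumulator, run_task and run_stage are
hypothetical names invented for this illustration, and the "retry" is simulated
by hand. It assumes the counter is bumped as a side effect inside the
per-record work (as with an accumulator updated in a transformation), while
the count is derived from the task output that is actually kept.

```python
# Toy model of a stage with one re-executed task. Not Spark's API:
# every name here is hypothetical and exists only for illustration.

class ToyAccumulator:
    """Shared counter updated as a side effect of running a task."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n


def run_task(partition, acc):
    """Process one partition, bumping the counter once per record."""
    out = []
    for record in partition:
        acc.add(1)          # side effect: fires on every execution
        out.append(record)
    return out


def run_stage(partitions, acc, retry_partitions=frozenset()):
    """Run every task once; re-run the tasks listed in retry_partitions.

    A re-run replaces the task's earlier output, but the side effects
    of the first attempt have already been applied to the counter.
    """
    results = {}
    for idx, partition in enumerate(partitions):
        results[idx] = run_task(partition, acc)
        if idx in retry_partitions:
            results[idx] = run_task(partition, acc)  # simulated re-execution
    return results


partitions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
acc = ToyAccumulator()
results = run_stage(partitions, acc, retry_partitions={1})

# Count derived from the kept output: each partition contributes once.
counted = sum(len(rows) for rows in results.values())

print(counted)    # 9
print(acc.value)  # 11 (partition 1 was counted twice)
```

In this model the count built from task output stays at 9 while the
side-effect counter drifts to 11 after a single re-execution, which is the
kind of discrepancy the article above describes.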


On Wed, Mar 1, 2017 at 7:26 AM, Charles O. Bajomo <
charles.bajomo@pretechconsulting.co.uk> wrote:

> Hello everyone,
>
> I wanted to know if there is any benefit to using an accumulator over just
> executing a count() on the whole RDD. There seem to be a lot of issues
> with accumulators during a stage failure, and there also seems to be an
> issue rebuilding them if the application restarts from a checkpoint. Does
> anyone have any suggestions on this?
>
> Thanks
>
