spark-user mailing list archives

From Mayur Rustagi <mayur.rust...@gmail.com>
Subject Re: Accumulator and Accumulable vs classic MR
Date Fri, 01 Aug 2014 22:44:51 GMT
The only blocker is that an accumulator can only be "added" to from the slaves and
only read on the master. If that constraint fits your use case, you can fire away.
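To make the semantics concrete, here is a plain-Python sketch (not Spark API; all names are illustrative) of the single-pass pattern Julien describes: during the assignment pass, per-cluster sums and counts are only ever added to, which is the one operation an accumulator permits on the workers, and the accumulated state is read exactly once afterwards, on the driver side, to form the new centroids.

```python
# Toy data and initial centroids for a 2-cluster k-means step.
points = [(1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (7.8, 8.2)]
centroids = [(0.0, 0.0), (10.0, 10.0)]
k = len(centroids)

# Accumulator state: per-cluster coordinate sums and point counts.
# In Spark this would live in an Accumulable; here it is plain local state.
sums = [(0.0, 0.0)] * k
counts = [0] * k

def nearest(p):
    # Index of the closest centroid by squared Euclidean distance.
    return min(range(k), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                       + (p[1] - centroids[i][1]) ** 2)

# The "assignment" pass: each point is assigned, and the accumulator
# is only added to -- never read -- during the pass.
for p in points:
    c = nearest(p)
    sx, sy = sums[c]
    sums[c] = (sx + p[0], sy + p[1])
    counts[c] += 1

# Read once on the "driver": new centroids from the accumulated state.
centroids = [(sums[i][0] / counts[i], sums[i][1] / counts[i])
             for i in range(k)]
```

The key point is that the accumulated sums are never consulted inside the loop, so the same computation distributes cleanly: workers add partial sums, and only the driver combines and reads them.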

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Fri, Aug 1, 2014 at 7:38 AM, Julien Naour <julnaour@gmail.com> wrote:

> Hi,
>
> My question is simple: could there be a performance issue in using
> Accumulable/Accumulator instead of methods like map(), reduce(), etc.?
>
> My use case: implementing a clustering algorithm like k-means.
> At the beginning I used two steps, one to assign data to clusters and another
> to calculate the new centroids.
> After some research I now use an Accumulable with an Array to calculate the
> new centroids during the assignment of the data. It's easier to understand
> and, for the moment, it gives better performance.
> That's probably because I used 2 steps before and now only one, thanks to the
> Accumulable.
>
> So, are there any indications against it?
>
> Cheers,
>
> Julien
>
>
