spark-user mailing list archives

From Tobias Pfeiffer <...@preferred.jp>
Subject Re: RDD Moving Average
Date Fri, 09 Jan 2015 06:51:21 GMT
Hi,

On Wed, Jan 7, 2015 at 9:47 AM, Asim Jalis <asimjalis@gmail.com> wrote:

> One approach I was considering was to use mapPartitions. It is
> straightforward to compute the moving average over a partition, except for
> near the end point. Does anyone see how to fix that?
>

Well, I don't think mapPartitions is a good fit here, in particular
because you would have to handle the windows that span partition
boundaries (near the beginning and end of each partition) yourself. I
would rather go with the high-level RDD functions that are
partition-independent.
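
To make the boundary problem concrete, here is a minimal sketch in plain
Python (not Spark code). It assumes the partitions are ordered lists held
in memory, and the function name moving_average_partitioned is made up for
illustration. Each partition borrows the first window-1 elements of the
partitions after it, so that windows starting near its end are complete:

```python
from typing import List


def moving_average_partitioned(partitions: List[List[float]], window: int) -> List[float]:
    """Moving average over ordered partitions (plain lists standing in for RDD partitions)."""
    if window < 1:
        raise ValueError("window must be >= 1")
    result = []
    for i, part in enumerate(partitions):
        # Borrow up to window-1 leading elements from the following partitions.
        # This is the extra work a per-partition approach has to do by hand.
        tail = []
        for nxt in partitions[i + 1:]:
            if len(tail) >= window - 1:
                break
            tail.extend(nxt[: window - 1 - len(tail)])
        extended = part + tail
        # Emit only the windows that start inside this partition.
        for start in range(len(part)):
            chunk = extended[start:start + window]
            if len(chunk) == window:  # skip incomplete windows at the very end
                result.append(sum(chunk) / window)
    return result


averages = moving_average_partitioned([[1, 2, 3], [4, 5], [6]], 3)
```

Without the borrowed elements, the windows [2, 3, 4] and [3, 4, 5] would
be lost, because they start in the first partition and end in the second.
In Spark, a partition cannot read its neighbours from inside
mapPartitions, so those leading elements would have to be gathered in a
separate step first.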

By the way, I am now also trying to implement sliding windows based on
count and embedded timestamp... it seems I should have looked at
rdd.sliding() (from MLlib's RDDFunctions) first...

Tobias
