spark-user mailing list archives

From Michael Malak <>
Subject Re: RDD Moving Average
Date Tue, 06 Jan 2015 20:45:19 GMT
Asim Jalis <> writes:

> Thanks. Another question. I have event data with timestamps. I want to create a
> sliding window using timestamps. Some windows will have a lot of events in them,
> others won't. Is there a way to get an RDD made of this kind of a variable length
> window?
You should consider map()ing each event to a (K,V) Tuple2, where K identifies the window
that the timestamp falls in (e.g. if you want 5-minute windows, then K could be the timestamp
rounded down to the nearest 5-minute start point). Then you can use reduceByKey() to
aggregate on a per-window basis.
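
As a rough sketch of that idea, here is the keying and reducing step in plain Python, without a Spark cluster. The event layout (epoch seconds, value), the names window_start and reduce_by_key, and the sample data are all assumptions made up for illustration; reduce_by_key is only a local stand-in for Spark's reduceByKey(), not the real thing.

```python
WINDOW_SECONDS = 5 * 60  # 5-minute windows, as in the example above


def window_start(ts, width=WINDOW_SECONDS):
    """Round an epoch timestamp (in seconds) down to the start of its window."""
    return ts - (ts % width)


def reduce_by_key(pairs, func):
    """Local stand-in for Spark's reduceByKey(): fold values that share a key."""
    out = {}
    for key, value in pairs:
        out[key] = func(out[key], value) if key in out else value
    return out


# Hypothetical events: (epoch seconds, value)
events = [(0, 1.0), (10, 2.0), (299, 3.0), (300, 4.0), (905, 5.0)]

# The map() step: key each event by its window start, carrying (sum, count).
keyed = [(window_start(ts), (value, 1)) for ts, value in events]

# The reduceByKey() step: add up sums and counts within each window.
totals = reduce_by_key(keyed, lambda a, b: (a[0] + b[0], a[1] + b[1]))

# Per-window average; each window holds however many events fell into it.
averages = {key: total / count for key, (total, count) in totals.items()}
```

On an actual PySpark RDD of (timestamp, value) tuples, the same shape would presumably be rdd.map(lambda e: (window_start(e[0]), (e[1], 1))).reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1])). Note that a window containing no events never shows up as a key, so empty windows are simply absent from the result.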
