spark-user mailing list archives

From Michael Malak <michaelma...@yahoo.com.INVALID>
Subject Re: RDD Moving Average
Date Tue, 06 Jan 2015 20:45:19 GMT
Asim Jalis <asimjalis@gmail.com> writes:

> Thanks. Another question. I have event data with timestamps. I want to
> create a sliding window using timestamps. Some windows will have a lot of
> events in them others won’t. Is there a way to get an RDD made of this
> kind of a variable length window?

You should consider map()ing to (K,V) Tuple2s where K identifies the window (e.g. if you
want 5-minute windows, K could be the timestamp rounded down to the nearest 5-minute start
point). Then you can use reduceByKey() to aggregate on a per-window basis.
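
To make the keying step concrete, here is a rough sketch in plain Python rather than Spark itself. The event tuples, the window_start() helper and the local reduce_by_key() stand-in are all hypothetical names made up for illustration; timestamps are assumed to be epoch seconds, and the aggregate is assumed to be a per-window average. Note that rounding down gives non-overlapping (tumbling) windows, and windows with no events simply produce no key.

```python
WINDOW_SECONDS = 5 * 60  # 5-minute windows, as in the example above


def window_start(ts, width=WINDOW_SECONDS):
    """Round an epoch-seconds timestamp down to the start of its window."""
    return ts - (ts % width)


def reduce_by_key(pairs, func):
    """Local stand-in for reduceByKey(): fold the values of each key with func."""
    acc = {}
    for key, value in pairs:
        acc[key] = func(acc[key], value) if key in acc else value
    return acc


# Hypothetical events: (epoch_seconds, value)
events = [(0, 1.0), (10, 2.0), (299, 3.0), (300, 4.0), (905, 5.0)]

# map() step: key each event by its window start, carrying (sum, count)
keyed = [(window_start(ts), (v, 1)) for ts, v in events]

# reduceByKey() step: add up sums and counts within each window
totals = reduce_by_key(keyed, lambda a, b: (a[0] + b[0], a[1] + b[1]))

# Per-window average; the window starting at 600 has no events, so no key
averages = {k: s / n for k, (s, n) in totals.items()}

# Assumed PySpark equivalent of the two steps above, given an RDD `rdd`
# of (epoch_seconds, value) pairs:
#   rdd.map(lambda e: (window_start(e[0]), (e[1], 1))) \
#      .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
```

Carrying (sum, count) instead of a running mean keeps the reduce function associative and commutative, which reduceByKey() needs since it combines values in no particular order.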
