spark-user mailing list archives

From Tathagata Das <tathagata.das1...@gmail.com>
Subject Re: streaming window not behaving as advertised (v1.0.1)
Date Wed, 23 Jul 2014 05:13:10 GMT
It could be related to this bug that is currently open.
https://issues.apache.org/jira/browse/SPARK-1312

Here is a workaround. Can you add an inputStream.foreachRDD(rdd => { }) call and
try these combos again?
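
For reference, a minimal sketch of how that no-op call might be wired into a
job. It assumes a socket text source on localhost:9999 and the app name
"WindowWorkaround", both placeholders since the original source is not shown.
The 5-second batch interval and the window=25, slide=10 combination come from
the report below.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowWorkaround {
  def main(args: Array[String]): Unit = {
    // local[2]: one thread for the receiver, one for processing.
    // Master and app name are placeholders for illustration.
    val conf = new SparkConf().setMaster("local[2]").setAppName("WindowWorkaround")

    // Batch interval of 5 seconds, as in the report.
    val ssc = new StreamingContext(conf, Seconds(5))

    // Hypothetical input source; substitute the real input stream here.
    val inputStream = ssc.socketTextStream("localhost", 9999)

    // The workaround: a no-op output operation registered directly
    // on the input stream.
    inputStream.foreachRDD(rdd => { })

    // One of the reported combinations: window = 25 s, slide = 10 s.
    val windowed = inputStream.window(Seconds(25), Seconds(10))

    // Print the number of records in each window to compare against
    // the expected window length.
    windowed.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Changing the two Seconds(...) arguments to window() covers the other combos.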

TD


On Tue, Jul 22, 2014 at 6:01 PM, Alan Ngai <alan@opsclarity.com> wrote:

> I have a sample application pumping out 1 record per second.  The batch
> interval is set to 5 seconds.  Here’s a list of “observed window intervals”
> vs. what was actually set:
>
> window=25, slide=25 : observed-window=25, overlapped-batches=0
> window=25, slide=20 : observed-window=20, overlapped-batches=0
> window=25, slide=15 : observed-window=15, overlapped-batches=0
> window=25, slide=10 : observed-window=20, overlapped-batches=2
> window=25, slide=5 : observed-window=25, overlapped-batches=3
>
> Can someone explain this behavior to me?  I’m trying to aggregate metrics
> by time batches, but want to skip partial batches.  Therefore, I’m trying
> to find a combination which results in 1 overlapped batch, but no
> combination I tried gets me there.
>
> Alan
>
>
