spark-user mailing list archives

From Walrus theCat <walrusthe...@gmail.com>
Subject Re: using multiple dstreams together (spark streaming)
Date Wed, 16 Jul 2014 17:35:17 GMT
Or, if not, is there a way to do this in terms of a single dstream?  Keep
in mind that dstream1, dstream2, and dstream3 have already had
transformations applied.  I tried creating the dstreams by calling .window
on the first one, but that still leaves me with three dstreams, which is
the same problem.
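
To make the single-stream idea concrete, here is a rough plain-Python model
of it. This is not Spark code, and window_means, diverging_ticks, and the
sample batches are names and data made up for illustration. It keeps one
buffer as long as the longest window and derives the shorter windows from
it, so every window's value is available in one place at each batch
interval:

```python
from collections import deque


def window_means(batches, window_sizes):
    """Yield, per batch interval, the mean over each trailing window.

    batches: iterable of lists of numbers, one list per batch interval.
    window_sizes: window lengths counted in batch intervals, e.g. (1, 2, 300).
    """
    longest = max(window_sizes)
    buffer = deque(maxlen=longest)  # keeps only what the longest window needs
    for batch in batches:
        buffer.append(batch)
        recent = list(buffer)
        means = {}
        for size in window_sizes:
            # Flatten the last `size` batches into one list of values.
            values = [v for b in recent[-size:] for v in b]
            means[size] = sum(values) / len(values) if values else None
        yield means


def diverging_ticks(batches, short, long, delta):
    """Return the interval indices where the two window means differ by more than delta."""
    hits = []
    for i, means in enumerate(window_means(batches, (short, long))):
        a, b = means[short], means[long]
        if a is not None and b is not None and abs(a - b) > delta:
            hits.append(i)
    return hits


# Hypothetical input: four 1-second batches from the same input stream.
sample_batches = [[1, 1], [1, 1], [9, 9], [9, 9]]
print(diverging_ticks(sample_batches, short=1, long=2, delta=2))
```

With the sample batches, only interval 2 is flagged: the 1-interval mean
jumps to 9 while the 2-interval mean is still 5. Whether the same shape
carries over to Spark depends on being able to tell, inside the longest
window, which records belong to the shorter ones (for example via a
timestamp on each record), which is an assumption here.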


On Wed, Jul 16, 2014 at 10:30 AM, Walrus theCat <walrusthecat@gmail.com>
wrote:

> Yeah -- I tried the .union operation and it didn't work for that reason.
> Surely there has to be a way to do this, as I imagine this is a commonly
> desired goal in streaming applications?
>
>
> On Wed, Jul 16, 2014 at 10:10 AM, Luis Ángel Vicente Sánchez <
> langel.groups@gmail.com> wrote:
>
>> I'm joining several Kafka dstreams using the join operation, but you have
>> the limitation that the batch duration has to be the same, i.e. a 1 second
>> window for all dstreams... so it would not work for you.
>>
>>
>> 2014-07-16 18:08 GMT+01:00 Walrus theCat <walrusthecat@gmail.com>:
>>
>>> Hi,
>>>
>>> My application has multiple dstreams on the same inputstream:
>>>
>>> dstream1 // 1 second window
>>> dstream2 // 2 second window
>>> dstream3 // 5 minute window
>>>
>>>
>>> I want to write logic that deals with all three windows (e.g. when the 1
>>> second window differs from the 2 second window by some delta ...)
>>>
>>> I've found some examples online (there's not much out there!), and I can
>>> only see people transforming a single dstream.  In conventional Spark, we'd
>>> do this sort of thing with a cartesian on RDDs.
>>>
>>> How can I deal with multiple dstreams at once?
>>>
>>> Thanks
>>>
>>
>>
>
