spark-user mailing list archives

From Luis Ángel Vicente Sánchez <langel.gro...@gmail.com>
Subject Re: using multiple dstreams together (spark streaming)
Date Wed, 16 Jul 2014 17:34:53 GMT
Hmm... maybe you could consume all the streams at the same time with an
actor that acts as a new DStream source. That is just a random idea, though;
I don't really know whether it would be a good idea or even possible.


2014-07-16 18:30 GMT+01:00 Walrus theCat <walrusthecat@gmail.com>:

> Yeah -- I tried the .union operation and it didn't work for that reason.
> Surely there has to be a way to do this, as I imagine this is a commonly
> desired goal in streaming applications?
>
>
> On Wed, Jul 16, 2014 at 10:10 AM, Luis Ángel Vicente Sánchez <
> langel.groups@gmail.com> wrote:
>
>> I'm joining several Kafka DStreams using the join operation, but it has
>> the limitation that the batch duration has to be the same, i.e. a 1 second
>> window for all DStreams... so it would not work for you.
>>
>>
>> 2014-07-16 18:08 GMT+01:00 Walrus theCat <walrusthecat@gmail.com>:
>>
>>> Hi,
>>>
>>> My application has multiple DStreams on the same input stream:
>>>
>>> dstream1 // 1 second window
>>> dstream2 // 2 second window
>>> dstream3 // 5 minute window
>>>
>>>
>>> I want to write logic that deals with all three windows (e.g. when the 1
>>> second window differs from the 2 second window by some delta ...)
>>>
>>> I've found some examples online (there's not much out there!), and I can
>>> only see people transforming a single DStream. In conventional Spark, we'd
>>> do this sort of thing with a cartesian on RDDs.
>>>
>>> How can I deal with multiple DStreams at once?
>>>
>>> Thanks
>>>
>>
>>
>
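
For readers of this thread, here is a minimal sketch of one way the
multi-window comparison asked about above might be approached. It is not
something proposed by either poster. It assumes that all three windows are
derived from the same input DStream and share one slide interval (only the
window lengths differ), so that their RDDs line up batch by batch and can be
combined with transformWith and cartesian. The socket source, host, port,
object name and delta value are hypothetical placeholders.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

object MultiWindowSketch {
  def main(args: Array[String]): Unit = {
    // local[2]: one thread for the socket receiver, one for processing.
    val conf = new SparkConf().setAppName("MultiWindowSketch").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(1)) // 1 second batch interval

    // Hypothetical source; any input DStream could stand in here.
    val input = ssc.socketTextStream("localhost", 9999)

    // Same slide interval for every window, different window lengths.
    val slide   = Seconds(1)
    val count1s = input.window(Seconds(1), slide).count()
    val count2s = input.window(Seconds(2), slide).count()
    val count5m = input.window(Minutes(5), slide).count()

    // count() yields one element per batch, so cartesian gives one tuple.
    val pairs = count1s.transformWith(count2s,
      (a: RDD[Long], b: RDD[Long]) => a.cartesian(b))
    val triples = pairs.transformWith(count5m,
      (ab: RDD[(Long, Long)], c: RDD[Long]) =>
        ab.cartesian(c).map { case ((x, y), z) => (x, y, z) })

    // Hypothetical rule: flag batches where the 1 second count differs from
    // the 2 second per-second average by more than delta.
    val delta = 100.0
    triples
      .filter { case (c1, c2, _) => math.abs(c1 - c2 / 2.0) > delta }
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The design choice here is to vary only the window length and keep the slide
interval fixed, since combining DStreams with different slide intervals is
the limitation mentioned earlier in the thread. Whether this fits a given
application depends on how often each window actually needs to be evaluated.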
