spark-user mailing list archives

From jay vyas <>
Subject Re: real-time streaming
Date Tue, 28 Oct 2014 17:49:57 GMT
In Spark Streaming, a "real time" stream is one that delivers data in small
batches every X seconds, and you can easily do this with Spark. Roughly, here
is how to create a stream gobbler and attach a Spark app that reads its data
every X seconds:

- Write a Runnable thread which reads data from a source.  Test that it
works independently.

- Wrap that thread in a custom Receiver: implement onStart() so that it
launches the thread above, and add logic to onStop() to safely shut that
thread down.

- Set the batch interval (e.g. to 5 seconds) when you create the streaming
context.

- Register a foreachRDD(...) on the stream in your Spark app, then start
your Spark streaming context.

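- Make sure that you launch with 2 or more worker threads or cores (for
example local[2]), since the receiver occupies one of them and the rest do
the processing.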
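
To make the first two steps concrete, here is a minimal sketch of the reader
thread and the onStart()/onStop() lifecycle described above. It uses only the
JDK so it can be tested on its own, as the first step suggests. The names
StreamGobbler and nextBatch, the Iterator source, and the local queue are all
made up for illustration. In an actual Spark app the class would presumably
extend Spark's Receiver and hand each record to Spark instead of a local
queue, and the "every X seconds" part would come from the batch interval
rather than from nextBatch().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a custom receiver. Plain JDK only, no Spark.
class StreamGobbler implements Runnable {
    private final Iterator<String> source;   // whatever you are reading from
    private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
    private volatile boolean running = false;
    private Thread worker;

    StreamGobbler(Iterator<String> source) {
        this.source = source;
    }

    // Step 1: the reader loop. Only the worker thread touches the source.
    @Override
    public void run() {
        while (running && source.hasNext()) {
            // Assumption: a real Spark receiver would hand the record to Spark here.
            buffer.add(source.next());
        }
    }

    // Step 2a: launch the reader thread.
    void onStart() {
        running = true;
        worker = new Thread(this, "stream-gobbler");
        worker.setDaemon(true);
        worker.start();
    }

    // Step 2b: ask the reader to stop and wait briefly for it to exit.
    void onStop() {
        running = false;
        if (worker == null) {
            return;
        }
        worker.interrupt();
        try {
            worker.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Wait one interval, then return everything that arrived in the meantime.
    List<String> nextBatch(long intervalMillis) {
        try {
            Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        List<String> batch = new ArrayList<>();
        buffer.drainTo(batch);
        return batch;
    }

    boolean isAlive() {
        return worker != null && worker.isAlive();
    }

    public static void main(String[] args) {
        StreamGobbler gobbler =
            new StreamGobbler(Arrays.asList("a", "b", "c").iterator());
        gobbler.onStart();
        System.out.println("batch: " + gobbler.nextBatch(500));
        gobbler.onStop();
    }
}
```

The reader only ever appends to a thread-safe queue, so the consumer can
drain a batch at any time without coordinating with it. That is the same
split you get in Spark: the receiver thread keeps gobbling while the rest of
the app processes whatever arrived during the last interval.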

On Tue, Oct 28, 2014 at 1:44 PM, ll <> wrote:

> the spark tutorial shows that we can create a stream that reads "new files"
> from a directory.
> that seems to have some lag time, as we have to write the data to file
> first
> and then wait until spark stream picks it up.
> what is the best way to implement REAL 'REAL-TIME' streaming for analysis
> in
> real time?  for example, like streaming videos, sounds, images, etc
> continuously?
> thanks!

jay vyas
