spark-user mailing list archives

From jay vyas <jayunit100.apa...@gmail.com>
Subject Re: real-time streaming
Date Tue, 28 Oct 2014 17:49:57 GMT
A REAL TIME stream, by definition, delivers data every X seconds.  You can
easily do this with Spark.  Roughly, here is how to create a stream
gobbler and attach a Spark app that reads its data every X seconds:

- Write a Runnable thread which reads data from a source.  Test that it
works independently.

- Wrap that thread in a custom Receiver: implement onStart() so that it
launches the thread above, and add logic to onStop() to safely shut that
thread down.

- Set the batch interval (the window time, e.g. 5 seconds) when you create
the streaming context.

- Register a foreachRDD(...) in your Spark app, then start your Spark
streaming context.

- Make sure that you launch with 2 or more worker threads (e.g. local[2]),
since the receiver itself occupies one of them.
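
The first two steps can be sketched in plain Java, with no Spark dependency, so the reader can be tested on its own. This is only a rough sketch under stated assumptions: SourceReader and ReaderLifecycle are hypothetical names, the source is modelled as an Iterator<String>, and a BlockingQueue stands in for the place where a real Spark Receiver would call store(...).

```java
import java.util.Iterator;
import java.util.concurrent.BlockingQueue;

// Hypothetical reader: pulls records from a source and hands them to a sink.
// Inside a Spark Receiver, sink.put(...) would be replaced by store(...).
class SourceReader implements Runnable {
    private final Iterator<String> source;
    private final BlockingQueue<String> sink;
    private volatile boolean stopped = false;

    SourceReader(Iterator<String> source, BlockingQueue<String> sink) {
        this.source = source;
        this.sink = sink;
    }

    @Override
    public void run() {
        try {
            // Keep reading until asked to stop or the source runs dry.
            while (!stopped && source.hasNext()) {
                sink.put(source.next()); // blocks while a bounded sink is full
            }
        } catch (InterruptedException e) {
            // Interrupted while blocked: restore the flag and exit the loop.
            Thread.currentThread().interrupt();
        }
    }

    void stop() {
        stopped = true;
    }
}

// Hypothetical wrapper mirroring the receiver lifecycle described above:
// onStart() launches the reader thread, onStop() shuts it down safely.
class ReaderLifecycle {
    private final SourceReader reader;
    private Thread thread;

    ReaderLifecycle(SourceReader reader) {
        this.reader = reader;
    }

    void onStart() {
        thread = new Thread(reader, "source-reader");
        thread.setDaemon(true); // do not keep the JVM alive for the reader
        thread.start();
    }

    void onStop() {
        if (thread == null) {
            return; // never started, nothing to clean up
        }
        reader.stop();
        thread.interrupt(); // wake the reader if it is blocked on put()
        try {
            thread.join(1000); // wait up to one second for a clean exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isRunning() {
        return thread != null && thread.isAlive();
    }
}
```

To try it out, hand SourceReader any iterator and a bounded queue, call onStart(), drain a few records from the queue, then call onStop() and check that isRunning() is false. The volatile flag plus interrupt() is one common way to stop a thread that may be blocked; a real receiver would also want to decide what to do with records that were read but not yet stored.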



On Tue, Oct 28, 2014 at 1:44 PM, ll <duy.huynh.uiv@gmail.com> wrote:

> the spark tutorial shows that we can create a stream that reads "new files"
> from a directory.
>
> that seems to have some lag time, as we have to write the data to file
> first and then wait until spark stream picks it up.
>
> what is the best way to implement REAL 'REAL-TIME' streaming for analysis
> in real time?  for example, like streaming videos, sounds, images, etc
> continuously?
>
> thanks!


-- 
jay vyas
