A real-time stream, by definition, delivers data every X seconds, and you can easily do this with Spark. Roughly, here is how to create a stream gobbler and attach a Spark app that reads its data every X seconds:

- Write a Runnable thread which reads data from a source.  Test that it works independently.

- Add that thread to a DStream handler (in Spark Streaming, a custom Receiver), and implement onStart() so that it launches the thread above. Add logic to onStop() to safely shut that thread down.

- Set the window time (e.g. to 5 seconds).

- Register a foreachRDD(...) in your Spark app, then start your Spark streaming context.

- Make sure that you launch with 2 or more workers, since the receiver ties up one of them and the rest are needed to process the data.
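
The thread lifecycle from the first two steps can be sketched roughly as below. This is plain Java with no Spark dependency, so that it can be tested on its own as the first step suggests. The names SourceReader and GobblerHandler, the Supplier standing in for the data source, and the in-memory queue are all hypothetical, chosen only for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Step 1: a Runnable that pulls records from a source until told to stop. */
class SourceReader implements Runnable {
    private final Supplier<String> source;     // hypothetical data source
    private final BlockingQueue<String> sink;  // stand-in for handing records to Spark
    private volatile boolean running = true;

    SourceReader(Supplier<String> source, BlockingQueue<String> sink) {
        this.source = source;
        this.sink = sink;
    }

    @Override
    public void run() {
        while (running && !Thread.currentThread().isInterrupted()) {
            sink.offer(source.get());
            try {
                Thread.sleep(10);  // pace the polling loop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // restore the flag so the loop exits
            }
        }
    }

    void stop() {
        running = false;
    }
}

/** Step 2: a handler whose onStart()/onStop() own the reader thread. */
class GobblerHandler {
    private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
    private final SourceReader reader;
    private Thread thread;

    GobblerHandler(Supplier<String> source) {
        this.reader = new SourceReader(source, buffer);
    }

    /** Launch the reader on its own thread and return immediately. */
    void onStart() {
        thread = new Thread(reader, "source-reader");
        thread.setDaemon(true);
        thread.start();
    }

    /** Ask the reader to stop, wake it if it is sleeping, and wait for it to exit. */
    void onStop() {
        if (thread == null) {
            return;  // never started
        }
        reader.stop();
        thread.interrupt();
        try {
            thread.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isRunning() {
        return thread != null && thread.isAlive();
    }

    /** Wait up to timeoutMillis for one record; returns null on timeout. */
    String poll(long timeoutMillis) {
        try {
            return buffer.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

In an actual Spark app, the handler would presumably extend org.apache.spark.streaming.receiver.Receiver, and the reader would call store(record) instead of writing to a local queue. The onStart()/onStop() shape stays the same: onStart() must not block, and onStop() must leave no thread behind.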

On Tue, Oct 28, 2014 at 1:44 PM, ll <duy.huynh.uiv@gmail.com> wrote:
the spark tutorial shows that we can create a stream that reads "new files"
from a directory.

that seems to have some lag time, as we have to write the data to file first
and then wait until spark stream picks it up.

what is the best way to implement REAL 'REAL-TIME' streaming for analysis in
real time?  for example, like streaming videos, sounds, images, etc


View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/real-time-streaming-tp17526.html

jay vyas