Spark Streaming does not pick up old files by default, so you need to start your job with master=local[2] (it needs 2 or more worker threads: one to read the files and the other to do your computation). Once the job is running, place your input files in the input directories and you should see Spark Streaming pick them up.
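As a sketch of that ordering, the skeleton below (paths and batch interval are illustrative assumptions, and it requires the Spark Streaming dependency on the classpath) starts the context first and prints whatever files are dropped into the monitored directory afterwards:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class FileStreamSketch {
    public static void main(String[] args) throws InterruptedException {
        // local[2]: at least two worker threads, as suggested in this thread
        SparkConf conf = new SparkConf()
                .setMaster("local[2]")
                .setAppName("FileStreamSketch");
        JavaStreamingContext jssc =
                new JavaStreamingContext(conf, Durations.seconds(10));

        // textFileStream only sees files that appear in the directory
        // AFTER the streaming context has started
        JavaDStream<String> lines =
                jssc.textFileStream("D:/spark/streaming example/Data Sets/training");
        lines.print();

        jssc.start();
        // ...now copy new files into the directory; each batch picks them up
        jssc.awaitTermination();
    }
}
```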

On Sun, Jun 19, 2016 at 12:37 AM, Biplob Biswas <> wrote:

I tried local[*] and local[2] and the result is the same. I don't really understand the problem here. 
How can I confirm that the files are read properly? 

Thanks & Regards
Biplob Biswas

On Sat, Jun 18, 2016 at 5:59 PM, Akhil Das <> wrote:
Looks like you need to set your master to local[2] or local[*]

On Sat, Jun 18, 2016 at 4:54 PM, Biplob Biswas <> wrote:

I implemented the StreamingKMeans example provided on the Spark website, but
in Java.
The full implementation is here,

But I am not getting anything in the output except occasional timestamps
like the one below:

Time: 1466176935000 ms

Also, I have 2 directories:
"D:\spark\streaming example\Data Sets\training"
"D:\spark\streaming example\Data Sets\test"

and inside these directories I have 1 file each, "samplegpsdata_train.txt"
and "samplegpsdata_test.txt", with the training data having 500 data points
and the test data 60 data points.
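For reference, a minimal Java sketch of the StreamingKMeans setup described in this thread might look like the following. The paths come from this message; the value of k, the decay factor, the dimensionality, and the input formats ("[x1,x2]" for training vectors, "(label,[x1,x2])" for labeled test points) are assumptions, not the poster's actual code:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.mllib.clustering.StreamingKMeans;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import scala.Tuple2;

public class StreamingKMeansSketch {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf()
                .setMaster("local[2]")
                .setAppName("StreamingKMeansSketch");
        JavaStreamingContext jssc =
                new JavaStreamingContext(conf, Durations.seconds(5));

        // Training vectors, one per line in the form "[x1,x2]"
        JavaDStream<Vector> trainingData = jssc
                .textFileStream("D:/spark/streaming example/Data Sets/training")
                .map(Vectors::parse);

        // Labeled test points, one per line in the form "(label,[x1,x2])"
        JavaDStream<LabeledPoint> testData = jssc
                .textFileStream("D:/spark/streaming example/Data Sets/test")
                .map(LabeledPoint::parse);

        // k=2 clusters over 2-dimensional points; these values are assumed
        StreamingKMeans model = new StreamingKMeans()
                .setK(2)
                .setDecayFactor(1.0)
                .setRandomCenters(2, 0.0, 11L);

        // Update the model on the training stream, predict on the test stream
        model.trainOn(trainingData);
        model.predictOnValues(
                testData.mapToPair(lp -> new Tuple2<>(lp.label(), lp.features())))
                .print();

        jssc.start();
        // Files must be added to the directories AFTER this point
        jssc.awaitTermination();
    }
}
```

If the files are already in the directories before `jssc.start()` runs, the batches stay empty and only the "Time: ... ms" headers are printed, which matches the behaviour reported above.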

I am very new to Spark, and any help is highly appreciated.

Thank you so much
Biplob Biswas

Sent from the Apache Spark User List mailing list archive at
