spark-user mailing list archives

From Reddy Raja <areddyr...@gmail.com>
Subject Spark Streaming
Date Wed, 24 Sep 2014 10:35:23 GMT
Given the program below, I have the following questions:

    val sparkConf = new SparkConf().setAppName("NetworkWordCount")
    sparkConf.set("spark.master", "local[2]")
    val ssc = new StreamingContext(sparkConf, Seconds(10))
    val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts.print()
    ssc.start()
    ssc.awaitTermination()


Q1) How do I know which part of the program is executed every 10 seconds?

   My requirement is that I want to execute a method and insert data into
Cassandra every time a set of messages comes in.

Q2) Is there a function I can pass so that it gets executed when the next
set of messages comes in?

Q3) If I put a method call between the following lines:

    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts.print()
    my_method(stream rdd)..
    ssc.start()

   the method is not executed for each set of messages.


Can someone answer these questions?

--Reddy
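
For context on Q1 to Q3, here is a minimal sketch of one common way to run a function on every batch: registering it with `DStream.foreachRDD` before `ssc.start()`. Code written between the transformations and `ssc.start()` runs once while the streaming graph is being defined; only registered output operations such as `print()` and `foreachRDD` are invoked for each 10-second batch. The names `PerBatchSketch`, `saveBatch`, and `formatRow` are hypothetical and not from the original post, and the `println` is only a placeholder for a Cassandra write, which would depend on whichever client library is in use.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext, Time}
// Brings in the pair-DStream implicits (reduceByKey) on Spark 1.x.
import org.apache.spark.streaming.StreamingContext._

object PerBatchSketch {
  // Hypothetical helper: formats one record; a real job would build an insert statement.
  def formatRow(batchMs: Long, word: String, count: Int): String =
    s"$batchMs,$word,$count"

  // Hypothetical stand-in for my_method: called once per batch with that batch's RDD.
  def saveBatch(rdd: RDD[(String, Int)], time: Time): Unit = {
    val batchMs = time.milliseconds
    rdd.foreachPartition { rows =>
      // Runs on the executors. A real implementation would open one
      // database connection per partition here (assumption, not shown).
      rows.foreach { case (word, count) => println(formatRow(batchMs, word, count)) }
    }
  }

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("NetworkWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(sparkConf, Seconds(10))
    val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
    val wordCounts = lines.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _)

    wordCounts.print()
    // Registered once here; invoked for every batch interval after start().
    wordCounts.foreachRDD((rdd: RDD[(String, Int)], time: Time) => saveBatch(rdd, time))

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Doing the write inside `foreachPartition` rather than per record keeps connection setup to once per partition, which is the usual reason for structuring the callback this way.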
