spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Carol McDonald <cmcdon...@maprtech.com>
Subject correct use of DStream foreachRDD
Date Fri, 28 Aug 2015 14:29:31 GMT
I would like to make sure  that I am using the DStream  foreachRDD
operation correctly. I would like to read from a DStream transform the
input and write to HBase.  The code below works , but I became confused
when I read "Note that the function *func* is executed in the driver
process" ?


    val lines = ssc.textFileStream("/stream")

    lines.foreachRDD { rdd =>
      // parse the line of data into sensor object
      val sensorRDD = rdd.map(Sensor.parseSensor)

      // convert sensor data to put object and write to HBase table column
family data
      new PairRDDFunctions(sensorRDD.
                  map(Sensor.convertToPut)).
                  saveAsHadoopDataset(jobConfig)

    }

Mime
View raw message