spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Spark Streaming RDD transformation
Date Thu, 26 Jun 2014 19:26:44 GMT
If you want to transform an RDD into a Map, I assume you have an RDD
of pairs. In that case, collectAsMap() collects the RDD to the driver
and builds a Map from it.
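In plain-Python terms (this is a stand-in for the Spark call, not actual
Spark code), collectAsMap() on a pair RDD behaves roughly like building a
dict from a collection of (key, value) tuples:

```python
# Stand-in for what collectAsMap() does on a pair RDD: all (key, value)
# pairs are gathered to the driver and turned into a map. When a key
# appears more than once, only one of its values is kept.
pairs = [("a", 1), ("b", 2), ("a", 3)]  # plays the role of an RDD of pairs
as_map = dict(pairs)
print(as_map)  # {'a': 3, 'b': 2}
```

Note that this pulls the whole RDD onto the driver, so it only makes
sense when the result fits in driver memory.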

Do you mean that you want to update a Map object using data in each
RDD? You would use foreachRDD() in that case. Then you can use
RDD.foreach to do something like update a global Map object.
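A sketch of that pattern, again with plain Python standing in for the
streaming API (each "batch" below plays the role of one RDD in the
DStream, and the names are illustrative, not Spark's):

```python
# Stand-in for the foreachRDD pattern: merge each batch's pairs into a
# global map held on the driver.
global_map = {}

def update_global_map(batch_pairs):
    # In Spark this logic would run inside foreachRDD. Note that to
    # mutate driver-side state like this you'd bring the data to the
    # driver first (e.g. rdd.collect()), since closures passed to
    # RDD.foreach run on the executors in a real cluster.
    for key, value in batch_pairs:
        global_map[key] = value

for batch in [[("a", 1)], [("b", 2), ("a", 3)]]:
    update_global_map(batch)

print(global_map)  # {'a': 3, 'b': 2}
```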

Not sure if this is what you mean, but SparkContext.parallelize() can
be used to make an RDD from a List or Array of objects. That's not
really related to streaming or updating a Map, though.
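For the "return the map as a single RDD" part of the question, the idea
behind sc.parallelize([theMap]) is just to wrap the Map in a one-element
collection. A plain-Python stand-in (names are illustrative):

```python
# Stand-in for sc.parallelize([the_map]): wrapping one Map in a
# one-element collection, so downstream code sees an "RDD" containing
# a single Map.
the_map = {"a": 1, "b": 2}
one_element = [the_map]   # plays the role of the one-element RDD
print(len(one_element))   # 1
print(one_element[0])     # {'a': 1, 'b': 2}
```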

On Thu, Jun 26, 2014 at 1:40 PM, Bill Jay <bill.jaypeterson@gmail.com> wrote:
> Hi all,
>
> I am currently working on a project that requires transforming each RDD in a
> DStream into a Map. Basically, when we get a list of data in each batch, we
> would like to update the global map. I would like to return the map as a
> single RDD.
>
> I am currently trying to use the function transform. The output will be an
> RDD of the updated map after each batch. How can I create an RDD from
> another data structure such as an Int, Map, etc.? Thanks!
>
> Bill
