spark-user mailing list archives

From Harold Nguyen <har...@nexgate.com>
Subject Re: Manipulating RDDs within a DStream
Date Thu, 30 Oct 2014 16:59:37 GMT
Hi,

Sorry, there's a typo there:

val arr = rdd.toArray


Harold

On Thu, Oct 30, 2014 at 9:58 AM, Harold Nguyen <harold@nexgate.com> wrote:

> Hi all,
>
> I'd like to be able to modify values in a DStream, and then send it off to
> an external source like Cassandra, but I keep getting Serialization errors
> and am not sure how to use the correct design pattern. I was wondering if
> you could help me.
>
> I'd like to be able to do the following:
>
>  wordCounts.foreachRDD( rdd => {
>
>        val arr = record.toArray
>        ...
>
> })
>
> I would like to use "arr" to send data back to Cassandra, for instance
> like this:
>
> val collection = sc.parallelize(Seq(a.head._1, a.head._2))
> collection.saveToCassandra(....)
>
> Or something like that, but as you know, I can't do this within
> "foreachRDD", only at the driver level. How do I use the "arr" variable
> to do something like that?
>
> Thanks for any help,
>
> Harold
>
>
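For what it's worth, the serialization errors described above usually come from capturing the driver-side SparkContext (`sc`) inside the `foreachRDD` closure. Below is a minimal sketch of the pattern, assuming the spark-cassandra-connector is on the classpath and using a hypothetical keyspace/table (`test`/`words`) and a `DStream[(String, Int)]` like `wordCounts`:

```scala
import com.datastax.spark.connector._
import org.apache.spark.streaming.dstream.DStream

// "test" and "words" are hypothetical keyspace/table names for illustration.
def sendToCassandra(wordCounts: DStream[(String, Int)]): Unit = {
  wordCounts.foreachRDD { rdd =>
    // The function passed to foreachRDD executes on the driver, so it is
    // safe to collect() here and inspect the results locally.
    val arr = rdd.collect()

    if (arr.nonEmpty) {
      // Option 1: build a new RDD from the collected values. Use the
      // SparkContext attached to the incoming RDD rather than capturing
      // an outer `sc` variable, which is a common cause of
      // NotSerializableException.
      val collection = rdd.sparkContext.parallelize(Seq(arr.head))
      collection.saveToCassandra("test", "words", SomeColumns("word", "count"))
    }

    // Option 2 (usually preferable): skip the round trip through the
    // driver entirely and write the distributed RDD directly.
    rdd.saveToCassandra("test", "words", SomeColumns("word", "count"))
  }
}
```

Option 2 avoids pulling the whole RDD onto the driver, so it scales with the data; Option 1 is only reasonable when the per-batch result is known to be small.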
