spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: NonSerializable Exception in foreachRDD
Date Fri, 31 Oct 2014 06:58:40 GMT
Are you expecting something like this?


val data = ssc.textFileStream("hdfs://akhldz:9000/input/")
val rdd = ssc.sparkContext.parallelize(Seq("foo", "bar"))

data.foreachRDD(x => {
  val new_rdd = x.union(rdd)
  new_rdd.saveAsTextFile("hdfs://akhldz:9000/output/")
})
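[Archive note: the NotSerializableException in the original question is the classic symptom of a closure capturing something Spark cannot ship to executors, such as the SparkContext or StreamingContext itself. A minimal, Spark-free sketch of the same failure mode using plain Java serialization; the names `DriverOnlyContext`, `parallelizeStub`, and `ClosureDemo` are invented for illustration:]

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for SparkContext: a driver-side object that is NOT Serializable.
class DriverOnlyContext {
  def parallelizeStub(xs: Seq[String]): Seq[String] = xs
}

object ClosureDemo {
  // True if obj survives Java serialization, false on NotSerializableException.
  def canSerialize(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val ctx = new DriverOnlyContext

    // Captures ctx, so the whole closure becomes unserializable --
    // this is what Spark complains about when shipping the task.
    val bad: String => Seq[String] = s => ctx.parallelizeStub(Seq(s))

    // Captures only a serializable Seq -- fine.
    val data = Seq("foo", "bar")
    val good: String => Seq[String] = s => data :+ s

    println(s"bad:  ${canSerialize(bad)}")
    println(s"good: ${canSerialize(good)}")
  }
}
```

[In real Spark code the equivalent fix is the one shown above: build the RDD on the driver via `ssc.sparkContext` outside the shipped closure, rather than capturing a context inside one.]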

Thanks
Best Regards

On Fri, Oct 31, 2014 at 10:46 AM, Tobias Pfeiffer <tgp@preferred.jp> wrote:

> Harold,
>
> just mentioning it in case you run into it: If you are in a separate
> thread, there are apparently stricter limits to what you can and cannot
> serialize:
>
> import scala.concurrent.Future
> import scala.concurrent.ExecutionContext.Implicits.global
>
> val someVal = ... // some value from the enclosing scope
> Future {
>   // be very careful with defining RDD operations using someVal here
>   val myLocalVal = someVal
>   // use myLocalVal instead
> }
>
> On Thu, Oct 30, 2014 at 4:55 PM, Harold Nguyen <harold@nexgate.com> wrote:
>
>> In Spark Streaming, when I do "foreachRDD" on my DStreams, I get a
>> NonSerializable exception when I try to do something like:
>>
>> DStream.foreachRDD( rdd => {
>>   var sc.parallelize(Seq(("test", "blah")))
>> })
>>
>
> Is this the code you are actually using? "var sc.parallelize(...)" doesn't
> really look like valid Scala to me.
>
> Tobias
>
>
>
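[Archive note: Tobias's "copy to a local val" advice can be demonstrated without Spark. A lambda that reads a field captures the whole enclosing instance, while a lambda over a local copy captures only the value itself. The names `Enclosing` and `LocalValDemo` below are invented for illustration:]

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// NOT Serializable -- think of a class holding a SparkContext,
// or the scope surrounding a Future.
class Enclosing(val tag: String) {
  // Reads the field `tag`, so the lambda captures `this`: the whole
  // non-serializable Enclosing instance goes along for the ride.
  def badClosure: String => String = s => s + tag

  // Copy the field into a local val first; the lambda now captures
  // only the String, which serializes fine.
  def goodClosure: String => String = {
    val localTag = tag
    s => s + localTag
  }
}

object LocalValDemo {
  // True if obj survives Java serialization, false on NotSerializableException.
  def canSerialize(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val e = new Enclosing("-suffix")
    println(s"field access:   ${canSerialize(e.badClosure)}")
    println(s"local val copy: ${canSerialize(e.goodClosure)}")
  }
}
```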
