spark-user mailing list archives

From ankurdave <ankurd...@gmail.com>
Subject Re: Variables outside of mapPartitions scope
Date Fri, 09 May 2014 19:14:37 GMT
In general, you can find out exactly what's not serializable by adding
-Dsun.io.serialization.extendedDebugInfo=true to SPARK_JAVA_OPTS.

The problem is often a "this" reference to the enclosing class, which the
closure captures. A general workaround is to move the mapPartitions call into
a method on a companion object (Scala's equivalent of a static method), where
there is no "this" reference. That is, change this:
class A {
  def f() = rdd.mapPartitions(iter => ...)
}
into this:
class A {
  def f() = A.helper(rdd)
}

object A {
  def helper(rdd: RDD[...]) = rdd.mapPartitions(iter => ...)
}
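
To see the effect without a cluster, here is a minimal sketch that uses plain
Java serialization in place of Spark. Everything in it is hypothetical and only
for illustration: the offset field, the capturing and viaHelper methods, and
the ClosureDemo object are not from the original code, and A is assumed not to
be Serializable. The closure built inside the class reads a field, so it
captures "this" and should fail to serialize; the one built in the companion
object captures only an Int and should serialize.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for an enclosing class that is not Serializable.
class A(val offset: Int) {
  // Reads the field `offset`, so the closure captures `this` (an A).
  def capturing: Int => Int = x => x + offset

  // Delegates to the companion object; only the Int value is passed along.
  def viaHelper: Int => Int = A.helper(offset)
}

object A {
  // No enclosing instance is referenced here, so the closure
  // captures only the `offset` parameter.
  def helper(offset: Int): Int => Int = x => x + offset
}

object ClosureDemo {
  // Returns true if Java serialization of `obj` succeeds,
  // false if something reachable from it is not serializable.
  def serializes(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val a = new A(1)
    println(s"closure capturing this serializes: ${serializes(a.capturing)}")
    println(s"closure from companion serializes: ${serializes(a.viaHelper)}")
  }
}
```

The same reasoning applies to the function passed to mapPartitions: whatever
the closure references has to be serializable, so keeping the enclosing
instance out of it avoids the problem.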




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Variables-outside-of-mapPartitions-scope-tp5517p5527.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.