spark-user mailing list archives

From Yong Zhang <>
Subject Re: org.apache.spark.SparkException: Task not serializable
Date Mon, 13 Mar 2017 13:26:17 GMT
In fact, I would suggest a different way to handle the original problem.

The Java Function in the original example doesn't use any instance fields or methods,
so serializing the whole class is overkill.

Instead, you can (and should) make the Function static. That works for the logic this function
implements, and it is a better solution than marking the whole class serializable.

The whole issue is that the Function is not static, even though it doesn't use any instance
fields or other methods. When Spark ships a non-static function, it has to wrap the whole
enclosing class into the closure that is sent over the network, and in this case that requires
the whole class to be serializable.
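
To illustrate the point above without needing a Spark cluster, here is a minimal sketch using plain Java serialization. The names `ClosureDemo`, `SerFunction`, `AlwaysTrue` and `canSerialize` are made up for this example; `SerFunction` is only a stand-in for Spark's `Function` interface, which is likewise `Serializable`. The assumption is that Java serialization behaves here the way Spark's closure check does: the anonymous inner class carries a hidden reference to its non-serializable enclosing instance, while the static nested class does not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical driver class; note that it is NOT Serializable.
public class ClosureDemo {

    // Stand-in for Spark's Function interface (an assumption for this sketch).
    interface SerFunction<T, R> extends Serializable {
        R call(T t);
    }

    // Anonymous inner class created in an instance method:
    // it keeps a hidden reference to the enclosing ClosureDemo instance.
    SerFunction<String, Boolean> anonymousFilter() {
        return new SerFunction<String, Boolean>() {
            public Boolean call(String s) {
                return true;
            }
        };
    }

    // Static nested class: no reference to any ClosureDemo instance.
    static class AlwaysTrue implements SerFunction<String, Boolean> {
        public Boolean call(String s) {
            return true;
        }
    }

    // Returns true if the object can be written with Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        ClosureDemo demo = new ClosureDemo();
        System.out.println("anonymous inner class serializable: "
            + canSerialize(demo.anonymousFilter()));
        System.out.println("static nested class serializable: "
            + canSerialize(new AlwaysTrue()));
    }
}
```

In a Spark job the same idea would mean passing something like `new AlwaysTrue()` to `filter` instead of an anonymous class defined inside an instance method, so that only the small function object needs to be serialized.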


From: 颜发才(Yan Facai) <>
Sent: Saturday, March 11, 2017 6:48 AM
To: 萝卜丝炒饭
Cc: Mina Aslani; Ankur Srivastava;
Subject: Re: org.apache.spark.SparkException: Task not serializable

For Scala,
make your class Serializable, like this:
class YourClass extends Serializable {

On Sat, Mar 11, 2017 at 3:51 PM, 萝卜丝炒饭 <<>>
Hi Mina,

Can you paste your new code here, please?
I ran into this issue too, but I do not understand Ankur's suggestion.


From: "Mina Aslani"<<>>
Date: 2017/3/7 05:32:10
To: "Ankur Srivastava"<<>>;
Cc: "<>"<<>>;
Subject: Re: org.apache.spark.SparkException: Task not serializable

Thank you Ankur for the quick response, really appreciate it! Making the class serializable
resolved the exception!

Best regards,

On Mon, Mar 6, 2017 at 4:20 PM, Ankur Srivastava <<>>
The fix for this is to make your class Serializable. The reason is that the closures you have defined
in the class need to be serialized and copied over to all executor nodes.

Hope this helps.


On Mon, Mar 6, 2017 at 1:06 PM, Mina Aslani <<>>


I am trying to get started with Spark and count the number of lines of a text file on my Mac; however,
I get

org.apache.spark.SparkException: Task not serializable error on

JavaRDD<String> logData = javaCtx.textFile(file);

Please see below for the sample of code and the stackTrace.

Any idea why this error is thrown?

Best regards,


System.out.println("Creating Spark Configuration");
SparkConf javaConf = new SparkConf();
javaConf.setAppName("My First Spark Java Application");
javaConf.setMaster("PATH to my spark");
System.out.println("Creating Spark Context");
JavaSparkContext javaCtx = new JavaSparkContext(javaConf);
System.out.println("Loading the Dataset and will further process it");
String file = "file:///file.txt";
JavaRDD<String> logData = javaCtx.textFile(file);

long numLines = logData.filter(new Function<String, Boolean>() {
   public Boolean call(String s) {
      return true;
   }
}).count();

System.out.println("Number of Lines in the Dataset "+numLines);


Exception in thread "main" org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:387)
at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:386)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.filter(RDD.scala:386)
