That is, unfortunately, how the Scala compiler captures (and defines) closures: the closure holds a reference to the whole object, not just the field it reads. Nothing is really final in the JVM. You can always use reflection or Unsafe to modify the values of fields.
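To illustrate the point about finality, here is a minimal sketch (not code from this thread) that changes a Scala val through reflection. The names FinalFieldDemo and setValue are made up for the example, and it assumes a JVM that still allows writing to final instance fields after setAccessible(true); newer JDK releases may warn about this or restrict it.

```scala
// Same shape as the class in the thread: an "immutable" val, not Serializable.
class A(val value: Int)

object FinalFieldDemo {
  // Hypothetical helper: overwrites the private final field backing `value`.
  def setValue(a: A, newValue: Int): Unit = {
    val field = classOf[A].getDeclaredField("value")
    field.setAccessible(true) // lifts the access check for this Field object
    field.setInt(a, newValue)
  }

  def main(args: Array[String]): Unit = {
    val a = new A(2)
    setValue(a, 42)
    println(a.value) // reads the field through the normal accessor
  }
}
```

Since the compiler cannot prove that the field never changes, capturing only the Int would not be a safe general rule for it to apply.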

But does the “notSer” object have to be serialized?


The object is immutable by the definition of A, so the only thing that needs to be serialized is the (immutable) Int value? And Ints are serializable?


Yes, it is.

You can define a udf like that.

Basically, it's a udf of type Int => Int whose closure contains a non-serializable object.

The latter should cause a "Task not serializable" exception.
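The capture behaviour can be checked without Spark. The sketch below is an assumption-laden illustration, not code from this thread: ClosureCaptureDemo, isSerializable, capturesObject and capturesInt are hypothetical names, and it uses plain Java serialization, which Spark's default closure serializer is built on. It compares a closure that reads notSer.value with one that copies the Int into a local val first.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureCaptureDemo {
  class A(val value: Int) // not Serializable, as in the thread

  // Returns true if Java serialization accepts the object graph.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  // The closure refers to notSer, so the whole A instance is captured.
  def capturesObject(): Int => Int = {
    val notSer = new A(2)
    (a: Int) => a + notSer.value
  }

  // The Int is copied out first, so only an Int is captured.
  def capturesInt(): Int => Int = {
    val notSer = new A(2)
    val v = notSer.value
    (a: Int) => a + v
  }

  def main(args: Array[String]): Unit = {
    println(isSerializable(capturesObject())) // false
    println(isSerializable(capturesInt()))    // true
  }
}
```

Copying the value into a local val before defining the udf is the usual workaround when the enclosing object cannot be made Serializable.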




Hello Hao Ren,


Doesn't the code...

val add = udf {
  (a: Int) => a + notSer.value
}

...mean a UDF of type Int => Int?





I am playing with Spark 2.0.

What I tried to test is:

Create a UDF that contains a non-serializable object.

What I expected is that when this UDF is called while materializing the DataFrame in which the UDF is used in "select", a "Task not serializable" exception would be thrown.

It also depends on which "action" is called on that DataFrame.

Here is the code to reproduce the problem:



import org.apache.spark.sql.SparkSession

object DataFrameSerDeTest extends App {

  class A(val value: Int) // It is not serializable

  def run() = {
    // Assumed builder settings (app name, local master); adjust for your cluster.
    val spark = SparkSession
      .builder()
      .appName("DataFrameSerDeTest")
      .master("local[*]")
      .getOrCreate()

    import org.apache.spark.sql.functions.udf
    import spark.sqlContext.implicits._

    val notSer = new A(2)
    val add = udf {
      (a: Int) => a + notSer.value
    }

    val df = spark.createDataFrame(Seq(
      (1, 2),
      (2, 2),
      (3, 2),
      (4, 2)
    )).toDF("key", "value")
      .select($"key", add($"value").as("added")) // It should not work because the udf contains a non-serializable object, but it works

    df.filter($"key" === 2).show() // Fails, as expected (org.apache.spark.SparkException: Task not serializable)
  }

  run()
}







I also tried collect(), count(), first() and limit(). All of them worked without a "Task not serializable" exception.

It seems that only filter() throws the exception. (Feature or bug?)


Any ideas? Or did I just mess things up?

