spark-dev mailing list archives

From Koert Kuipers <ko...@tresata.com>
Subject Re: internal unit tests failing against the latest spark master
Date Wed, 12 Apr 2017 22:11:49 GMT
i confirmed that an Encoder[Array[Int]] is no longer serializable, while with
my spark build from march 7 it still was.
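
for reference, a minimal check along these lines reproduces it (just a sketch, the object name is made up and this is not our actual test code):

import java.io.{ByteArrayOutputStream, ObjectOutputStream}

import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

object EncoderSerializationCheck {
  def main(args: Array[String]): Unit = {
    // derive the encoder the same way Dataset operations do, via a TypeTag
    val encoder = ExpressionEncoder[Array[Int]]()

    // plain java serialization, which is roughly what task serialization
    // boils down to once the encoder is captured; on current master this
    // throws java.io.NotSerializableException for
    // scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    out.writeObject(encoder)
    out.close()
    println("encoder serialized fine")
  }
}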

i believe the issue was introduced by commit
295747e59739ee8a697ac3eba485d3439e4a04c3, and i sent wenchen an email about it.
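
the aggregator that trips this for us (mentioned in the thread below) is roughly of the following shape. simplified sketch with made-up names, not our actual code; the point is just the Array[Int] buffer encoder:

import org.apache.spark.sql.Encoder
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator

// counts inputs into a fixed number of buckets, using an Array[Int] buffer
object BucketCounts extends Aggregator[Int, Array[Int], Seq[Int]] {
  private val numBuckets = 4

  def zero: Array[Int] = new Array[Int](numBuckets)

  def reduce(buf: Array[Int], value: Int): Array[Int] = {
    buf(((value % numBuckets) + numBuckets) % numBuckets) += 1
    buf
  }

  def merge(b1: Array[Int], b2: Array[Int]): Array[Int] = {
    var i = 0
    while (i < numBuckets) { b1(i) += b2(i); i += 1 }
    b1
  }

  def finish(buf: Array[Int]): Seq[Int] = buf.toSeq

  // this is the Encoder[Array[Int]] that is no longer serializable
  def bufferEncoder: Encoder[Array[Int]] = ExpressionEncoder[Array[Int]]()

  def outputEncoder: Encoder[Seq[Int]] = ExpressionEncoder[Seq[Int]]()
}

something like ds.select(BucketCounts.toColumn) on a Dataset[Int] is then enough to force the buffer encoder into task serialization.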

On Wed, Apr 12, 2017 at 4:31 PM, Koert Kuipers <koert@tresata.com> wrote:

> i believe the error is related to an org.apache.spark.sql.expressions.Aggregator
> where the buffer type (BUF) is Array[Int]
>
> On Wed, Apr 12, 2017 at 4:19 PM, Koert Kuipers <koert@tresata.com> wrote:
>
>> hey all,
>> today i tried upgrading the spark version we use internally by creating a
>> new internal release from the spark master branch. the last time i did this
>> was march 7.
>>
>> with this updated spark i am seeing some serialization errors in the unit
>> tests for our own libraries. it looks like a scala reflection type that is
>> not serializable is getting pulled into the serialization of the encoder?
>> see the trace below.
>> best,
>> koert
>>
>> [info]   org.apache.spark.SparkException: Task not serializable
>> [info]   at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
>> [info]   at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
>> [info]   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
>> [info]   at org.apache.spark.SparkContext.clean(SparkContext.scala:2284)
>> [info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2058)
>> ...
>> [info] Serialization stack:
>> [info]     - object not serializable (class: scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq, value: BTS(Int,AnyVal,Any))
>> [info]     - field (class: scala.reflect.internal.Types$TypeRef, name: baseTypeSeqCache, type: class scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq)
>> [info]     - object (class scala.reflect.internal.Types$ClassNoArgsTypeRef, Int)
>> [info]     - field (class: org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, name: elementType$2, type: class scala.reflect.api.Types$TypeApi)
>> [info]     - object (class org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, <function1>)
>> [info]     - field (class: org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, name: function, type: interface scala.Function1)
>> [info]     - object (class org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, unresolvedmapobjects(<function1>, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)))
>> [info]     - field (class: org.apache.spark.sql.catalyst.expressions.objects.WrapOption, name: child, type: class org.apache.spark.sql.catalyst.expressions.Expression)
>> [info]     - object (class org.apache.spark.sql.catalyst.expressions.objects.WrapOption, wrapoption(unresolvedmapobjects(<function1>, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq)))
>> [info]     - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
>> [info]     - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@69040c85)
>> [info]     - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
>> [info]     - object (class scala.collection.immutable.$colon$colon, List(wrapoption(unresolvedmapobjects(<function1>, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq))))
>> [info]     - field (class: org.apache.spark.sql.catalyst.expressions.objects.NewInstance, name: arguments, type: interface scala.collection.Seq)
>> [info]     - object (class org.apache.spark.sql.catalyst.expressions.objects.NewInstance, newInstance(class scala.Tuple1))
>> [info]     - field (class: org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, name: deserializer, type: class org.apache.spark.sql.catalyst.expressions.Expression)
>> [info]     - object (class org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, class[_1[0]: array<int>])
>> ...
>>
>>
>
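
ps: for anyone who wants a self-contained repro without our internal libraries, i would expect a job along these lines (hypothetical sketch, not our actual code) to hit the same Task not serializable, since Dataset[Array[Int]] forces derivation of the same encoder:

import org.apache.spark.sql.SparkSession

object ArrayEncoderRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("array-encoder-repro")
      .getOrCreate()
    import spark.implicits._

    // toDS derives Encoder[Array[Int]] from spark.implicits; the action
    // then has to ship the encoder's deserializer expression with the job
    val ds = Seq(Array(1, 2, 3), Array(4, 5)).toDS()
    ds.map(_.sum).collect().foreach(println)

    spark.stop()
  }
}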
