spark-dev mailing list archives

From Eugene Morozov <fathers...@list.ru>
Subject Re: KryoSerializer gives class cast exception
Date Mon, 20 Jul 2015 10:53:57 GMT
Josh, thanks for the reply.

So it looks like, despite the progress, there is no other way than to fork and fix Chill
itself. It indeed doesn’t compile against Kryo 2.24.0, but that wasn’t hard to fix (it looks
like I simply guessed the right code), although there are test failures now.

On 17 Jul 2015, at 18:15, Josh Rosen <rosenville@gmail.com> wrote:

> We've run into other problems caused by our old Kryo versions. I agree that the Chill dependency is one of the main blockers to upgrading Kryo, but I don't think that it's insurmountable: if necessary, we could just publish our own forked version of Chill under our own namespace, similar to what we used to do with Pyrolite.
> 
> A bigger concern, perhaps, is dependency conflicts with user-specified Kryo versions.
> 
> See https://github.com/apache/spark/pull/6361 and https://issues.apache.org/jira/browse/SPARK-7708 for some more previous discussions RE: Kryo upgrade.
> 
> Anyhow, I'm not sure what the right solution is yet, but just wanted to link to some previous context / discussions.
> 
> - Josh 
> 
> On Thu, Jul 16, 2015 at 7:57 AM, Eugene Morozov <fathersson@list.ru> wrote:
> Hi, some time ago we found that it’s better to use the Kryo serializer instead of the Java one.
> So we turned it on and use it everywhere.
> 
> I have pretty complex objects, which I can’t change. Previously my algorithm built these objects and then stored them in external storage, so there was no need to reshuffle partitions. Now it seems I have to reshuffle them, but I’m stuck with a ClassCastException. I investigated a little, and it seems to me that KryoSerializer does not clear its state at some point, so it tries to use StringSerializer for my non-String object. My objects are pretty complex, so it would be pretty hard to make them serializable.
> 
> Caused by: java.lang.ClassCastException: com.company.metadata.model.cleanse.CleanseInfoSequence cannot be cast to java.lang.String
> 	at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.write(DefaultSerializers.java:146)
> 	at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:549)
> 	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:68)
> 	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:18)
> 	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:501)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
> 	... 71 more
> 
> I’ve found this state issue in the Kryo JIRA, and it was fixed after 2.21 (the current Kryo version in Spark). But Spark cannot upgrade because of Chill, and Chill cannot be upgraded because of some dependencies on its side. So Spark is sort of stuck with Kryo 2.21.
> 
> My own thoughts on how I could work around this:
> 1. Rewrite the algorithm so that my objects don’t need to be reshuffled. But at some point it will be required.
> 2. Make my objects implement Serializable and be stuck with Java serialization forever.
> 3. Inside Kryo, my object looks like an ArrayList containing my object, so I’m not sure it’s possible to register my class with a custom serializer in Kryo.
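
For workaround 2, when the class itself cannot be changed, one possible shape is a small Serializable holder that writes the wrapped object's state by hand. The following is only a minimal sketch using the JDK alone: LegacySequence is a hypothetical stand-in for an unmodifiable class such as CleanseInfoSequence (its real fields are not shown in this thread), and LegacySequenceHolder and roundTrip are illustrative names, not part of Spark, Kryo or Chill.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SerializableWrapperSketch {

    // Hypothetical stand-in for a third-party class that cannot be modified
    // and does not implement Serializable.
    static final class LegacySequence {
        private final String name;
        private final List<Integer> steps;

        LegacySequence(String name, List<Integer> steps) {
            this.name = name;
            this.steps = new ArrayList<>(steps);
        }

        String getName() { return name; }
        List<Integer> getSteps() { return steps; }
    }

    // Serializable holder: the wrapped value is transient, and its state is
    // written and rebuilt manually through the public API of LegacySequence.
    static final class LegacySequenceHolder implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient LegacySequence value;

        LegacySequenceHolder(LegacySequence value) { this.value = value; }

        LegacySequence get() { return value; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeUTF(value.getName());
            out.writeInt(value.getSteps().size());
            for (int step : value.getSteps()) {
                out.writeInt(step);
            }
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            String name = in.readUTF();
            int count = in.readInt();
            List<Integer> steps = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                steps.add(in.readInt());
            }
            value = new LegacySequence(name, steps);
        }
    }

    // Serializes the holder to bytes and reads it back, roughly what Java
    // serialization would do when the object crosses a shuffle boundary.
    static LegacySequenceHolder roundTrip(LegacySequenceHolder holder) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(holder);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LegacySequenceHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        List<Integer> steps = new ArrayList<>();
        steps.add(1);
        steps.add(2);
        LegacySequence original = new LegacySequence("cleanse", steps);
        LegacySequence copy = roundTrip(new LegacySequenceHolder(original)).get();
        System.out.println(copy.getName() + " " + copy.getSteps());
    }
}
```

This only helps if the holder is actually written with Java serialization. With Kryo still enabled as the Spark serializer, the holder would go through Kryo like any other class, so the sketch shows the cost of option 2 rather than a fix for the ClassCastException itself.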
> 
> Any advice would be highly appreciated. 
> Thanks.
> --
> Eugene Morozov
> fathersson@list.ru

Eugene Morozov
fathersson@list.ru




