spark-user mailing list archives

From Sujit Pal <sujitatgt...@gmail.com>
Subject Re: How to distribute non-serializable object in transform task or broadcast ?
Date Fri, 07 Aug 2015 14:31:13 GMT
Hi Hao,

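I think sc.broadcast is worth a try, though the value you pass in still has
to be serializable (with Java serialization or Kryo) to reach the executors,
so it may not work for class C as it stands. According to the scaladocs the
Broadcast class itself is Serializable and it wraps your object, allowing
you to get it back from the Broadcast object using value.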

I'm not 100% sure, though, since I haven't tried broadcasting custom objects,
but it may be worth trying unless you already have and it failed.
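
If broadcasting the object directly fails with a serialization error, one
common workaround is to ship only the (serializable) constructor arguments
and rebuild the object lazily on each executor. It also sidesteps the
deserialization problem you hit: Java serialization needs the first
non-serializable superclass to have an accessible parameter-less constructor,
and a wrapper that extends Object always has one. Below is a rough sketch in
plain Java, with no Spark dependency, so the serialization behaviour can be
checked in isolation. The names C, CHolder, roundTrip and the single String
constructor argument are all made up for illustration; your real class C
will need whatever arguments its constructor actually takes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class LazyHolderSketch {

    // Hypothetical stand-in for the third-party class C: it is not
    // Serializable and has no parameter-less constructor.
    public static class C {
        private final String config;

        public C(String config) {
            this.config = config;
        }

        public String describe() {
            return "C(" + config + ")";
        }
    }

    // Serializable wrapper (an assumption, not part of any library).
    // Only the constructor argument is written to the stream; the C
    // instance is transient and rebuilt on first use after deserialization.
    public static class CHolder implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String config;
        private transient C instance;

        public CHolder(String config) {
            this.config = config;
        }

        public synchronized C get() {
            if (instance == null) {
                instance = new C(config);
            }
            return instance;
        }
    }

    // Round-trips an object through Java serialization, which is roughly
    // what happens when a value is shipped from the driver to an executor.
    public static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        CHolder holder = new CHolder("some-setting");
        CHolder copy = roundTrip(holder);
        // The copy rebuilds its own C instance on first access.
        System.out.println(copy.get().describe());
    }
}
```

In a Spark job you would then pass the holder to sc.broadcast (or just
reference it from the closure) and call get() inside the transformation,
ideally once per partition via mapPartitions, so each executor builds its
own instance of C. This only helps if C can be reconstructed from
serializable arguments; if it carries state that cannot be rebuilt, this
approach will not work.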

-sujit


On Fri, Aug 7, 2015 at 2:39 AM, Hao Ren <invkrh@gmail.com> wrote:

> Is there any workaround to distribute a non-serializable object for an RDD
> transformation or a broadcast variable?
>
> Say I have an object of class C which is not serializable. Class C is in a
> jar package that I have no control over. Now I need to distribute it either
> by an RDD transformation or by broadcast.
>
> I tried to subclass class C and implement the Serializable interface. That
> works for serialization, but deserialization does not work: there is no
> parameter-less constructor for class C, so deserialization fails with an
> invalid constructor exception.
>
> I think it's a common use case. Any help is appreciated.
>
> --
> Hao Ren
>
> Data Engineer @ leboncoin
>
> Paris, France
>
