spark-user mailing list archives

From Marcelo Vanzin <van...@cloudera.com>
Subject Re: Using Spark Context as an attribute of a class cannot be used
Date Mon, 24 Nov 2014 22:12:36 GMT
On Mon, Nov 24, 2014 at 1:56 PM, aecc <alessandroaecc@gmail.com> wrote:
> I checked sqlContext; they use it in the same way I would like to use my
> class: they make the class Serializable and mark the field transient. Does
> this somehow affect the whole pipeline of data movement? I mean, will I get
> performance issues when doing this, because the class will now be serialized
> for some reason that I still don't understand?

If you want to do the same thing, your "AAA" class needs to be
serializable, and you need to mark all of its non-serializable fields
as "@transient". The only performance penalty you'll pay is the
serialization / deserialization of the "AAA" instance, which will most
likely be small compared to the actual work the task is doing.
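
As a rough illustration of the "@transient" approach, here is a minimal
sketch that does not need Spark on the classpath. It uses plain Java
serialization as a stand-in for what happens when a task closure is
shipped. The names DriverOnlyResource, TransientDemo, roundTrip, factor
and scale are hypothetical and only serve the example; DriverOnlyResource
plays the role of a non-serializable field such as a SparkContext.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for a non-serializable, driver-only object
// (for example a SparkContext). It deliberately does not extend Serializable.
class DriverOnlyResource(val name: String)

// "AAA" from the thread: the class is Serializable and the
// non-serializable field is marked @transient so it is skipped.
class AAA(@transient val ctx: DriverOnlyResource, val factor: Int) extends Serializable {
  // Uses only serializable state, so it is safe to call after deserialization.
  def scale(x: Int): Int = x * factor
}

object TransientDemo {
  // Serializes obj to a byte array and reads it back,
  // returning the copy and the serialized size in bytes.
  def roundTrip[T <: AnyRef](obj: T): (T, Int) = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    val copy = in.readObject().asInstanceOf[T]
    in.close()
    (copy, bytes.size())
  }

  def main(args: Array[String]): Unit = {
    val original = new AAA(new DriverOnlyResource("driver"), factor = 2)
    val (copy, size) = roundTrip(original)
    println(s"serialized size: $size bytes")
    println(s"transient field after round trip: ${copy.ctx}")
    println(s"scale(21) on the copy: ${copy.scale(21)}")
  }
}
```

Note that the transient field is null on the deserialized copy, so code
that runs inside a task should only touch the serializable fields.
Without the "@transient" marker, writeObject would fail with a
NotSerializableException for DriverOnlyResource.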

If your class holds a large amount of data, though, you should
consider using a broadcast variable instead.
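
For the broadcast case, a sketch along these lines may help. It assumes
Spark is on the classpath and runs against a local master; the object
name BroadcastSketch, the helper lookupAll, and the lookup table itself
are hypothetical. Only SparkConf, SparkContext, broadcast, value,
parallelize, map and collect are actual Spark API calls.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastSketch {
  // Resolves each key against a large table that is broadcast to the
  // executors instead of being serialized with every task closure.
  def lookupAll(sc: SparkContext, keys: Seq[Int], table: Map[Int, String]): Seq[String] = {
    val tableBc = sc.broadcast(table)
    sc.parallelize(keys)
      // The closure captures only the broadcast handle, not sc or the table itself.
      .map(k => tableBc.value.getOrElse(k, "?"))
      .collect()
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master with two threads, purely for illustration.
    val conf = new SparkConf().setAppName("broadcast-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val table = (1 to 100000).map(i => i -> s"value-$i").toMap
      println(lookupAll(sc, Seq(1, 2, 3), table).mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

The design point is that the large data lives in the broadcast variable,
while the class or closure that gets serialized per task stays small.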

-- 
Marcelo
