spark-user mailing list archives

From Matei Zaharia <>
Subject Re: Un-serializable 3rd-party classes (Spark, Java)
Date Wed, 18 Jun 2014 05:14:14 GMT
There are a few options:

- Kryo might be able to serialize these objects out of the box, depending on what’s inside
them. Try turning it on as described at
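For example, a minimal way to turn Kryo on is to set the standard `spark.serializer` property (shown here in spark-defaults.conf form; the same key can be set programmatically on a SparkConf):

```
# spark-defaults.conf (or conf.set(...) on a SparkConf)
spark.serializer    org.apache.spark.serializer.KryoSerializer
```

Registering your classes with Kryo is optional but recommended for performance.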

- If that doesn’t work, you can create your own “wrapper” objects that implement Serializable,
or even a subclass of FlexCompRowMatrix. No need to change the original library.
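As a sketch of the wrapper approach (using a hypothetical `ThirdPartyMatrix` stand-in rather than the real FlexCompRowMatrix, so the example is self-contained): mark the wrapped object `transient` and rebuild it from its essential state in `readObject()`.

```java
import java.io.*;

// Hypothetical stand-in for a third-party class that does not implement
// Serializable (e.g. FlexCompRowMatrix from matrix-toolkit-java).
class ThirdPartyMatrix {
    final int rows, cols;
    final double[] data; // dense storage for brevity; MTJ stores sparse rows

    ThirdPartyMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }
    void set(int r, int c, double v) { data[r * cols + c] = v; }
    double get(int r, int c)         { return data[r * cols + c]; }
}

// Serializable wrapper: the wrapped object is transient, and we serialize
// just enough state to reconstruct it on the other side.
class SerializableMatrix implements Serializable {
    private transient ThirdPartyMatrix matrix;

    SerializableMatrix(ThirdPartyMatrix m) { this.matrix = m; }
    ThirdPartyMatrix get() { return matrix; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(matrix.rows);
        out.writeInt(matrix.cols);
        out.writeObject(matrix.data);
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int rows = in.readInt(), cols = in.readInt();
        double[] data = (double[]) in.readObject();
        ThirdPartyMatrix m = new ThirdPartyMatrix(rows, cols);
        System.arraycopy(data, 0, m.data, 0, data.length);
        matrix = m;
    }

    // Helper that serializes and deserializes a wrapper through a byte array,
    // the same round trip Spark performs when shipping tasks.
    static SerializableMatrix roundTrip(SerializableMatrix w) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(w);
            oos.close();
            ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
            return (SerializableMatrix) ois.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

For a real FlexCompRowMatrix you would write out the dimensions and the nonzero entries instead of a dense array, but the shape of the wrapper is the same.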

- If the library has its own serialization functions, you could also use those inside a wrapper
object. Take a look at
for an example where we make Hadoop’s Writables serializable.
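The Writable trick looks roughly like this. The sketch below uses a hypothetical `CustomFormatPoint` type (not a real library class) that, like Hadoop’s Writables, ships its own `write()`/`readFields()` routines; the wrapper just delegates Java serialization to them.

```java
import java.io.*;

// Hypothetical third-party type that is not Serializable but provides its
// own binary I/O, in the style of Hadoop's Writable interface.
class CustomFormatPoint {
    int x, y;
    void write(DataOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }
    void readFields(DataInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }
}

// Wrapper in the spirit of Spark's SerializableWritable: Java serialization
// delegates to the library's own format. (ObjectOutputStream implements
// DataOutput, and ObjectInputStream implements DataInput.)
class SerializablePoint implements Serializable {
    private transient CustomFormatPoint point;

    SerializablePoint(CustomFormatPoint p) { point = p; }
    CustomFormatPoint value() { return point; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        point.write(out);          // library's own serialization
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        point = new CustomFormatPoint();
        point.readFields(in);      // library's own deserialization
    }

    // Round-trip helper mirroring what Spark does when it ships a task.
    static SerializablePoint roundTrip(SerializablePoint w) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(w);
            oos.close();
            ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
            return (SerializablePoint) ois.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```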


On Jun 17, 2014, at 10:11 PM, Daedalus <> wrote:

> I'm trying to use matrix-toolkit-java
> <> for an application of mine, particularly the FlexCompRowMatrix
> class (used to store sparse matrices).
> I have a class Dataframe, which contains an int array, two double values,
> and one FlexCompRowMatrix.
> When I try to run a simple Spark .foreach() on a JavaRDD created from a
> list of the above-mentioned Dataframes, I get the following error:
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due
> to stage failure: *Task not serializable: pr.matrix.sparse.FlexCompRowMatrix*
>        at
> GScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
> The FlexCompRowMatrix doesn't seem to implement Serializable. This class
> suits my purpose very well, and I would prefer not to switch over from it.
> Other than writing code to make the class serializable, and then recompiling
> the matrix-toolkit-java source, what options do I have?
> Is there any workaround for this issue?
