spark-user mailing list archives

From Daedalus <tushar.nagara...@gmail.com>
Subject Un-serializable 3rd-party classes (Spark, Java)
Date Wed, 18 Jun 2014 05:11:13 GMT
I'm trying to use matrix-toolkits-java
<https://github.com/fommil/matrix-toolkits-java/> for an application of
mine, particularly the FlexCompRowMatrix class (used to store sparse
matrices).

I have a class Dataframe, which contains an int array, two double values,
and one FlexCompRowMatrix.

When I try to run a simple Spark .foreach() on a JavaRDD created from a
list of the above-mentioned Dataframe objects, I get the following error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: no.uib.cipr.matrix.sparse.FlexCompRowMatrix
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)

FlexCompRowMatrix doesn't implement Serializable. This class suits my
purpose very well, and I would prefer not to switch away from it.

Other than writing code to make the class serializable and then recompiling
the matrix-toolkits-java source, what options do I have?
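One option I am considering, which avoids touching the library, is switching Spark's serializer to Kryo, since Kryo does not require classes to implement java.io.Serializable. A minimal sketch of the configuration (the app name is a placeholder, and I am not sure this covers every path; closures and collections passed to parallelize() may still go through Java serialization):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class KryoConfigSketch {
    public static void main(String[] args) {
        // Tell Spark to serialize RDD data with Kryo instead of
        // default Java serialization, which requires Serializable.
        SparkConf conf = new SparkConf()
                .setAppName("SparseMatrixApp")  // placeholder name
                .set("spark.serializer",
                     "org.apache.spark.serializer.KryoSerializer");
        JavaSparkContext sc = new JavaSparkContext(conf);
    }
}
```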

Is there any workaround for this issue?
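The workaround I have sketched so far is a Serializable wrapper that keeps the third-party object in a transient field and hand-writes the serialization logic from whatever raw data the class exposes. The stand-in ThirdPartyMatrix class below is hypothetical, playing the role of FlexCompRowMatrix so the sketch is self-contained; the real class would stay unmodified:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a third-party class that does NOT
// implement Serializable (the role FlexCompRowMatrix plays here).
class ThirdPartyMatrix {
    private final double[][] data;
    ThirdPartyMatrix(double[][] data) { this.data = data; }
    double[][] data() { return data; }
}

// Serializable wrapper: the non-serializable object lives in a
// transient field, and writeObject/readObject rebuild it from the
// raw values the class exposes.
class SerializableMatrixWrapper implements Serializable {
    private transient ThirdPartyMatrix matrix;

    SerializableMatrixWrapper(ThirdPartyMatrix matrix) { this.matrix = matrix; }

    ThirdPartyMatrix get() { return matrix; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        double[][] data = matrix.data();
        out.writeInt(data.length);
        for (double[] row : data) out.writeObject(row);
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int rows = in.readInt();
        double[][] data = new double[rows][];
        for (int i = 0; i < rows; i++) data[i] = (double[]) in.readObject();
        matrix = new ThirdPartyMatrix(data);
    }

    // Helpers to exercise the round trip outside Spark; checked
    // exceptions are rewrapped so callers need no throws clause.
    static byte[] toBytes(SerializableMatrixWrapper w) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(w);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    static SerializableMatrixWrapper fromBytes(byte[] bytes) {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (SerializableMatrixWrapper) ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The downside is that my Dataframe class would have to hold the wrapper rather than the matrix directly, so is there a cleaner approach?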



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Un-serializable-3rd-party-classes-Spark-Java-tp7815.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
