spark-user mailing list archives

From Han JU <>
Subject No access to pairRDDFunctions
Date Thu, 26 Sep 2013 13:25:12 GMT

I have some classes like

abstract class RawData[+K, +V](id: K, data: V) extends Tuple2[K, V](uid,

case class SomeData(id: Int, data: Data) extends RawData[Int, Data](id,

to model some input data.

Then I found out that RDD[SomeData] doesn't have access to the
PairRDDFunctions methods, like join, even though SomeData is indeed a
subclass of Tuple2.

I guess the problem comes from the invariance of T in RDD[T]: RDD[SomeData]
is not a subtype of RDD[Tuple2], so the implicit conversion to
PairRDDFunctions does not apply.
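
To illustrate the suspected cause without Spark, here is a minimal sketch. Box, Pair, PairOps and toPairOps are hypothetical stand-ins for RDD, Tuple2, PairRDDFunctions and the implicit conversion; they are not Spark's actual definitions, and the data field is simplified to a String.

```scala
import scala.language.implicitConversions

object InvarianceSketch {
  // Hypothetical stand-in for Tuple2 (covariant in both type parameters).
  class Pair[+K, +V](val key: K, val value: V)

  // A record type that is a subclass of the pair type, as in the question.
  class SomeData(val id: Int, val data: String) extends Pair[Int, String](id, data)

  // Hypothetical stand-in for RDD[T]; T is invariant here.
  class Box[T](val items: List[T]) {
    def map[U](f: T => U): Box[U] = new Box(items.map(f))
  }

  // Hypothetical stand-in for PairRDDFunctions: defined only for boxes of pairs.
  class PairOps[K, V](self: Box[Pair[K, V]]) {
    def keys: List[K] = self.items.map(_.key)
  }

  // Stand-in for the implicit conversion that adds the pair operations.
  implicit def toPairOps[K, V](box: Box[Pair[K, V]]): PairOps[K, V] =
    new PairOps(box)

  val records: Box[SomeData] =
    new Box(List(new SomeData(1, "a"), new SomeData(2, "b")))

  // records.keys
  // The line above does not compile: because Box is invariant,
  // Box[SomeData] is not a Box[Pair[Int, String]], so toPairOps is not applied.

  // Widening the element type explicitly makes the conversion applicable.
  val widened: Box[Pair[Int, String]] = records.map[Pair[Int, String]](d => d)
  val ids: List[Int] = widened.keys
}
```

In this sketch the pair operations only become available once the element type is exactly the pair type, which matches the behaviour described above.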


1) How can I work around this? How do you model data with lots of fields
that need to be joined? I don't really want to have things like "_._2._2"
but rather "" or "".

2) Is there some reason for the invariance of T in RDD? Could it be covariant?
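
For question 1, one possible direction is to keep plain case classes with named fields and build the (key, value) pairs only at the point of the join, then pattern match on the result instead of chaining "_._2._2". The sketch below uses ordinary Scala collections; User, Order and the local join helper are hypothetical and only mimic the shape of a pair join. With Spark, the keying step would presumably be a map such as rdd.map(d => (d.id, d)).

```scala
object KeyedJoinSketch {
  // Hypothetical record types with named fields.
  case class User(id: Int, name: String)
  case class Order(userId: Int, amount: Double)

  // Hypothetical local stand-in for an inner join on the key.
  def join[K, A, B](left: Seq[(K, A)], right: Seq[(K, B)]): Seq[(K, (A, B))] =
    for {
      (k, a) <- left
      (k2, b) <- right
      if k == k2
    } yield (k, (a, b))

  val users = Seq(User(1, "ann"), User(2, "bob"))
  val orders = Seq(Order(1, 9.5), Order(1, 3.0), Order(3, 7.0))

  // Key each collection only when the join needs it.
  val joined: Seq[(Int, (User, Order))] =
    join(users.map(u => (u.id, u)), orders.map(o => (o.userId, o)))

  // Pattern matching restores named access after the join.
  val report: Seq[(String, Double)] =
    joined.map { case (_, (user, order)) => (user.name, order.amount) }
}
```

The records stay ordinary case classes, so no subclass of Tuple2 is needed and the fields keep their names on both sides of the join.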


*JU Han*

Data Engineer @

+33 0619608888
