Right, the compile error is a type mismatch: the compiler tells me I cannot assign a JavaPairRDD<Partition, Body> to a JavaPairRDD<Object, Object>. It is reported on the mapToPair() call.




On 9 July 2014 19:52, Sean Owen <sowen@cloudera.com> wrote:
You forgot the compile error!


On Wed, Jul 9, 2014 at 6:14 PM, Silvina Caíno Lores <silvi.caino@gmail.com> wrote:
Hi everyone,

I am new to Spark and I'm having trouble getting my code to compile. I suspect I am misunderstanding how some of the functions work, so I would be very glad to get some insight into what could be wrong.

The problematic code is the following:

JavaRDD<Body> bodies = lines.map(l -> { Body b = new Body(); b.parse(l); return b; });

JavaPairRDD<Partition, Iterable<Body>> partitions =
                    bodies.mapToPair(b -> b.computePartitions(maxDistance)).groupByKey();


Partition and Body are defined inside the driver class. Body declares the following method:

protected Iterable<Tuple2<Partition, Body>> computePartitions (int maxDistance)

The idea is to reproduce the following schema:

The first map results in: body1, body2, ...
The mapToPair should output several of these per body: (partition_i, body1), (partition_i, body2), ...
Which are gathered by key as follows: (partition_i, (body1, body_n, ...)), (partition_i', (body2, body_n', ...)), ...
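
The schema above can be sketched with plain Java streams instead of Spark, which makes the one-to-many step easy to see. This is only an analogue under stated assumptions: bodies are reduced to integer positions, partitions to integer indices, and the class name, the fixed width of 10, and the partition rule are all hypothetical. Each body produces several (partition, body) pairs, so the pairs have to be flattened before grouping. In Spark's Java API, mapToPair expects exactly one Tuple2 per input element, while flatMapToPair accepts a function that yields several pairs per element. That difference may be what the compiler is objecting to here, although the thread does not confirm it.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class PartitionSketch {

    // Hypothetical stand-in for Body.computePartitions(maxDistance):
    // a body at a given position belongs to every partition of width 10
    // that lies within maxDistance of it (rule invented for illustration).
    static List<Map.Entry<Integer, Integer>> computePartitions(int body, int maxDistance) {
        int low = Math.floorDiv(body - maxDistance, 10);
        int high = Math.floorDiv(body + maxDistance, 10);
        List<Map.Entry<Integer, Integer>> pairs = new ArrayList<>();
        for (int p = low; p <= high; p++) {
            pairs.add(new SimpleEntry<>(p, body)); // (partition, body)
        }
        return pairs;
    }

    // One-to-many step (flatMap) followed by grouping by key.
    static Map<Integer, List<Integer>> groupByPartition(List<Integer> bodies, int maxDistance) {
        return bodies.stream()
                // flatMap, not map: each body contributes several pairs
                .flatMap(b -> computePartitions(b, maxDistance).stream())
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        TreeMap::new, // sorted keys, for stable output
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Body 9 is near the boundary, so it lands in partitions 0 and 1.
        System.out.println(groupByPartition(Arrays.asList(3, 9, 14), 2));
        // prints {0=[3, 9], 1=[9, 14]}
    }
}
```

Using map instead of flatMap in this sketch would give a stream of lists of pairs rather than a stream of pairs, and the grouping step would no longer type-check.
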

Thanks in advance.
Regards,
Silvina