spark-user mailing list archives

From Andrew Ash <>
Subject Re: Adjacency List representation in Spark
Date Wed, 17 Sep 2014 17:14:15 GMT
Hi Harsha,

You could look through the GraphX source to see the approach taken there
and get ideas for your own implementation.  I'd recommend starting at
to see the storage technique.

Why do you want to avoid using GraphX?

Good luck!

On Wed, Sep 17, 2014 at 6:43 AM, Harsha HN <> wrote:

> Hello,
> We are building an adjacency list to represent a graph. The vertices, edges,
> and weights have been extracted from HDFS files by a Spark job.
> We expect the adjacency list (a hash map) to grow beyond 20 GB.
> How can we represent this as an RDD so that it is distributed?
> Basically, we are trying to fit a HashMap (adjacency list) into a Spark RDD. Is
> there any way to do this other than GraphX?
> Thanks and Regards,
> Harsha
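
For what it's worth, here is a rough sketch of the shape the question is asking about: rather than one 20 GB hash map, the adjacency list becomes many small (vertex, neighbors) records spread over partitions by a hash of the vertex. This is plain Python with no Spark dependency and is not GraphX's actual storage layout. The sample edges, NUM_PARTITIONS, and all function names are made up for illustration. In Spark the same idea would roughly correspond to a pair RDD keyed by vertex (for example, built with groupByKey and placed with partitionBy and a HashPartitioner).

```python
from collections import defaultdict

# Hypothetical sample data: (source, destination, weight) triples,
# standing in for the records the Spark job extracted from HDFS.
edges = [
    ("a", "b", 1.0),
    ("a", "c", 2.5),
    ("b", "c", 0.5),
    ("c", "a", 4.0),
]

NUM_PARTITIONS = 2  # assumed value, chosen only for the example


def partition_of(vertex, num_partitions):
    """Map a vertex id to a partition index.

    Uses a simple deterministic byte sum instead of Python's built-in
    hash(), which is randomized per process for strings.
    """
    return sum(vertex.encode("utf-8")) % num_partitions


def build_partitioned_adjacency(edges, num_partitions):
    """Group edges by source vertex, one small dict per partition."""
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for src, dst, weight in edges:
        partitions[partition_of(src, num_partitions)][src].append((dst, weight))
    return [dict(p) for p in partitions]


def neighbors(partitions, vertex):
    """Look up a vertex by consulting only the partition that owns it."""
    owner = partitions[partition_of(vertex, len(partitions))]
    return owner.get(vertex, [])


partitions = build_partitioned_adjacency(edges, NUM_PARTITIONS)
print(neighbors(partitions, "a"))
```

Each partition holds only its own slice of the map, so no single machine needs the whole structure in memory, and a lookup touches just the partition that owns the vertex.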
