spark-user mailing list archives

From Ankur Dave <ankurd...@gmail.com>
Subject Re: GraphX Pagerank application
Date Sat, 16 Aug 2014 01:54:36 GMT
On Wed, Aug 6, 2014 at 11:37 AM, AlexanderRiggers <
alexander.riggers@gmail.com> wrote:

> To perform the page rank I have to create a graph object, adding the edges
> by setting sourceID=id and distID=brand. In GraphLab there is a function:
> g = SGraph().add_edges(data, src_field='id', dst_field='brand')
>
> Is there something similar in GraphX?


It sounds like you're trying to parse an edge list file into a graph, where
each line is a comma-separated pair of numeric vertex ids. There's a
built-in parser for tab-separated pairs (see GraphLoader.edgeListFile), and
it should be easy to adapt that to comma-separated pairs. You can also drop
the header line using RDD#filter (and, eventually, using the change proposed
in https://github.com/apache/spark/pull/1839).
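
For illustration, here is a rough sketch of that approach rather than a
definitive implementation. It assumes a hypothetical file data/edges.csv
whose first line is a header such as "id,brand" and whose remaining lines
are numeric pairs; the path, the object name, and the PageRank tolerance are
all made up for the example. It also assumes that ids and brands can share
one vertex id space; if they overlap, you would need to remap one of them
first.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

object CsvEdgeListPageRank {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("CsvEdgeListPageRank")
      .setMaster("local[*]") // assumption: local run, for illustration only
    val sc = new SparkContext(conf)

    // Hypothetical input path; each data line is assumed to be "srcId,dstId".
    val lines = sc.textFile("data/edges.csv")

    // Drop the header with RDD#filter by comparing against the first line.
    val header = lines.first()
    val edges = lines
      .filter(line => line != header && line.trim.nonEmpty)
      .map { line =>
        val fields = line.split(",")
        // The edge attribute is unused here, so store a placeholder 1.
        Edge(fields(0).trim.toLong, fields(1).trim.toLong, 1)
      }

    // Vertices are created implicitly from the edge endpoints,
    // each with the default attribute 1.
    val graph = Graph.fromEdges(edges, defaultValue = 1)

    // Run PageRank until convergence; 0.0001 is an arbitrary tolerance.
    val ranks = graph.pageRank(0.0001).vertices

    ranks.take(10).foreach(println)
    sc.stop()
  }
}
```

Graph.fromEdges plays roughly the role of add_edges in your GraphLab
snippet: the source and destination fields become the two endpoints of each
Edge.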

Ankur <http://www.ankurdave.com/>