spark-dev mailing list archives

From James <>
Subject [GraphX] Estimating Average distance of a big graph using GraphX
Date Wed, 11 Feb 2015 11:30:19 GMT

Recently I have been trying to estimate the average distance of a big graph
using Spark, with the help of HyperAnf.

It works like the Connected Components algorithm, except that the attribute of
each vertex is a HyperLogLog counter which, at the k-th iteration, estimates the
number of vertices it can reach in fewer than k hops.
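
To make the idea concrete, here is a minimal sketch of that iteration in plain Python rather than GraphX. It assumes an undirected graph and uses exact sets as a stand-in for the HyperLogLog counters, so it only works on small inputs. The function names are illustrative and are not taken from the GraphX code.

```python
from collections import defaultdict


def neighbourhood_function(edges, num_vertices):
    """Return [N(0), N(1), ...], where N(t) is the number of ordered pairs
    (u, v) such that v is within t hops of u (u itself included)."""
    # Assumption for this sketch: edges are undirected.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Exact sets stand in for the HyperLogLog counters.
    counters = {v: {v} for v in range(num_vertices)}
    changed = set(counters)  # vertices whose counter grew in the last round
    totals = [sum(len(c) for c in counters.values())]

    while changed:
        # Only vertices whose counter changed need to send a message.
        inbox = defaultdict(set)
        for u in changed:
            for v in adj[u]:
                inbox[v] |= counters[u]  # merging messages = set union

        # A vertex that receives no message simply keeps its counter.
        changed = set()
        for v, msg in inbox.items():
            if not msg <= counters[v]:
                counters[v] = counters[v] | msg
                changed.add(v)

        if changed:
            totals.append(sum(len(c) for c in counters.values()))
    return totals


def average_distance(totals):
    """Average distance over reachable pairs, from the neighbourhood function."""
    reachable_pairs = totals[-1] - totals[0]
    if reachable_pairs == 0:
        return 0.0
    # N(t) - N(t-1) is the number of ordered pairs at distance exactly t.
    weighted = sum(t * (totals[t] - totals[t - 1]) for t in range(1, len(totals)))
    return weighted / reachable_pairs


# Path graph 0 - 1 - 2 - 3
totals = neighbourhood_function([(0, 1), (1, 2), (2, 3)], 4)
print(totals)                              # [4, 10, 14, 16]
print(round(average_distance(totals), 3))  # 1.667
```

With real HyperLogLog counters the set union becomes a register-wise maximum and the sizes become estimates, but the structure of the loop stays the same.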

I have successfully run the code on a graph with 20M vertices. But I still
need help:

*I think the code could be made more efficient, especially the "Send message"
function, but I am not sure what happens if a vertex receives no message in an
iteration.*

Here is my code:

Any feedback is appreciated.
