spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matei Zaharia <>
Subject Re: All pairs shortest paths?
Date Thu, 27 Mar 2014 00:59:05 GMT
Yeah, if you’re just worried about statistics, maybe you can do sampling (do single-pair
paths from 100 random nodes and you get an idea of what percentage of nodes have what distribution
of neighbors in a given distance).


On Mar 26, 2014, at 5:55 PM, Ryan Compton <> wrote:

> Much thanks, I suspected this would be difficult. I was hoping to
> generate some "4 degrees of separation"-like statistics. Looks like
> I'll just have to work with a subset of my graph.
> On Wed, Mar 26, 2014 at 5:20 PM, Matei Zaharia <> wrote:
>> All-pairs distances is tricky for a large graph because you need O(V^2) storage.
Do you want to just quickly query the distance between two vertices? In that case you can
do single-source shortest paths, which I believe exists in GraphX, or at least is very quick
to implement on top of its Pregel API. If your graph is small enough that storing all-pairs
is feasible, you can probably run this as an iterative algorithm:–Warshall_algorithm,
though I haven’t tried it. It may be tough to do with GraphX.
>> Matei
>> On Mar 26, 2014, at 3:51 PM, Ryan Compton <> wrote:
>>> To clarify: I don't need the actual paths, just the distances.
>>> On Wed, Mar 26, 2014 at 3:04 PM, Ryan Compton <>
>>>> No idea how feasible this is. Has anyone done it?

View raw message