spark-user mailing list archives

From Alexander Czech <alexander.cz...@googlemail.com.INVALID>
Subject How to use the Graphframe PageRank method with dangling edges?
Date Mon, 05 Nov 2018 10:20:16 GMT
I have a graph that has a couple of dangling edges. I use PySpark with
Spark 2.2.0. The graph looks roughly like this:

g.vertices.show()
+---+
| id|
+---+
|  1|
|  2|
|  3|
|  4|
+---+
g.edges.show()
+---+----+
|src| dst|
+---+----+
|  1|   2|
|  2|   3|
|  3|   4|
|  4|   1|
|  4|null|
+---+----+

Now when I call g.pageRank(resetProbability=0.15, tol=0.01), it
obviously fails because one edge points to null.
What I want is for any PageRank weight that flows along a dangling
edge to be redistributed to a random node in the graph, much like the
damping factor or resetProbability in the pageRank function. Is this
possible without rewriting the pageRank method? My Scala knowledge is
zero, and reimplementing it in Python would probably be a pretty slow
solution.
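
To make the behavior I am after concrete, here is a small pure-Python sketch of it on the toy graph above. It is not GraphFrames code and would not scale; the function name pagerank_with_dangling and its parameters are made up for illustration. It also assumes ranks are normalized to sum to 1, which may differ from the scaling GraphFrames uses.

```python
def pagerank_with_dangling(vertices, edges, reset_probability=0.15,
                           tol=0.01, max_iter=100):
    """Power-iteration PageRank (illustrative, hypothetical helper).

    Rank sent along an edge whose dst is None is spread uniformly over
    all vertices instead of being lost.
    """
    n = len(vertices)

    # Count every out-edge, including the dangling ones (dst is None).
    out_degree = {v: 0 for v in vertices}
    for src, _dst in edges:
        out_degree[src] += 1

    rank = {v: 1.0 / n for v in vertices}
    for _ in range(max_iter):
        incoming = {v: 0.0 for v in vertices}
        dangling_mass = 0.0

        for src, dst in edges:
            share = rank[src] / out_degree[src]
            if dst is None:
                dangling_mass += share   # collected, redistributed below
            else:
                incoming[dst] += share

        # Assumption: vertices with no out-edges at all are dangling too.
        for v in vertices:
            if out_degree[v] == 0:
                dangling_mass += rank[v]

        new_rank = {
            v: reset_probability / n
               + (1 - reset_probability) * (incoming[v] + dangling_mass / n)
            for v in vertices
        }

        # Stop once the total change across all vertices drops below tol.
        delta = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if delta < tol:
            break
    return rank


vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (4, None)]
ranks = pagerank_with_dangling(vertices, edges)
for v in vertices:
    print(v, round(ranks[v], 4))
```

In this sketch vertex 4 sends half of its rank to vertex 1 and the other half is split evenly across all four vertices, so no rank is lost between iterations.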

Thanks
