spark-user mailing list archives

From "张林(林岳)" <linyue...@alibaba-inc.com>
Subject srcAttr in graph.triplets don't update when the size of graph is huge
Date Wed, 18 Mar 2015 12:29:40 GMT
When the graph is huge (0.2 billion vertices, 6 billion edges), srcAttr and
dstAttr in graph.triplets are not updated after Graph.outerJoinVertices
changes the vertex data.

The code and the log are as follows:

g = graph.outerJoinVertices()...
// run an action on the vertices and on the edges
g.vertices.count()
g.edges.count()
// print the triplets whose source is vertex 5000000001
println("example edge " + g.triplets.filter(e => e.srcId == 5000000001L).collect()
  .map(e => (e.srcId + ":" + e.srcAttr + ", " + e.dstId + ":" + e.dstAttr)).mkString("\n"))
// print the same vertex as stored in g.vertices
println("example vertex " + g.vertices.filter(e => e._1 == 5000000001L).collect()
  .map(e => (e._1 + "," + e._2)).mkString("\n"))

The result:

example edge 5000000001:0, 2467451620:61
5000000001:0, 1962741310:83    // attr of vertex 5000000001 is 0 in Graph.triplets
example vertex 5000000001,2    // attr of vertex 5000000001 is 2 in Graph.vertices

When the graph is smaller (10 million vertices), the same code works: the
triplets are updated when the vertex data changes.
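
To measure the mismatch across the whole graph instead of for a single vertex, the triplet view can be joined back against g.vertices. The following is only a minimal sketch: countStaleSrcAttrs is a hypothetical helper name, not part of GraphX, and it assumes the vertex attribute type supports a meaningful == comparison. On Spark versions before 1.3 the join may also need import org.apache.spark.SparkContext._ for the pair-RDD implicits.

```scala
import org.apache.spark.graphx.Graph
import scala.reflect.ClassTag

// Hypothetical helper (not a GraphX API): counts the triplets whose srcAttr
// differs from the attribute stored for the same vertex id in g.vertices.
def countStaleSrcAttrs[VD: ClassTag, ED](g: Graph[VD, ED]): Long = {
  g.triplets
    .map(t => (t.srcId, t.srcAttr)) // (vertex id, attr as seen by the triplet)
    .join(g.vertices)               // pair it with the attr held in g.vertices
    .filter { case (_, (inTriplet, inVertices)) => inTriplet != inVertices }
    .count()
}
```

A result of 0 would mean that every srcAttr agrees with g.vertices; a non-zero result is the number of triplets showing the behaviour described above. The same check can be written for dstAttr by mapping on t.dstId and t.dstAttr.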

 

