spark-user mailing list archives

From dizzy5112 <dave.zee...@gmail.com>
Subject graphx and subgraph query
Date Fri, 01 Aug 2014 06:42:42 GMT
Hi, I have a small problem using GraphX. I have a graph whose triplets are
represented as:
((101,User({101=0},0,3)),(104,User({101=1},1,0)),1)
((101,User({101=0},0,3)),(105,User({101=1},1,0)),2)
((102,User({102=0},0,3)),(106,User({102=1},1,0)),3)
((102,User({102=0},0,3)),(107,User({102=1},1,1)),4)
((107,User({102=1},1,1)),(108,User({102=2},1,1)),5)
((108,User({102=2},1,1)),(109,User({102=3, 103=1, 101=1},3,0)),6)
((101,User({101=0},0,3)),(109,User({102=3, 103=1, 101=1},3,0)),7)
((103,User({103=0},0,3)),(110,User({103=1},1,0)),8)
((103,User({103=0},0,3)),(111,User({103=1},1,1)),9)
((103,User({103=0},0,3)),(109,User({102=3, 103=1, 101=1},3,0)),10)
((111,User({103=1},1,1)),(112,User({102=1, 103=2, 113=1},3,0)),11)
((113,User({113=0},0,1)),(112,User({102=1, 103=2, 113=1},3,0)),11)
((102,User({102=0},0,3)),(112,User({102=1, 103=2, 113=1},3,0)),12)

Each vertex is shown as (node id, User(hashmap of root node to the level of the
graph the vertex is on, in-degree, out-degree)). For the srcAttr, the hashmap
holds the node itself and the level it sits on. The dstAttr holds the root nodes
of the paths it is connected to, again with in-degree and out-degree.

In the first instance I create a subgraph that includes only the root nodes
and the nodes that have no out-links, with the following:
val requiredNodes = userGraph.subgraph(vpred = (id, user) => user.inDeg == 0 || user.outDeg == 0)

I now need to go one step further and remove all triplets that link to root
nodes whose graph goes down more than one level. I can identify these root
nodes using:
val delNodes = userGraph.triplets.map(x => x.dstAttr.details).flatMap(x => x).distinct.filter(x => x._2 > 1).map(x => x._1).distinct

which gives me 102 and 103.
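
To make that step concrete, here is a small sketch on plain Scala collections rather than an RDD. The `dstDetails` list is just the distinct dstAttr hashmaps copied from the triplets above; the object name and the `Map[Long, Int]` type are assumptions made for illustration, since the real type of `details` is not shown here.

```scala
object DelNodesSketch {
  // Distinct dstAttr hashmaps (root node id -> level), copied from the triplets above.
  // Map[Long, Int] is an assumed type for `details`.
  val dstDetails: Seq[Map[Long, Int]] = Seq(
    Map(101L -> 1),
    Map(102L -> 1),
    Map(102L -> 2),
    Map(102L -> 3, 103L -> 1, 101L -> 1),
    Map(103L -> 1),
    Map(102L -> 1, 103L -> 2, 113L -> 1)
  )

  // Same chain as the RDD version: flatten to (root, level) pairs,
  // keep the pairs deeper than one level, then keep the distinct root ids.
  val delNodes: Set[Long] =
    dstDetails.flatMap(_.toSeq).distinct.filter(_._2 > 1).map(_._1).toSet

  def main(args: Array[String]): Unit =
    println(delNodes.toSeq.sorted.mkString(", ")) // prints: 102, 103
}
```

The pairs that survive the level filter are (102,2), (102,3) and (103,2), which is why only 102 and 103 come out.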

The final triplet result I'm looking for should be:
((101,User({101=0},0,3)),(104,User({101=1},1,0)),1)
((101,User({101=0},0,3)),(105,User({101=1},1,0)),2)
((101,User({101=0},0,3)),(109,User({102=3, 103=1, 101=1},3,0)),7)

I can get this using:
userGraph.triplets.filter(x => x.srcAttr.details.keys == Set(101)).collect.foreach(println)
but the problem is that I can only have one value in the set part (Set(101)).
In this case the graph should drop all triplets whose key is 102 or 103, but I
can't seem to get that to work in this part.

Ideally I would like to include this exclusion in the subgraph step if
possible or, if not, to have more than one value in the last filter. Is anyone
able to point me in the right direction?
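
One possible direction, sketched on plain Scala collections so it runs without a cluster: instead of comparing the keys to a single Set for equality, test whether they overlap a set of excluded roots. The `User` and `Triplet` case classes below are hypothetical stand-ins for the real vertex attribute and for GraphX's EdgeTriplet, the `Map[Long, Int]` type is assumed, and only four of the triplets are reproduced.

```scala
object ExcludeRootsSketch {
  // Hypothetical stand-ins: `User` mirrors the printed vertex attribute,
  // `Triplet` mirrors the srcAttr/dstAttr/attr fields of an EdgeTriplet.
  case class User(details: Map[Long, Int], inDeg: Int, outDeg: Int)
  case class Triplet(srcId: Long, srcAttr: User, dstId: Long, dstAttr: User, attr: Int)

  val u101 = User(Map(101L -> 0), 0, 3)
  val u102 = User(Map(102L -> 0), 0, 3)
  val u103 = User(Map(103L -> 0), 0, 3)
  val u109 = User(Map(102L -> 3, 103L -> 1, 101L -> 1), 3, 0)

  // A few of the triplets from the post.
  val triplets: Seq[Triplet] = Seq(
    Triplet(101L, u101, 104L, User(Map(101L -> 1), 1, 0), 1),
    Triplet(101L, u101, 109L, u109, 7),
    Triplet(102L, u102, 106L, User(Map(102L -> 1), 1, 0), 3),
    Triplet(103L, u103, 109L, u109, 10)
  )

  // The roots to exclude, as a local Set rather than an RDD.
  val delNodes: Set[Long] = Set(102L, 103L)

  // Keep a triplet only if none of the root ids in its srcAttr are excluded.
  def keep(t: Triplet): Boolean =
    t.srcAttr.details.keySet.intersect(delNodes).isEmpty

  val kept: Seq[Triplet] = triplets.filter(keep)

  def main(args: Array[String]): Unit = kept.foreach(println)
}
```

On the real graph, delNodes is an RDD and cannot be used inside another RDD's closure, so it would first have to be brought to the driver (for example with `delNodes.collect.toSet`). The same predicate could then go into `userGraph.triplets.filter(...)`, or be passed as the `epred` argument of `subgraph`, which takes an edge-triplet predicate alongside `vpred`. Note that on the full data the triplet from 113 to 112 would also pass this test, because 113 is not in delNodes, so a further condition may be needed if that edge is unwanted.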






--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/graphx-and-subgraph-query-tp11140.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
