hama-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hama Wiki] Update of "SSSP" by thomasjungblut
Date Fri, 24 Feb 2012 15:53:13 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The "SSSP" page has been changed by thomasjungblut:

  == Single Source Shortest Paths ==
-  * The SSSP algorithm described in the Google Pregel paper was used.
+  * The SSSP (short for Single Source Shortest Paths) algorithm described in the Google Pregel paper was used.
   * Introduces IO usage, partitioning based on hashing of vertexID, and collective communication.
- == Implementation ==
- TODO: describe internal algorithm shows how it can be implemented using Hama BSP 
  == Usage ==
- TODO: 
+ {{{
+ bin/hama jar ../hama-0.4.0-examples.jar sssp <start vertex> <input path> <output path> [number of tasks]
+ }}}
- Have fun! If you are facing problems, feel free to ask questions on the official mailing
+ You need to provide the name of the start vertex from which the shortest paths are calculated. See below for how to provide an input file.
+ == Submit your own graph ==
+ You can write your graph as an adjacency list to match the input format that Hama parses to calculate the SSSP.
+ The file that Hama can successfully parse is a TextFile that has the following layout:
+ {{{
+ Berlin	Frankfurt:20	Munich:50
+ Frankfurt	Berlin:20	Munich:10
+ Munich
+ }}}
+ This piece of text connects Berlin to Frankfurt (with an edge weight of 20) and to Munich (with an edge weight of 50), and Frankfurt to Berlin (20) and to Munich (10). Munich is a dangling node: it has no outlinks.
+ As you can see, a vertex is always on the leftmost side (we call it the key-site), and its outlinks (the other vertices it is connected to) are separated by tabs (\t).
+ SSSP needs edge weights. You must provide them by appending the weight to the name of the outlink vertex, separated by a colon ":". The weight must be an integer.
+ Make sure that every vertex named as an outlink also appears somewhere in the file as a key-site. Otherwise the job will fail with NullPointerExceptions.
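
The two rules above (integer weights after a colon, and every outlink present as a key-site) can be checked before submitting a file. The following is a minimal sketch in Python and is not part of Hama; the function names `parse_adjacency_line` and `find_missing_vertices` are hypothetical and exist only for this illustration.

```python
def parse_adjacency_line(line):
    """Split one tab-separated line into (vertex, {outlink: weight})."""
    fields = line.rstrip("\n").split("\t")
    vertex = fields[0]
    edges = {}
    for field in fields[1:]:
        if not field:
            continue  # tolerate a trailing tab
        name, sep, weight = field.rpartition(":")
        if not sep or not name:
            raise ValueError(f"missing ':<weight>' in {field!r}")
        edges[name] = int(weight)  # raises ValueError for non-integer weights
    return vertex, edges


def find_missing_vertices(lines):
    """Return outlink names that never appear as a key-site."""
    graph = dict(parse_adjacency_line(l) for l in lines if l.strip())
    targets = {name for edges in graph.values() for name in edges}
    return sorted(targets - set(graph))


# The example file from this page, one string per line.
sample = [
    "Berlin\tFrankfurt:20\tMunich:50",
    "Frankfurt\tBerlin:20\tMunich:10",
    "Munich",
]
print(find_missing_vertices(sample))  # prints []
```

An empty list means every outlink has its own line. If the `Munich` line were deleted from the sample, the function would return `['Munich']`.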
+ Now you need to transform the text file using:
+ {{{
+ bin/hama jar ../hama-0.4.0-examples.jar sssp-text2seq /tmp/input.txt /tmp/out/
+ }}}
+ Then you can run sssp on it with:
+ {{{
+ bin/hama jar ../hama-0.4.0-examples.jar sssp Berlin /tmp/out /tmp/sssp-output
+ }}}
+ Note that based on what you have configured, the paths may be in HDFS or on local disk.
+ == Output ==
+ After the job has run, you can see a small snapshot of what the algorithm calculated. For the text file above you should see:
+ {{{
+ 12/02/24 16:47:48 INFO bsp.BSPJobClient: Current supersteps number: 5
+ 12/02/24 16:47:48 INFO bsp.BSPJobClient: The total number of supersteps: 5
+ Berlin | 0
+ Munich | 30
+ Frankfurt | 20
+ Job Finished in 4.018 seconds
+ }}}
+ On the left side you see the vertex name and on the right the cost of reaching that vertex from the start vertex.
+ In the output sequence file you should get an org.apache.hadoop.io.Text (KEY) and org.apache.hadoop.io.IntWritable (VALUE) pair for each vertex, matching the output above.
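
The costs shown above can be reproduced by hand with a small vertex-centric simulation in the spirit of the Pregel paper: in each round, a vertex that receives a smaller distance than it currently holds adopts it and sends updated distances to its outlinks. This is only an illustrative sketch in Python, not Hama's BSP implementation; the function name `sssp_supersteps` is hypothetical, and the number of rounds it takes is not meant to match the superstep count in the log.

```python
import math


def sssp_supersteps(graph, source):
    """graph: {vertex: {outlink: weight}}. Returns {vertex: distance}."""
    distance = {vertex: math.inf for vertex in graph}
    inbox = {source: [0]}  # messages to be processed in the next round
    while inbox:
        outbox = {}
        for vertex, messages in inbox.items():
            candidate = min(messages)
            if candidate < distance[vertex]:
                # Adopt the shorter distance and notify all outlinks.
                distance[vertex] = candidate
                for neighbor, weight in graph[vertex].items():
                    outbox.setdefault(neighbor, []).append(candidate + weight)
        inbox = outbox  # stop once no vertex sent a message
    return distance


# The example graph from this page.
graph = {
    "Berlin": {"Frankfurt": 20, "Munich": 50},
    "Frankfurt": {"Berlin": 20, "Munich": 10},
    "Munich": {},
}
print(sssp_supersteps(graph, "Berlin"))
# prints {'Berlin': 0, 'Frankfurt': 20, 'Munich': 30}
```

Munich ends up at 30 rather than 50 because the route via Frankfurt (20 + 10) is cheaper than the direct edge from Berlin.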
