Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.
The following page has been changed by udanax:
http://wiki.apache.org/hama/Architecture

[http://wiki.apache.org/hamadata/attachments/Architecture/attachments/block.png]
 == Store Dense/Sparse Matrices ==
+ = Store Dense/Sparse Matrices =
To store the matrices, Hama uses [http://hadoop.apache.org/hbase/ HBase]. Matrices are
basically tables: they are ways of storing numbers and other things. A typical matrix has rows
and columns, and is called a 2-way matrix because it has two dimensions. For example, you
might have respondents-by-attitudes. Of course, you might collect the same data on the same
people at 5 points in time. In that case, you either have 5 different 2-way matrices, or you
could think of it as a 3-way matrix, that is, respondent-by-attitude-by-time.
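As a rough sketch of that table layout (the row keys and `column:` qualifiers below are illustrative assumptions, not Hama's actual HBase schema), a matrix maps naturally onto a row-keyed table, and a sparse matrix simply omits its zero entries:

```python
# Illustrative sketch: a matrix stored as an HBase-style table, i.e. a map
# from row key to {column qualifier: value}. Names are hypothetical.

def make_table(dense_rows):
    """Build a row-keyed table, keeping only nonzero entries (sparse storage)."""
    table = {}
    for i, row in enumerate(dense_rows):
        cells = {"column:%d" % j: v for j, v in enumerate(row) if v != 0}
        if cells:
            table["row%d" % i] = cells
    return table

def get(table, i, j):
    """Read a single entry; missing cells are implicit zeros."""
    return table.get("row%d" % i, {}).get("column:%d" % j, 0)

matrix = [[1, 0, 2],
          [0, 0, 0],
          [0, 3, 0]]
table = make_table(matrix)
```

A 3-way matrix could reuse the same layout by folding the third dimension (e.g. time) into the row key or the column qualifier.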
 ''Just a thought: considering the depleted activity in HBase, should we not explore ways
to avoid HBase? -- Prasen''
 == Perform matrix operations ==
+ = Perform matrix operations =
Hadoop/HBase is designed to efficiently process large data sets by connecting many commodity
computers together to work in parallel, but if there is inter-node communication, the elapsed
run time gets slower as nodes are added. Consequently, an "effective" algorithm should avoid
large amounts of communication.
+ == Dense Matrix-Matrix multiplication ==
+
+ Blocking jobs:
+
+ * Collect the blocks into 'collectionTable' from A and B.
+ * A map task receives a row n as a key, and a vector as its value
+ * Emit (blockID, sub-vector)
+ * A reduce task combines the blocks
+
+ Multiplication job:
+
+ * A map task receives a blockID n as a key, and two sub-matrices as its value
+ * A reduce task computes the sum of the blocks
+
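The multiplication job above can be sketched as a single-process simulation (hand-rolled map/shuffle/reduce in plain Python, not actual Hadoop code; the block IDs and helper names are illustrative assumptions):

```python
# Minimal simulation of the blocked matrix-matrix multiplication job.
# Map: for each output block (I, J), emit (blockID, pair of sub-matrices).
# Reduce: multiply each pair and sum the partial blocks.

from collections import defaultdict

def matmul_blocked(A, B, bs):
    n = len(A)
    nb = n // bs  # blocks per dimension; assumes bs divides n evenly

    def block(M, I, J):
        """Extract the bs-by-bs sub-matrix at block coordinates (I, J)."""
        return [row[J*bs:(J+1)*bs] for row in M[I*bs:(I+1)*bs]]

    # "Map" phase: emit (blockID, (A-block, B-block)) pairs; the dict
    # plays the role of the shuffle grouping values by key.
    shuffled = defaultdict(list)
    for I in range(nb):
        for J in range(nb):
            for K in range(nb):
                shuffled[(I, J)].append((block(A, I, K), block(B, K, J)))

    # "Reduce" phase: compute the sum of the sub-matrix products per block.
    C = [[0] * n for _ in range(n)]
    for (I, J), pairs in shuffled.items():
        for a, b in pairs:
            for i in range(bs):
                for j in range(bs):
                    C[I*bs + i][J*bs + j] += sum(a[i][k] * b[k][j]
                                                 for k in range(bs))
    return C
```

Because each reduce key (I, J) only needs the blocks A(I, K) and B(K, J), the communication per output block stays bounded, which is the point of blocking.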
+ == Compute the maximum absolute row sum ==
+
+ * A map task receives a row n as a key, and a vector as its value
+ * Emit (row, the sum of the absolute values of the entries)
+ * A reduce task selects the maximum
+
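The steps above compute the infinity norm of the matrix; a minimal sketch, again as plain functions rather than real Hadoop map/reduce tasks:

```python
# Simulation of the max-absolute-row-sum job (the matrix infinity norm).

def map_row(row_key, vector):
    # Emit (row, sum of the absolute values of the entries).
    return (row_key, sum(abs(v) for v in vector))

def reduce_max(emitted):
    # Select the maximum row sum over all emitted pairs.
    return max(s for _, s in emitted)

def inf_norm(matrix):
    return reduce_max(map_row(i, row) for i, row in enumerate(matrix))
```

Since each map task touches only its own row and emits a single number, this job needs essentially no inter-node communication beyond the final reduce.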
