mahout-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zhengguo 'Mike' SUN <zhengguo...@yahoo.com>
Subject Re: Fwd: A MapReduce Algorithm for Matrix Multiplication
Date Wed, 09 Dec 2009 01:43:29 GMT
Hi Ted, John,

I am also interested in matrix multiplication. Actually, I am implementing Non-negative Matrix
Factorization, which basically is iterations of matrix multiplication. What I did is exactly
as what Ted said: doing an outer product using one column from left matrix and one row from
right matrix and summing up all the results. Idealy, this could be done using only one job:
mappers doing multiplication and reducers doing summing. But the thing is that how do you
match a column from left matrix and a row from right matrix. Assuming that two input matrices
are give in column major order and row major order, it seems to me that you have to implement
a customized InputFormat to do the matching. Or maybe you could use one matrix as input and
read the other inside mappers, but this also has problem when you have a lot of mappers. The
straightforward way seems to be using two jobs. The mappers of the first job emit the id and
the content of the row or column it
 reads and the reducers do the multiplication. The second job is to do the summing. This is
kind of similar to what John did. But it is inefficient if you have many iterations. Any comments?




________________________________
From: Ted Dunning <ted.dunning@gmail.com>
To: j-norstad@northwestern.edu; mahout-user <mahout-user@lucene.apache.org>
Sent: Tue, December 8, 2009 2:59:06 PM
Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication

John,

Your comment about matrix multiplication was forwarded to the mahout-user
emailing list.

I have a question for you about your approach.  Have you considered doing
the multiplication in a single step by storing the the first matrix in
column major order and the second in row major order?  The idea would be to
do a block-wise outer product and use the combiner to sum elements ahead of
the reducer.  This could be done by reading a (block) column from the left
matrix and a (block) row from the second and emitting all of the (block)
elements of the outer product of this column and row.  The key emitted with
these blocks would be the i,j index of the final location of each block in
the final product.  The combiner and reducer could add all of the blocks
together and the use of the combiner would serve to substantially minimize
the amount of network traffic.

This alternative is equivalent to inverting the loop nesting in a
conventional matrix multiplication algorithm.  It is also very closely
related to the way that cooccurrence counting is typically done in Hadoop.

Is your project an ongoing effort?

---------- Forwarded message ----------
From: Robin Anil <robin.anil@gmail.com>
Date: Tue, Dec 8, 2009 at 11:09 AM
Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication
To: mahout-user@lucene.apache.org
Cc: John Norstad <j-norstad@northwestern.edu>


---------- Forwarded message ----------
From: John Norstad <j-norstad@northwestern.edu>
Date: Tue, Dec 8, 2009 at 10:14 PM
Subject: A MapReduce Algorithm for Matrix Multiplication
To: MapReduce <mapreduce-user@hadoop.apache.org>


As an exercise while learning MapReduce, I developed an algorithm for
matrix multiplication and wrote it up on my web site. If you're
interested, it's at:

  http://homepage.mac.com/j.norstad/matrix-multiply

I present the algorithm in English, as pseudo-code, and as Java source
code for Hadoop.

--
John Norstad
Academic and Research Technologies
Northwestern University



-- 
Ted Dunning, CTO
DeepDyve



      
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message