mahout-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Avram Aelony <aav...@mac.com>
Subject Re: Fwd: A MapReduce Algorithm for Matrix Multiplication
Date Wed, 09 Dec 2009 18:13:20 GMT

This is also a project called HAMA  <http://wiki.apache.org/hama/>.  
Perhaps your efforts can help further (or coordinate with) HAMA?

Best regards,
Avram


 
On Wednesday, December 09, 2009, at 10:05AM, "Ted Dunning" <ted.dunning@gmail.com> wrote:
>See here:
>
>http://arxiv.org/abs/0909.4061v1
>
>On Tue, Dec 8, 2009 at 6:47 PM, Zhengguo 'Mike' SUN
><zhengguosun@yahoo.com>wrote:
>
>> Hi Jake,
>>
>> I am implementing the classical multiplicative update rule of NMF. The
>> matrix to be factorized is really big and sparse. Are you suggesting that I
>> can use some specialised algorithms for sparse matrix instead of the
>> standard multiplication algorithm? But what algorithms are you referring to?
>> Could you please provide some pointers?
>>
>> Thanks
>>
>>
>>
>>
>> ________________________________
>> From: Jake Mannix <jake.mannix@gmail.com>
>> To: mahout-user@lucene.apache.org
>> Sent: Tue, December 8, 2009 8:53:39 PM
>> Subject: Re: Fwd: A MapReduce Algorithm for Matrix Multiplication
>>
>> Hi Mike,
>>
>>   In the NMF work you're doing, the matrices you're multiplying are very
>> thin, right?  They're a handful of dense vectors on the left, an equal
>> number of dense vectors on the right, and you're only computing the values
>> in the outer product which coincide with the nonzero entries of the
>> (sparse)
>> input matrix?  In which case you're not really doing the full matrix
>> multiplication, and doing this a slightly specialized way should provide
>> some serious speedup.
>>
>>   Unless you're not dealing with the "really big, but sparse" case, and
>> minimizing error only over the nonzero inputs?
>>
>>   -jake
>>
>> On Tue, Dec 8, 2009 at 5:43 PM, Zhengguo 'Mike' SUN
>> <zhengguosun@yahoo.com>wrote:
>>
>> > Hi Ted, John,
>> >
>> > I am also interested in matrix multiplication. Actually, I am
>> implementing
>> > Non-negative Matrix Factorization, which basically is iterations of
>> matrix
>> > multiplication. What I did is exactly as what Ted said: doing an outer
>> > product using one column from left matrix and one row from right matrix
>> and
>> > summing up all the results. Idealy, this could be done using only one
>> job:
>> > mappers doing multiplication and reducers doing summing. But the thing is
>> > that how do you match a column from left matrix and a row from right
>> matrix.
>> > Assuming that two input matrices are give in column major order and row
>> > major order, it seems to me that you have to implement a customized
>> > InputFormat to do the matching. Or maybe you could use one matrix as
>> input
>> > and read the other inside mappers, but this also has problem when you
>> have a
>> > lot of mappers. The straightforward way seems to be using two jobs. The
>> > mappers of the first job emit the id and the content of the row or column
>> it
>> >  reads and the reducers do the multiplication. The second job is to do
>> the
>> > summing. This is kind of similar to what John did. But it is inefficient
>> if
>> > you have many iterations. Any comments?
>> >
>> >
>> >
>> >
>> > ________________________________
>> > From: Ted Dunning <ted.dunning@gmail.com>
>> > To: j-norstad@northwestern.edu; mahout-user <
>> mahout-user@lucene.apache.org
>> > >
>> > Sent: Tue, December 8, 2009 2:59:06 PM
>> > Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication
>> >
>> > John,
>> >
>> > Your comment about matrix multiplication was forwarded to the mahout-user
>> > emailing list.
>> >
>> > I have a question for you about your approach.  Have you considered doing
>> > the multiplication in a single step by storing the the first matrix in
>> > column major order and the second in row major order?  The idea would be
>> to
>> > do a block-wise outer product and use the combiner to sum elements ahead
>> of
>> > the reducer.  This could be done by reading a (block) column from the
>> left
>> > matrix and a (block) row from the second and emitting all of the (block)
>> > elements of the outer product of this column and row.  The key emitted
>> with
>> > these blocks would be the i,j index of the final location of each block
>> in
>> > the final product.  The combiner and reducer could add all of the blocks
>> > together and the use of the combiner would serve to substantially
>> minimize
>> > the amount of network traffic.
>> >
>> > This alternative is equivalent to inverting the loop nesting in a
>> > conventional matrix multiplication algorithm.  It is also very closely
>> > related to the way that cooccurrence counting is typically done in
>> Hadoop.
>> >
>> > Is your project an ongoing effort?
>> >
>> > ---------- Forwarded message ----------
>> > From: Robin Anil <robin.anil@gmail.com>
>> > Date: Tue, Dec 8, 2009 at 11:09 AM
>> > Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication
>> > To: mahout-user@lucene.apache.org
>> > Cc: John Norstad <j-norstad@northwestern.edu>
>> >
>> >
>> > ---------- Forwarded message ----------
>> > From: John Norstad <j-norstad@northwestern.edu>
>> > Date: Tue, Dec 8, 2009 at 10:14 PM
>> > Subject: A MapReduce Algorithm for Matrix Multiplication
>> > To: MapReduce <mapreduce-user@hadoop.apache.org>
>> >
>> >
>> > As an exercise while learning MapReduce, I developed an algorithm for
>> > matrix multiplication and wrote it up on my web site. If you're
>> > interested, it's at:
>> >
>> >  http://homepage.mac.com/j.norstad/matrix-multiply
>> >
>> > I present the algorithm in English, as pseudo-code, and as Java source
>> > code for Hadoop.
>> >
>> > --
>> > John Norstad
>> > Academic and Research Technologies
>> > Northwestern University
>> >
>> >
>> >
>> > --
>> > Ted Dunning, CTO
>> > DeepDyve
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>>
>
>
>
>
>-- 
>Ted Dunning, CTO
>DeepDyve
>
>

Mime
View raw message