mahout-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jake Mannix <>
Subject Re: Fwd: A MapReduce Algorithm for Matrix Multiplication
Date Wed, 09 Dec 2009 05:01:02 GMT
On Tue, Dec 8, 2009 at 7:27 PM, Zhengguo 'Mike' SUN

> No, I am not computing entries in X-WH. I am doing iterations on the
> following two rules:
> H <- H*(WtX)/(WtWH),      W<- W*(XHt)/(WHHt)
Hmm... I haven't thought about this for much more than this today, but let's

So if your X matrix is N by M, with each of the N rows having an average of
d non-empty
values, and you want a rank-k factorization, producing W'WH and WHH' is
going to take
something like O(k^2 * (N+M)) flops each, right?

And you need to iterate this again and again until convergence?  How many
iterations does this
tend to take?  It seems like this algorithm might not scale terribly well on
humongous data sets,
at least in comparison to what I know of the speed of e.g. SVD via Lanczos
or aGHA.

It also seems like this isn't going to respect HDFS locality very much,
because the full W and
H matrices are going to need to get moved around a lot, and continually
updated (ie replaced)
with newer values.   Maybe there are some tricks which can make this run
more efficiently,
but I would caution you to think carefully through how you parallelize this
- the most straightforward
approach seems like it is going to entail a *lot* of matrix multiplications.


  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message