NMF should be amenable to the stochastic decomposition trick. That reduces
the problem to a much smaller factorization that you could plausibly do
using sequential techniques. Jake Manning is working on getting
factorizations going, but I don't know if he has gotten to NMF.
Jake's comment about doing the multiplication cleanly using sparse
techniques is important, but you should also remember that if you have a
sparse matrix stored in triple form, doing a transpose is simply a matter of
changing which index your consider a row and which a column. For some
algorithms, it might pay to sort the result once so that you can do a
sortjoin for the multiplication, but that isn't hard.
On Tue, Dec 8, 2009 at 5:43 PM, Zhengguo 'Mike' SUN
<zhengguosun@yahoo.com>wrote:
> Hi Ted, John,
>
> I am also interested in matrix multiplication. Actually, I am implementing
> Nonnegative Matrix Factorization, which basically is iterations of matrix
> multiplication. What I did is exactly as what Ted said: doing an outer
> product using one column from left matrix and one row from right matrix and
> summing up all the results. Idealy, this could be done using only one job:
> mappers doing multiplication and reducers doing summing. But the thing is
> that how do you match a column from left matrix and a row from right matrix.
> Assuming that two input matrices are give in column major order and row
> major order, it seems to me that you have to implement a customized
> InputFormat to do the matching. Or maybe you could use one matrix as input
> and read the other inside mappers, but this also has problem when you have a
> lot of mappers. The straightforward way seems to be using two jobs. The
> mappers of the first job emit the id and the content of the row or column it
> reads and the reducers do the multiplication. The second job is to do the
> summing. This is kind of similar to what John did. But it is inefficient if
> you have many iterations. Any comments?
>
>
>
