Please supply a bit more information.
First, how many nonzero elements per row do you have on average?
Second, how many nonzero values are in the fullest row?
How did you configure your MapReduce program?
What kind of machines were you running on?
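
For context on why the nonzero counts matter, here is a rough back-of-the-envelope sketch comparing dense and sparse storage for the 600,000 x 600,000 matrix mentioned in the thread. The 8-byte values, 4-byte column indices, and the average of 100 nonzeros per row are illustrative assumptions, not numbers from the thread or from Mahout's actual storage layout.

```python
# Rough memory estimate: dense vs. sparse storage for an n x n matrix.
# Assumptions (not from the thread): 8-byte doubles, 4-byte int column
# indices, and a hypothetical average of 100 nonzeros per row.

def dense_bytes(n_rows, n_cols, bytes_per_value=8):
    """Bytes needed to hold every cell of the matrix as a double."""
    return n_rows * n_cols * bytes_per_value

def sparse_bytes(n_rows, avg_nnz_per_row, bytes_per_value=8, bytes_per_index=4):
    """Bytes needed to hold only the nonzeros, each as (index, value)."""
    return n_rows * avg_nnz_per_row * (bytes_per_value + bytes_per_index)

n = 600_000
avg_nnz = 100  # hypothetical; substitute the real average

dense = dense_bytes(n, n)
sparse = sparse_bytes(n, avg_nnz)
print(f"dense:  {dense / 1e12:.2f} TB")   # 2.88 TB
print(f"sparse: {sparse / 1e9:.2f} GB")   # 0.72 GB
```

Under these assumptions the dense form needs terabytes, which no single JVM heap will hold, while the sparse form is under a gigabyte before any per-object overhead.
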
> Actually, I would like to perform spectral clustering on a large-scale
> sparse matrix, but it failed with an OutOfMemory error when creating the
> DenseMatrix for the SVD.
> Best
> Wei
> > SSVD != Lanczos. If you do PCA or LSI, it is perhaps what you need. It
> > can take on these things. Well, at least some of my branches can, if
> > not the official patch.
> > d
> > > Thanks for your reply.
> > > My matrix is not dense; it is a sparse matrix.
> > >
> > > I have tried the svd of Mahout, but failed due to the OutOfMemory
> > > error.
> > >
> > > Best
> > > Wei
> > >
> > >> You can certainly try to write it out into a DRM (distributed row
> > >> matrix) and run stochastic SVD on Hadoop (off the trunk now). See
> > >> MAHOUT-593. This is suitable if you have a good decay of singular
> > >> values (if you don't, it probably just means you have so much noise
> > >> that it masks the problem you are trying to solve in your data).
> > >>
> > >> The currently committed solution is not the most efficient yet, but
> > >> it should be quite capable.
> > >>
> > >> If you do, let me know how it went.
> > >>
> > >> thanks.
> > >> d
> > >> > Are you sure your matrix is dense?
> > >> >
> > >> >> Hi All:
> > >> >> Is it possible to compute the SVD factorization for a
> > >> >> 600,000 x 600,000 matrix using Mahout?
> > >> >> I got an OutOfMemory error when creating the DenseMatrix.
> > >> >> Best
> > >> >> Wei
> > >> >>
