mahout-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Danny Bickson <danny.bick...@gmail.com>
Subject Re: Need a little help with SVD / Dimensional Reduction
Date Mon, 06 Jun 2011 17:36:49 GMT
If I understand your question correctly, you need to simply transpose M
before you start the run and that way
you will get the other singular vectors.

May I ask what is the problem you are working on and why do you need the
singular vectors?
Can you consider using another matrix decomposition technique for example
alternating least squares
which gives you two lower rank matrices which simulates the large decomposed
matrix?

On Mon, Jun 6, 2011 at 1:30 PM, Stefan Wienert <stefan@wienert.cc> wrote:

> Hi Danny!
>
> I understand that for M*M' (and for M'*M) the left and right
> eigenvectors are identical. But that is not exactly what I want. The
> lanczos solver from mahout gives me the eigenvectors of M*M', which
> are the left singular vectors of M. But I need the right singular
> vectors of M (and not M*M'). How do I get them?
>
> Sorry, my matrix math is not as good as it should be, but I hope you
> can help me!
>
> Thanks,
> Stefan
>
> 2011/6/6 Danny Bickson <danny.bickson@gmail.com>:
> > Hi Stefan!
> > For a positive semidefinite matrix, the lest and right eigenvectors are
> > identical.
> > See SVD wikipeida text: When *M* is also positive
> > semi-definite<http://en.wikipedia.org/wiki/Positive-definite_matrix>,
> > the decomposition *M* = *U**D**U* * is also a singular value
> decomposition.
> > So you don't need to be worried about the other singular vectors.
> >
> > Hope this helps!
> >
> > On Mon, Jun 6, 2011 at 12:57 PM, Stefan Wienert <stefan@wienert.cc>
> wrote:
> >
> >> Hi.
> >>
> >> Thanks for the help.
> >>
> >> The important points from wikipedia are:
> >> - The left singular vectors of M are eigenvectors of M*M' .
> >> - The right singular vectors of M are eigenvectors of M'*M.
> >>
> >> as you describe, the mahout lanczos solver calculate A=M'*M (I think
> >> it does A=M*M', but it is not a problem). Therefore it does already
> >> calculate the right (or left) singular vector of M.
> >>
> >> But my question is, how can I get the other singular vector? I can
> >> transpose M, but then I have to calculated two SVDs, one for the right
> >> and one for the left singular value... I think there is a better way
> >> :)
> >>
> >> Hope you can help me with this...
> >> Thanks
> >> Stefan
> >>
> >>
> >> 2011/6/6 Danny Bickson <danny.bickson@gmail.com>:
> >> > Hi
> >> > Mahout SVD implementation computes the Lanzcos iteration:
> >> > http://en.wikipedia.org/wiki/Lanczos_algorithm
> >> > Denote the non-square input matrix as M. First a symmetric matrix A is
> >> > computed by A=M'*M
> >> > Then an approximating tridiagonal matrix T and a vector matrix V are
> >> > computed such that A =~ V*T*V'
> >> > (this process is done in a distributed way).
> >> >
> >> > Next the matrix T is next decomposed into eigenvectors and
> eignevalues.
> >> > Which is the returned result. (This process
> >> > is serial).
> >> >
> >> > The third step makes the returned eigenvectors orthogonal to each
> other
> >> > (which is optional IMHO).
> >> >
> >> > The heart of the code is found at:
> >> >
> >>
> ./math/src/main/java/org/apache/mahout/math/decomposer/lanczos/LanczosSolver.java
> >> > At least that is where it was in version 0.4 I am not sure if there
> are
> >> > changes in version 0.5
> >> >
> >> > Anyway, Mahout does not compute directly SVD. If you are interested in
> >> > learning more about the relation to SVD
> >> > look at: http://en.wikipedia.org/wiki/Singular_value_decomposition,
> >> > subsection: relation to eigenvalue decomposition.
> >> >
> >> > Hope this helps,
> >> >
> >> > Danny Bickson
> >> >
> >> > On Mon, Jun 6, 2011 at 9:35 AM, Stefan Wienert <stefan@wienert.cc>
> >> wrote:
> >> >
> >> >> After reading this thread:
> >> >>
> >> >>
> >>
> http://mail-archives.apache.org/mod_mbox/mahout-user/201102.mbox/%3CAANLkTinQ5K4XrM7naBWn8qoBXZGVobBot2RtjZSV4yOd@mail.gmail.com%3E
> >> >>
> >> >> Wiki-SVD: M = U S V* (* = transposed)
> >> >>
> >> >> The output of Mahout-SVD is (U S) right?
> >> >>
> >> >> So... How do I get V from (U S)  and M?
> >> >>
> >> >> Is V = M (U S)* (because this is, what the calculation in the example
> >> is)?
> >> >>
> >> >> Thanks
> >> >> Stefan
> >> >>
> >> >> 2011/6/6 Stefan Wienert <stefan@wienert.cc>:
> >> >> >
> >>
> https://cwiki.apache.org/confluence/display/MAHOUT/Dimensional+Reduction
> >> >> >
> >> >> > What is done:
> >> >> >
> >> >> > Input:
> >> >> > tf-idf-matrix (docs x terms) 6076937 x 20444
> >> >> >
> >> >> > "SVD" of tf-idf-matrix (rank 100) produces the eigenvector (and
> >> >> > eigenvalues) of tf-idf-matrix, called:
> >> >> > svd (concepts x terms) 87 x 20444
> >> >> >
> >> >> > transpose tf-idf-matrix:
> >> >> > tf-idf-matrix-transpose (terms x docs) 20444 x 6076937
> >> >> >
> >> >> > transpose svd:
> >> >> > svd-transpose (terms x concepts) 20444 x 87
> >> >> >
> >> >> > matrix multiply:
> >> >> > tf-idf-matrix-transpose x svd-transpose = result
> >> >> > (terms x docs) x (terms x concepts) = (docs x concepts)
> >> >> >
> >> >> > so... I do understand, that the "svd" here is not SVD from
> wikipedia.
> >> >> > It only does the Lanczos algorithm and some magic which produces
> the
> >> >> >> Instead either the left or right (but usually the right)
> eigenvectors
> >> >> premultiplied by the diagonal or the square root of the
> >> >> >> diagonal element.
> >> >> > from
> >> >>
> >>
> http://mail-archives.apache.org/mod_mbox/mahout-user/201102.mbox/%3CAANLkTi=Rta7tfRm8Zi60VcFya5xF+dbFrJ8pcds2N0-V@mail.gmail.com%3E
> >> >> >
> >> >> > so my question: what is the output of the SVD in mahout. And what
> do I
> >> >> > have to calculate to get the "right singular value" from svd?
> >> >> >
> >> >> > Thanks,
> >> >> > Stefan
> >> >> >
> >> >> > 2011/6/6 Stefan Wienert <stefan@wienert.cc>:
> >> >> >>
> >> >>
> >>
> https://cwiki.apache.org/confluence/display/MAHOUT/Dimensional+Reduction
> >> >> >>
> >> >> >> the last step is the matrix multiplication:
> >> >> >>  --arg --numRowsA --arg 20444 \
> >> >> >>  --arg --numColsA --arg 6076937 \
> >> >> >>  --arg --numRowsB --arg 20444 \
> >> >> >>  --arg --numColsB --arg 87 \
> >> >> >> so the result is a 6,076,937 x 87 matrix
> >> >> >>
> >> >> >> the input has 6,076,937 (each with 20,444 terms). so the result
of
> >> >> >> matrix multiplication has to be the right singular value regarding
> to
> >> >> >> the dimensions.
> >> >> >>
> >> >> >> so the result is the "concept-document vector matrix" (as
I think,
> >> >> >> these is also called "document vectors" ?)
> >> >> >>
> >> >> >> 2011/6/6 Ted Dunning <ted.dunning@gmail.com>:
> >> >> >>> Yes.  These are term vectors, not document vectors.
> >> >> >>>
> >> >> >>> There is an additional step that can be run to produce
document
> >> >> vectors.
> >> >> >>>
> >> >> >>> On Sun, Jun 5, 2011 at 1:16 PM, Stefan Wienert <stefan@wienert.cc
> >
> >> >> wrote:
> >> >> >>>
> >> >> >>>> compared to SVD, is the result is the "right singular
value"?
> >> >> >>>>
> >> >> >>>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >> Stefan Wienert
> >> >> >>
> >> >> >> http://www.wienert.cc
> >> >> >> stefan@wienert.cc
> >> >> >>
> >> >> >> Telefon: +495251-2026838
> >> >> >> Mobil: +49176-40170270
> >> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > Stefan Wienert
> >> >> >
> >> >> > http://www.wienert.cc
> >> >> > stefan@wienert.cc
> >> >> >
> >> >> > Telefon: +495251-2026838
> >> >> > Mobil: +49176-40170270
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Stefan Wienert
> >> >>
> >> >> http://www.wienert.cc
> >> >> stefan@wienert.cc
> >> >>
> >> >> Telefon: +495251-2026838
> >> >> Mobil: +49176-40170270
> >> >>
> >> >
> >>
> >>
> >>
> >> --
> >> Stefan Wienert
> >>
> >> http://www.wienert.cc
> >> stefan@wienert.cc
> >>
> >> Telefon: +495251-2026838
> >> Mobil: +49176-40170270
> >>
> >
>
>
>
> --
> Stefan Wienert
>
> http://www.wienert.cc
> stefan@wienert.cc
>
> Telefon: +495251-2026838
> Mobil: +49176-40170270
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message