mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: ssvd pca
Date Wed, 31 Oct 2012 17:48:08 GMT
On Wed, Oct 31, 2012 at 8:21 AM, Perko, Ralph J <Ralph.Perko@pnnl.gov> wrote:
> I am using the mahout pca function to project a set of documents into 2-d space. From
> what I understand, the pca function in mahout generates a [U] matrix using a

Yes. More specifically, you probably need the U*Sigma output (the "-us true" option).

> [V] matrix for translation from the high-dimensional data
> representation to the 2-d representation. Is there a way that I can
> specify the [V] matrix if I already have one I would like for it to
> use?

If you already have V, you don't need to run SVD. What you need is the
product (A-M)V, which is very close to the U*Sigma output.
That is a different task from what SSVD does. "SSVD  --pca true" finds the
SVD of A-M (or A-Xi in the docs), which is a more complicated task than
just computing (A-M)V.
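
To illustrate why (A-M)V and U*Sigma coincide: if A-M = U*Sigma*V' and V has orthonormal columns, then (A-M)V = U*Sigma. Below is a minimal sketch of that identity, assuming NumPy as a stand-in for Mahout and made-up toy data; with an exact SVD the two results match, whereas SSVD is approximate, so there they would only be close.

```python
import numpy as np

# Toy data (hypothetical): 6 documents, 4 features.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))

mean = A.mean(axis=0)      # column means; every row of M equals this vector
centered = A - mean        # A - M

# Exact thin SVD of the centered matrix: centered = U * diag(s) * Vt
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

k = 2                      # target dimensionality (2-d projection)
V = Vt[:k].T               # 4 x 2: top-k right singular vectors
u_sigma = U[:, :k] * s[:k] # U*Sigma, 6 x 2
projected = centered @ V   # (A-M)V, 6 x 2

print(np.allclose(projected, u_sigma))
```

So if V is already known, the projection is a single multiplication of the centered data by V, with no decomposition step.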

You can compute (A-M)V with the help of the DistributedRowMatrix
operations: there is a column mean computation and a matrix multiplication.
(I am not immediately sure whether the multiplication and the mean
subtraction can be combined in one run.)
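
On combining the two steps: since every row of M is the same mean vector m, (A-M)V = AV - 1*(m'V), so the subtraction can be applied to the small product AV rather than to A itself. The sketch below checks that identity, assuming NumPy rather than Mahout's distributed operations; `project_centered` is a hypothetical helper name and the matrices are random toy data.

```python
import numpy as np

def project_centered(A, V):
    """Return (A - M) V without forming the centered matrix A - M."""
    mean = A.mean(axis=0)   # column means of A
    offset = mean @ V       # m'V, a vector of length k
    return A @ V - offset   # subtract the same offset from every row of AV

# Toy data (hypothetical): 5 rows, 3 features, projected to 2 dimensions.
rng = np.random.default_rng(1)
A = rng.normal(size=(5, 3))
V = rng.normal(size=(3, 2))

direct = (A - A.mean(axis=0)) @ V   # explicit centering, for comparison
print(np.allclose(project_centered(A, V), direct))
```

This form is useful when A is sparse, because centering A explicitly would make it dense.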

>
> Thanks,
> Ralph
> __________________________________________________
> Ralph Perko
> Pacific Northwest National Laboratory
>
>
