mahout-user mailing list archives

From Chirag Lakhani <clakh...@zaloni.com>
Subject Re: PCA using Java Code
Date Wed, 03 Jul 2013 13:56:02 GMT
So how does the column mean get calculated if the --pcaOffset option is not
specified?  I would think you are just doing SVD at that point.
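
To make the distinction concrete, here is a minimal plain-Java sketch of the step that separates PCA from a plain SVD: computing the column means and subtracting them from every row before decomposing. This is not Mahout code and makes no claim about how SSVD handles the mean internally; the class and method names (ColumnMeanSketch, columnMeans, center) are hypothetical and the data is made up for illustration.

```java
import java.util.Arrays;

public class ColumnMeanSketch {

    /** Returns the mean of each column of a dense, non-empty, row-major matrix. */
    public static double[] columnMeans(double[][] rows) {
        int numCols = rows[0].length;
        double[] means = new double[numCols];
        // Accumulate column sums row by row.
        for (double[] row : rows) {
            for (int j = 0; j < numCols; j++) {
                means[j] += row[j];
            }
        }
        // Divide each sum by the number of rows.
        for (int j = 0; j < numCols; j++) {
            means[j] /= rows.length;
        }
        return means;
    }

    /** Returns a new matrix with the given column means subtracted from every row. */
    public static double[][] center(double[][] rows, double[] means) {
        double[][] centered = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            centered[i] = new double[means.length];
            for (int j = 0; j < means.length; j++) {
                centered[i][j] = rows[i][j] - means[j];
            }
        }
        return centered;
    }

    public static void main(String[] args) {
        // Toy input: three rows, two columns (illustrative values only).
        double[][] data = {{1.0, 10.0}, {3.0, 30.0}, {5.0, 20.0}};
        double[] means = columnMeans(data);
        System.out.println("column means: " + Arrays.toString(means));
        System.out.println("centered:     " + Arrays.deepToString(center(data, means)));
    }
}
```

An SVD of the centered matrix gives the principal components; an SVD of the raw matrix does not, which is why it matters where the mean comes from.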


On Tue, Jul 2, 2013 at 5:52 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> On Tue, Jul 2, 2013 at 1:52 PM, Chirag Lakhani <clakhani@zaloni.com>
> wrote:
>
> > Hello,
> >
> > I am trying to use the Mahout/Java API to do PCA, but I am confused about
> > the right order in which to do things.  To start, I have a list of
> > DenseVectors that I am reading into the code and turning into a
> > distributed matrix in the following form.
> >
> >  DistributedRowMatrix m = new DistributedRowMatrix(input_vec, matrix_path,
> >      num_rows, num_cols);
> >
> > When I run this code, I would have thought it would output the result into
> > the path called "matrix_path" so that I can then use something like
> > MatrixColumnMeansJob.run to get the mean. When I run this bit of code I get
> > no output. Is there something else I should do, or is there a better way
> > to calculate the mean for my file?
> >
> >
> > From what I understand about the SSVD CLI code, you need to calculate the
> > column mean and then output it into a directory.
>
> No, you don't have to (although you have an _option_ to calculate and
> substitute one yourself if for some reason it is already known.) Default
> use assumes it would calculate it for you.
>
>
>
> > Is there a good way to do this if I am starting from a file which is a
> > sequence file of DenseVectors?
> >
>
> Yes, just don't specify the --pcaOffset option.
>
>
> >
> > --
> >
> > *Chirag Lakhani*
> >
> > Data Scientist
> >
> > Zaloni, Inc. | www.zaloni.com
> >
> > 633 Davis Dr., Suite 200
> >
> > Durham, NC 27713
> > e: clakhani@zaloni.com
> > p: 919.602.4965 x7020
> >
>



