mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: LSI using Mahout ssvd - folding a new doc into the space
Date Fri, 29 Jun 2012 21:31:40 GMT
Well the inverse of a diagonal matrix like that is just going to be a
diagonal matrix holding the reciprocals (1/x) of the values. That much
is easy. But you need to invert more than that to fold in.
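To make the reciprocal point concrete, here is a minimal sketch in plain Python. It assumes the singular values are available as a simple list; the function name `invert_diagonal` and the sample values are illustrative and are not part of Mahout.

```python
def invert_diagonal(singular_values):
    """Return the diagonal entries of the inverse of diag(singular_values)."""
    # A zero singular value has no reciprocal, so the matrix is not invertible.
    if any(s == 0 for s in singular_values):
        raise ValueError("cannot invert a diagonal matrix with a zero entry")
    return [1.0 / s for s in singular_values]


# Hypothetical singular values, for illustration only.
sigma = [4.0, 2.0, 0.5]
sigma_inv = invert_diagonal(sigma)
print(sigma_inv)
```

Because only the diagonal is non-zero, there is no need to build or invert a full matrix; storing the reciprocals as a vector is enough.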

I admit even I don't know the details of the Mahout implementation
you're using, but I imagine the overall principle is the same as the
fold-in described in ... oh wait, look at that, in a preso I posted a
while ago: http://www.slideshare.net/srowen/matrix-factorization  Look
at the last few slides; I think it's kind of a useful / simple way to
think of it.
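As a rough sketch of the general principle (not the Mahout implementation, whose details I have not checked): in textbook LSI with a term-by-document matrix A ≈ U Σ V^T, a new document vector d is folded in as Σ^-1 U^T d. The "more than that" to invert is U itself, but since its columns are orthonormal, its pseudo-inverse is just its transpose. The names `fold_in`, `U`, `sigma`, and the tiny matrices below are made up for illustration; if your documents are rows rather than columns, the roles of U and V swap.

```python
def fold_in(doc, U, sigma):
    """Project a term-space vector into the k-dimensional LSI space.

    doc   : list of term weights, length = number of terms
    U     : term-by-k matrix (list of rows) with orthonormal columns
    sigma : list of k non-zero singular values
    """
    k = len(sigma)
    # U^T d: dot product of each column of U with the document vector.
    projected = [
        sum(U[t][j] * doc[t] for t in range(len(doc)))
        for j in range(k)
    ]
    # Multiply by the inverse of the diagonal matrix: divide by each value.
    return [p / s for p, s in zip(projected, sigma)]


# Toy example: 3 terms, k = 2. Columns of U are orthonormal by construction.
U = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
]
sigma = [2.0, 4.0]
new_doc = [2.0, 4.0, 5.0]

folded = fold_in(new_doc, U, sigma)
print(folded)
```

The folded vector can then be compared against the existing document vectors with cosine similarity, the same way the original documents are compared to each other.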

Sean

On Fri, Jun 29, 2012 at 10:27 PM, Chris Hokamp <chris.hokamp@gmail.com> wrote:
> Hi all,
>
> I'm trying to implement Latent Semantic Indexing using the mahout ssvd
> tool, and I'm having trouble understanding how I can use Mahout's ssvd
> output to 'fold' new queries (documents) into the LSI space. Specifically,
> I can't find a way to multiply a vector representing a query by the inverse
> of the matrix of singular values - I can't find a way to solve for the
> inverse of the diagonal matrix of singular values.
>
> I can generate the output matrices using ssvd, and compare document/term
> vectors using cosine similarity, but I'm stumped when it comes to folding a
> new document into the space.
>
> Any thoughts or guidance would be appreciated.
>
> Cheers,
> Chris
