mahout-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stefanie Nagel <na...@epoq.de>
Subject HebbianSolver - Generalized Hebbian Algorithm
Date Thu, 13 Jan 2011 11:34:51 GMT
Dear Mahout developers, 

I would like to use the 
org.apache.mahout.math.decomposer.hebbian.HebbianSolver in order to get a 
singular value decomposition (mahout 0.4).

In the paper "Generalized Hebbian Algorithm for Latent Semantic Analysis" (at 
http://www.dcs.shef.ac.uk/~genevieve/gorrell_webb.pdf) the algorithm is 
described. In the documentation of the HebbianSolver it is described as an 
iterative, sparse, singular value decomposition solver.
As far as I understood the HebbianSolver is designed for eigen-decomposition, 
but like any such algorithm, getting singular vectors out of it is immediate 
(since SVD and eigenvalue decomposition are related to each other).

Applied process:
- solverHebbian.solve(matrix, desired_rank) -> result: 
state.getCurrentEigens() (rows are eigenvectors!) and 
state.getCurrentEigenValues()
- get singular values: singularValue = Math.sqrt(eigenValue) -> generate sigma 
and sigmaInverse
- U = state.getCurrentEigens().transpose()
- V = (a.transpose().times(U)).times(sigmaInverse)
- approximated_a = (U.times(sigma)).times(V.transpose())
- calculate frobenius norm of a and approximated_a (in order to see how good 
the approximation is)

I tested the HebbianSolver with several examples.
And my question is: which matrix is supposed to be the input matrix for the 
HebbianSolver: A or A*A^t? Since the result are eigenvectors and eigenvalues I 
would expect A*A^t.
I compared the results for the examples (derived with the HebbianSolver) with 
the results derived with R (package corpcor).

1) If A has to be the input matrix ...
... example 1: the process above gets a result which is very similar to the 
result derived by R (U, Sigma and V).
... example 2: the process above gets a result which is at least similar to 
the result derived by R (U, Sigma and V).
... example 3: A would lead to an infinite loop, thus take A^t. But then: 
CardinalityException while calculating V.

2) If A*A^t has to be the input matrix ...
... example 1: U derived with the process above is the the same U as derived 
with R. The singular values derived with the process above are the squares 
from the singular values derived by R. V isn't equal.
... example 2: similar to results for example 1
... example 3: similar to results for example 1

Annotation: 
* The method getRandomStartingIndex(Matrix corpus, Matrix eigens) produces an 
infinite loop when every row of the considered matrix has less than 5 values 
(see at 1) example 3).
* I could post my Java Code as well if that would help.

Thanks in advance.

Stefanie



Here is the R code with the mentioned examples:
---------------------------------------------------------------------------------------------------
library(corpcor)

###########
# example 1
###########

a<-matrix(c(4.42282138, 1.51744077, 0.07690571, 0.93650042, 2.19609401, 
1.51744077, 1.73849477, -0.11856149, 0.76555191, 1.3673608, 
0.07690571, -0.11856149, 0.55065932, 1.72163263, -0.2283693, 
0.93650042, 0.76555191, 1.72163263, 0.09470345, -1.16626194, 
2.19609401, 1.3673608, -0.2283693, -1.16626194, -0.37321311 ), 5, 5, byrow = 
TRUE)
s = fast.svd(a)

D <- diag(s$d)
app_a <- s$u %*% D %*% t(s$v)
app_d <- t(s$u) %*% a %*% s$v
s

###########
# example 2
###########
a<-matrix(c(22, 52, 1, 93, 70, 33,
5, 83, 38, 85, 91, 63, 
68, 3, 7, 53, 76, 76, 
68, 5, 42, 9, 26, 99, 
93, 53, 69, 65, 5, 37, 
38, 67, 59, 42, 74, 25), 6, 6, byrow = TRUE)
s = fast.svd(a)

D <- diag(s$d)
app_a <- s$u %*% D %*% t(s$v)
app_d <- t(s$u) %*% a %*% s$v
s

###########
# example 3
###########
a<-matrix(c(4,4,5,
            4,5,5,
            3,3,2,
            4,5,4,
            4,4,4,
            3,5,4,
            4,4,3,
            2,4,4,
            5,5,5), 9, 3, byrow = TRUE)
s = fast.svd(a)

D <- diag(s$d)
app_a <- s$u %*% D %*% t(s$v)
app_d <- t(s$u) %*% a %*% s$v
s
---------------------------------------------------------------------------------------------------
-- 

Mime
View raw message