mahout-user mailing list archives

From prasenjit mukherjee <prasen....@gmail.com>
Subject Fwd: problem Interpreting SVD values
Date Sun, 18 Oct 2009 07:46:52 GMT
Apologies, as I know this question is really for LingPipe, but I was
hoping to get some response from Mahout users as well (who have
probably worked with LingPipe).


---------- Forwarded message ----------
From: prasenjit mukherjee <prasen.bea@gmail.com>
Date: Sun, Oct 18, 2009 at 12:39 PM
Subject: problem Interpreting SVD values
To: lingpipe <lingpipe@yahoogroups.com>


I am trying to evaluate partialSvd() on a small matrix, and these are
my findings. Below is my input matrix, assuming 4 terms and 3 docs.

doc0 => (2,t0) (2,t1)
doc1 => (2,t0) (2,t1)
doc2 => (2,t2) (2,t3)

As one can see, docs d0 and d1 are exactly the same: each contains 4
term occurrences, 2 each of t0 and t1. The 3rd doc is different: it
contains 4 term occurrences, 2 each of t2 and t3. Below is their matrix
representation as doc,term,value triples:

0,0,2
0,1,2
1,0,2
1,1,2
2,2,2
2,3,2
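
To make the input concrete, here is a minimal sketch that expands the triples above into a dense docs-by-terms array and compares the rows by cosine similarity. The class and method names are hypothetical and it does not use LingPipe; it only illustrates why d0 and d1 would be expected to land together and d2 apart.

```java
import java.util.Arrays;

final class TripleMatrixSketch {
    // (doc, term, value) triples copied from the message above.
    static final int[][] TRIPLES = {
        {0, 0, 2}, {0, 1, 2},
        {1, 0, 2}, {1, 1, 2},
        {2, 2, 2}, {2, 3, 2}
    };

    // Expand the sparse triples into a dense docs-by-terms array.
    static double[][] toDense(int[][] triples, int numDocs, int numTerms) {
        double[][] dense = new double[numDocs][numTerms];
        for (int[] t : triples) {
            dense[t[0]][t[1]] = t[2];
        }
        return dense;
    }

    // Cosine similarity between two document rows.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        double[][] m = toDense(TRIPLES, 3, 4);
        for (double[] row : m) {
            System.out.println(Arrays.toString(row));
        }
        System.out.println("cos(d0, d1) = " + cosine(m[0], m[1]));
        System.out.println("cos(d0, d2) = " + cosine(m[0], m[2]));
    }
}
```

Rows 0 and 1 of the dense array are identical, and row 2 shares no terms with either of them.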

I ran with maxOrder = 2 and the following input params:
       double featureInit = 0.01;
       double initialLearningRate = 0.005;
       int annealingRate = 1000;
       double regularization = 0.00;
       double minImprovement = 0.0001;
       int minEpochs = 2;
       int maxEpochs = 100;//50000;
and was expecting to get d0 and d1 in one cluster and d2 in another.
Contrary to my expectation, I am getting the following output (see the
U and V values):

    [java]       :00 Start
    [java]       :00   Factor=0
    [java]       :00     epoch=0 rmse=1.9999848100360043
    [java]       :00     epoch=1 rmse=1.9999835637692873
    [java]       :00     epoch=2 rmse=1.999982296871324
    [java]       :00 Converged in epoch=2 rmse=1.999982296871324 relDiff=3.167271940722782E-7
    [java]       :00 Order=0 RMSE=1.9999835637692873
    [java]       :00   Factor=1
    [java]       :00     epoch=0 rmse=1.9999522133829444
    [java]       :00     epoch=1 rmse=1.9999506819096369
    [java]       :00     epoch=2 rmse=1.99994912138043
    [java]       :00 Converged in epoch=2 rmse=1.99994912138043 relDiff=3.901420744799641E-7
    [java]       :00 Order=1 RMSE=1.9999506819096369
    [java] SVD Computation Done. Singular Values:
    [java]     2.796903874825226E-4  2.536844759290206E-4
    [java] Output U_Matrix: ./rundir/U_out.matrix
    [java] Output V_Matrix: ./rundir/V_out.matrix
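
The relDiff values in the log can be reproduced from the epoch=1 and epoch=2 RMSE values if relDiff is taken to be |a - b| / (|a| + |b|). That formula is an assumption inferred from the logged numbers, not taken from the LingPipe source, and the class name below is hypothetical. A minimal sketch of the check:

```java
final class RelDiffSketch {
    // Assumed formula, inferred from the log: |a - b| / (|a| + |b|).
    static double relDiff(double previous, double current) {
        return Math.abs(previous - current)
                / (Math.abs(previous) + Math.abs(current));
    }

    public static void main(String[] args) {
        double minImprovement = 0.0001; // from the parameter list above

        // epoch=1 and epoch=2 RMSE values copied from the log.
        double factor0 = relDiff(1.9999835637692873, 1.999982296871324);
        double factor1 = relDiff(1.9999506819096369, 1.99994912138043);

        System.out.println("Factor=0 relDiff = " + factor0
                + ", below minImprovement: " + (factor0 < minImprovement));
        System.out.println("Factor=1 relDiff = " + factor1
                + ", below minImprovement: " + (factor1 < minImprovement));
    }
}
```

If this is indeed the stopping test, both factors fall below minImprovement = 0.0001 as soon as minEpochs = 2 permits a check, which would be consistent with "Converged in epoch=2" while the RMSE is still close to 2.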


And my U and V matrices are:
U:
0,0,-0.690807182791581
0,1,0.6535363126818338
1,0,0.053924014251416
1,1,-0.2055548955329534
2,0,-0.7210254065499858
2,1,0.7284486755624372

Shouldn't the coefficients for rows 0 and 1 be the same in U, since
they refer to d0 and d1?
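
As a reference point for this question, the exact rank-2 factors of the 3x4 input can be computed directly. The sketch below is not partialSvd(); it swaps in plain power iteration with deflation on A * A^T, and every name in it is hypothetical. For this input it gives singular values 4 and sqrt(8), with identical coefficients for d0 and d1 in each factor.

```java
import java.util.Arrays;

final class ExactSvdSketch {
    // Dense docs-by-terms form of the input triples.
    static final double[][] A = {
        {2, 2, 0, 0},
        {2, 2, 0, 0},
        {0, 0, 2, 2}
    };

    // G = A * A^T; its eigenvectors are the left singular vectors of A.
    static double[][] gram(double[][] a) {
        int n = a.length;
        double[][] g = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < a[i].length; k++)
                    g[i][j] += a[i][k] * a[j][k];
        return g;
    }

    static double[] multiply(double[][] m, double[] v) {
        double[] out = new double[m.length];
        for (int i = 0; i < m.length; i++)
            for (int j = 0; j < v.length; j++)
                out[i] += m[i][j] * v[j];
        return out;
    }

    // Power iteration: dominant unit eigenvector of a symmetric matrix.
    static double[] dominantEigenvector(double[][] m, int iterations) {
        double[] v = new double[m.length];
        Arrays.fill(v, 1.0);
        for (int it = 0; it < iterations; it++) {
            double[] w = multiply(m, v);
            double norm = 0.0;
            for (double x : w) norm += x * x;
            norm = Math.sqrt(norm);
            for (int i = 0; i < v.length; i++) v[i] = w[i] / norm;
        }
        return v;
    }

    // Returns u[factor][doc] and fills sigma[factor] for the top k factors.
    static double[][] topLeftFactors(double[][] a, int k, double[] sigma) {
        double[][] g = gram(a);
        double[][] u = new double[k][];
        for (int f = 0; f < k; f++) {
            double[] v = dominantEigenvector(g, 200);
            double[] gv = multiply(g, v);
            double lambda = 0.0; // Rayleigh quotient v^T G v
            for (int i = 0; i < v.length; i++) lambda += v[i] * gv[i];
            sigma[f] = Math.sqrt(Math.max(lambda, 0.0));
            u[f] = v;
            // Deflate so the next pass finds the next factor.
            for (int i = 0; i < g.length; i++)
                for (int j = 0; j < g.length; j++)
                    g[i][j] -= lambda * v[i] * v[j];
        }
        return u;
    }

    public static void main(String[] args) {
        double[] sigma = new double[2];
        double[][] u = topLeftFactors(A, 2, sigma);
        System.out.println("singular values: " + Arrays.toString(sigma));
        for (int f = 0; f < u.length; f++)
            System.out.println("factor " + f + " (d0, d1, d2): "
                    + Arrays.toString(u[f]));
    }
}
```

Note the layout: u[f][doc] is the transpose of the doc,factor,value listing used for U above.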

V:
0,0,-0.7473523845369358
0,1,-0.14168050325102471
1,0,0.35114591804331297
1,1,0.6137947267695599
2,0,-0.4945242525093567
2,1,0.776371576839163
3,0,0.27130558646761577
3,1,0.02073265696164584
