On 22 Sep 2011, at 18:37, Markus Holtermann <info@markusholtermann.eu> wrote:
> Hello there,
>
> I'm trying to run Mahout's Singular Value Decomposition but realized
> that the resulting eigenvalues are wrong in most cases. So I took two
> small 3x3 matrices, calculated their eigenvalues and eigenvectors by
> hand, and compared the results to Mahout's.
>
> In only one of eight cases did Mahout's results match my pen & paper
> calculation.
>
> Let's take
> A = {{1,2,3},{2,4,5},{3,5,6}}
> and
> B = {{5,2,4},{3,6,2},{3,3,1}}
>
> As you can see, A is symmetric, B is not.
>
> I ran `mahout svd --output out/ --numRows 3 --numCols 3` eight times
> with different arguments:
>
> 1) --input A --rank 3 --symmetric true: result is wrong
> 2) --input A --rank 4 --symmetric true: result is wrong
> 3) --input A --rank 3 --symmetric false: result is wrong
> 4) --input A --rank 4 --symmetric false: result is CORRECT
>
> 5) --input B --rank 3 --symmetric true: result is wrong
> 6) --input B --rank 4 --symmetric true: result is wrong
> 7) --input B --rank 3 --symmetric false: result is wrong
> 8) --input B --rank 4 --symmetric false: result is wrong
>
> To verify that my input data is correct, here is the output of `mahout
> seqdumper`:
>
> For A:
> Key class: class org.apache.hadoop.io.IntWritable
> Value Class: class org.apache.mahout.math.VectorWritable
> Key: 0: Value: {0:1.0,1:2.0,2:3.0}
> Key: 1: Value: {0:2.0,1:4.0,2:5.0}
> Key: 2: Value: {0:3.0,1:5.0,2:6.0}
> Count: 3
>
>
> For B:
> Key class: class org.apache.hadoop.io.IntWritable
> Value Class: class org.apache.mahout.math.VectorWritable
> Key: 0: Value: {0:5.0,1:2.0,2:4.0}
> Key: 1: Value: {0:3.0,1:6.0,2:2.0}
> Key: 2: Value: {0:3.0,1:3.0,2:1.0}
> Count: 3
>
>
> And finally, the correct eigenvalues should be:
> For A:
> λ1 = 11.3448
> λ2 = -0.515729
> λ3 = 0.170915
>
> For B:
> λ1 = 7
> λ2 = 3
> λ3 = 2
>
> So, are there any known bugs in Mahout's SVD implementation? Am I doing
> something wrong? Is this algorithm known to produce wrong results?
>
> Thanks in advance.
>
I have the impression from somewhere that there is a problem with sending tiny matrices to
Mahout's Lanczos/SVD solver: something like it then doesn't get enough iterations to settle
on decent values. Sorry, I can't find a reference or link for this; I hope I didn't dream it...
Dan
> Markus
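
One thing that may be worth checking when comparing the numbers above: an SVD returns singular values, not eigenvalues. For a symmetric matrix such as A the singular values are the absolute values of the eigenvalues, but for a non-symmetric matrix such as B they are the square roots of the eigenvalues of B^T B and generally differ from B's eigenvalues. Below is a minimal pure-Python sketch for cross-checking the two small matrices by hand. It does not call Mahout, and the helper names (`jacobi_eigenvalues`, `gram`, `singular_values`) are made up for this illustration. It uses a plain cyclic Jacobi iteration, which is only suitable for small symmetric matrices.

```python
import math


def jacobi_eigenvalues(m, sweeps=100, tol=1e-22):
    """Eigenvalues of a small SYMMETRIC matrix, largest first (cyclic Jacobi)."""
    a = [[float(x) for x in row] for row in m]
    n = len(a)
    for _ in range(sweeps):
        # Stop once the squared off-diagonal mass is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q]; keep |theta| <= pi/4.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def gram(m):
    """Return M^T M, which is symmetric for any real matrix M."""
    n_rows, n_cols = len(m), len(m[0])
    return [[sum(m[k][i] * m[k][j] for k in range(n_rows))
             for j in range(n_cols)] for i in range(n_cols)]


def singular_values(m):
    """Singular values of M as square roots of the eigenvalues of M^T M."""
    return [math.sqrt(max(x, 0.0)) for x in jacobi_eigenvalues(gram(m))]


A = [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
B = [[5, 2, 4], [3, 6, 2], [3, 3, 1]]

eig_A = jacobi_eigenvalues(A)  # valid because A is symmetric
sv_A = singular_values(A)
sv_B = singular_values(B)      # B is not symmetric, so only the SVD route applies

print("eigenvalues of A:    ", eig_A)
print("singular values of A:", sv_A)
print("singular values of B:", sv_B)
```

For A, the singular values should agree with the absolute values of the hand-computed eigenvalues. For B, the singular values are the figures to compare an SVD result against, not the eigenvalues.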
