mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: SSVD Wrong Singular Vectors
Date Sat, 01 Sep 2012 08:02:16 GMT
Oho...

If the uniform random values have a non-zero mean, then this could be a
significant effect that leads to some loss of significance in the results.
For small matrices the resulting difference shouldn't be huge, but it might
well be observable.
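
One way to see why a non-zero mean could matter is sketched below. It assumes, purely for illustration, that the biased case draws entries uniformly from [0, 1) and the centered case from [-1, 1]; neither range is confirmed by this thread as what the MR code uses. The helper name `cosine_with_ones` is hypothetical. The sketch measures how strongly a single random column lines up with the all-ones direction, which every biased column would share.

```python
import math
import random


def cosine_with_ones(column):
    """Cosine of the angle between `column` and the all-ones vector."""
    n = len(column)
    dot = sum(column)
    norm = math.sqrt(sum(x * x for x in column))
    return dot / (norm * math.sqrt(n))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Assumption: "uniform with non-zero mean" modeled as uniform on [0, 1).
biased = [rng.random() for _ in range(n)]
# Assumption: zero-mean alternative modeled as uniform on [-1, 1].
centered = [rng.uniform(-1.0, 1.0) for _ in range(n)]

cos_biased = cosine_with_ones(biased)
cos_centered = cosine_with_ones(centered)

print(f"biased column vs ones:   {cos_biased:.3f}")
print(f"centered column vs ones: {cos_centered:.3f}")
```

In expectation the biased cosine is near sqrt(3)/2 (about 0.866), while the centered one is near zero. Under these assumptions, all columns of a biased random matrix share a large common component, so the projection spends part of its k+p columns on one direction.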

On Sat, Sep 1, 2012 at 3:45 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> sorry, i meant  "random trinary"
>
> On Sat, Sep 1, 2012 at 12:39 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> > Hm. There is a slight error between R's full-rank SVD and Mahout's MR
> > SSVD for my unit test, modified for a 100x100 matrix with k=3, p=10.
> >
> > First left vector (R/SSVD) :
> >> s$u[,1]
> >   [1] -0.050741660 -0.083985411  0.078767108 -0.044487425 -0.010380367
> >   [6]  0.069635451  0.158337400  0.029102044 -0.168156173 -0.127921554
> >  [11]  0.012698809 -0.027140724  0.069357925 -0.015605283  0.076614201
> >  [16] -0.158582188  0.143656275  0.033886221 -0.055111330 -0.029299261
> >  [21]  0.059667350  0.039205405  0.042027376  0.048541162  0.158267382
> >  [26] -0.045441433  0.044529295 -0.038681358 -0.024035611 -0.054543123
> >  [31]  0.027365365 -0.054029635 -0.021845631  0.053124795  0.050475680
> >  [36] -0.093776477  0.094699229 -0.030911885 -0.169810667  0.149075410
> >  [41]  0.102150407  0.165651229  0.175798233 -0.048390507  0.175243690
> >  [46] -0.170793896  0.059918820 -0.132466003 -0.131783388 -0.178422266
> >  [51]  0.079304233 -0.054428953  0.057820900  0.120791505  0.095287617
> >  [56]  0.036671894 -0.081203386  0.153768112  0.014849405  0.027470798
> >  [61] -0.064944829 -0.007538214  0.069034637 -0.133978151 -0.022290433
> >  [66] -0.038094067  0.168947231 -0.100797474 -0.054253041 -0.040255069
> >  [71]  0.124817481 -0.059689202  0.018821181 -0.131237426 -0.141223359
> >  [76]  0.128026731 -0.170388319  0.080445852  0.071966615 -0.029745918
> >  [81]  0.049479520 -0.121362268 -0.077338205 -0.061950828 -0.168851635
> >  [86] -0.073192796  0.087453086 -0.085166577  0.160026655 -0.060816556
> >  [91]  0.015420973  0.117780809  0.083415819 -0.160806975  0.171932591
> >  [96]  0.170064367  0.001479280 -0.161878123  0.129685305 -0.104231610
> >> U[,1]
> >            1            2            3            4            5            6
> >  0.050741634  0.083985464 -0.078767344  0.044487660  0.010380470 -0.069635561
> >            7            8            9           10           11           12
> > -0.158337117 -0.029102012  0.168156073  0.127921760 -0.012698756  0.027140487
> >           13           14           15           16           17           18
> > -0.069358074  0.015605295 -0.076614050  0.158582091 -0.143656127 -0.033886485
> >           19           20           21           22           23           24
> >  0.055111560  0.029299084 -0.059667201 -0.039205182 -0.042027356 -0.048541087
> >           25           26           27           28           29           30
> > -0.158267335  0.045441521 -0.044529241  0.038681577  0.024035604  0.054543106
> >           31           32           33           34           35           36
> > -0.027365256  0.054029674  0.021845620 -0.053124833 -0.050475677  0.093776656
> >           37           38           39           40           41           42
> > -0.094699463  0.030911730  0.169810791 -0.149075076 -0.102150266 -0.165651017
> >           43           44           45           46           47           48
> > -0.175798375  0.048390265 -0.175243708  0.170793758 -0.059918703  0.132465938
> >           49           50           51           52           53           54
> >  0.131783579  0.178422152 -0.079304282  0.054428751 -0.057820999 -0.120791565
> >           55           56           57           58           59           60
> > -0.095287586 -0.036671995  0.081203324 -0.153767938 -0.014849361 -0.027471027
> >           61           62           63           64           65           66
> >  0.064944979  0.007538413 -0.069034788  0.133978044  0.022290513  0.038094051
> >           67           68           69           70           71           72
> > -0.168947352  0.100797649  0.054253165  0.040255237 -0.124817480  0.059689502
> >           73           74           75           76           77           78
> > -0.018821295  0.131237429  0.141223597 -0.128027116  0.170388135 -0.080445760
> >           79           80           81           82           83           84
> > -0.071966482  0.029745819 -0.049479559  0.121362303  0.077338278  0.061950724
> >           85           86           87           88           89           90
> >  0.168851648  0.073193002 -0.087453189  0.085166809 -0.160026464  0.060816590
> >           91           92           93           94           95           96
> > -0.015421147 -0.117780975 -0.083415727  0.160806958 -0.171932343 -0.170064514
> >           97           98           99          100
> > -0.001479434  0.161878089 -0.129685379  0.104231530
> >
> > Same thing for the right singular vectors. The only thing is that they
> > seem to change the sign between R and Mahout's version but otherwise
> > they fit more or less exactly.
> >
> > So yeah i am seeing some stochastic effects in these for k and p being
> > so low -- so are you saying your errors are greater than those? I did
> > not test sequential version with similar parameters.
> >
> > One significant difference between the MR and sequential versions is
> > that the sequential version uses a ternary random matrix (instead of a
> > uniform one); perhaps that affects accuracy a little bit.
> >
> > On Fri, Aug 31, 2012 at 10:55 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> >> Can you provide your test code?
> >>
> >> What difference did you observe?
> >>
> >> Did you account for the fact that your matrix is small enough that it
> >> probably wasn't divided correctly?
> >>
> >> On Sat, Sep 1, 2012 at 1:27 AM, Ahmed Elgohary <aagohary@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I used Mahout's stochastic SVD implementation to find the singular
> >>> values and the singular vectors of a small 99x100 matrix. Then I
> >>> compared the results to the singular values and the singular vectors
> >>> obtained using the svd function in MATLAB and the single-threaded
> >>> version of the SSVD. I got pretty much the same singular values from
> >>> all 3 implementations. However, the singular vectors from Mahout's
> >>> SSVD were significantly different. I tried multiple values for the
> >>> parameters P and Q, but that does not seem to solve the problem. Does
> >>> the MR implementation of the SSVD do extra approximations over the
> >>> single-threaded SSVD, so that their results might not be the same? Any
> >>> advice on how I can tune Mahout's SSVD to get the same singular vectors
> >>> as the single-threaded SSVD?
> >>>
> >>> thanks,
> >>>
> >>> --ahmed
> >>>
>
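
The sign flip between the R and Mahout vectors noted in the quoted message is expected: singular vectors are only determined up to sign, so a comparison should align signs first. A minimal sketch of that comparison follows, using the first five entries of each vector as printed in the thread. The helper name `align_sign` is hypothetical and is not part of R or Mahout.

```python
def align_sign(reference, candidate):
    """Return `candidate`, negated if it points opposite to `reference`."""
    dot = sum(r * c for r, c in zip(reference, candidate))
    sign = -1.0 if dot < 0 else 1.0
    return [sign * c for c in candidate]


# First five entries of the first left singular vector, copied from the thread.
r_u1 = [-0.050741660, -0.083985411, 0.078767108, -0.044487425, -0.010380367]
mahout_u1 = [0.050741634, 0.083985464, -0.078767344, 0.044487660, 0.010380470]

aligned = align_sign(r_u1, mahout_u1)
max_abs_diff = max(abs(r - a) for r, a in zip(r_u1, aligned))

print(f"largest entrywise difference after sign alignment: {max_abs_diff:.2e}")
```

After alignment, these five entries agree to better than 1e-6, which matches the observation that the vectors "fit more or less exactly" apart from sign.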
