mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: SSVD Wrong Singular Vectors
Date Sat, 01 Sep 2012 14:52:59 GMT
No, it's zero-mean uniform of course. A murmur hash scaled to the -1...1 range.

I used to use normal too, but you advised there was not much difference, and
I actually did not see much either.
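
For illustration, here is a minimal NumPy sketch of the basic randomized
projection scheme with a zero-mean uniform test matrix on [-1, 1). It is not
Mahout's code: the function name `randomized_svd_sketch`, the use of NumPy's
random generator in place of a scaled hash, and the synthetic input are all
assumptions made for the example.

```python
import numpy as np

def randomized_svd_sketch(a, k, p, rng):
    """Basic randomized SVD: project, orthonormalize, decompose the small matrix.

    Hypothetical helper for illustration only, not Mahout's SSVD.
    """
    n = a.shape[1]
    # Zero-mean uniform test matrix on [-1, 1); a stand-in for hashed values.
    omega = rng.uniform(-1.0, 1.0, size=(n, k + p))
    y = a @ omega                       # sample the range of A
    q, _ = np.linalg.qr(y)              # orthonormal basis, m x (k + p)
    b = q.T @ a                         # small (k + p) x n matrix
    u_b, s, vt = np.linalg.svd(b, full_matrices=False)
    return (q @ u_b)[:, :k], s[:k], vt[:k].T

rng = np.random.default_rng(0)
m, n = 100, 100

# Synthetic input (assumed): three dominant singular values, tiny flat tail.
spectrum = np.full(n, 1e-6)
spectrum[:3] = [10.0, 5.0, 2.0]
u0, _ = np.linalg.qr(rng.standard_normal((m, n)))
v0, _ = np.linalg.qr(rng.standard_normal((n, n)))
a = (u0 * spectrum) @ v0.T

u, s, v = randomized_svd_sketch(a, k=3, p=10, rng=rng)
s_full = np.linalg.svd(a, compute_uv=False)[:3]
```

Replacing `rng.uniform(-1.0, 1.0, ...)` with `rng.standard_normal(...)` in the
sketch is one way to compare the two distributions on the same input.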

I also think that in this case, moving the input to R via decimals
actually created precision errors too. I will double check. And my
synthetic test input has a flat tail in the lower singular values, which of
course messes up some singular vectors in the tail but doesn't affect
singular values. I will check for these things and look again. But I don't
see a fundamental problem with the results; they are the same down to the
eighth digit after the dot.
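
Since singular vectors are only defined up to sign, a comparison between two
implementations has to align signs first. A minimal sketch, assuming NumPy; the
helper names `align_signs` and `max_abs_diff_up_to_sign` are made up for this
example, and the data are the first five entries of `s$u[,1]` and `U[,1]` from
the quoted output below.

```python
import numpy as np

def align_signs(reference, candidate):
    """Flip columns of `candidate` so each has a non-negative dot product
    with the matching column of `reference`."""
    signs = np.sign(np.sum(reference * candidate, axis=0))
    signs[signs == 0] = 1.0             # leave orthogonal columns untouched
    return candidate * signs

def max_abs_diff_up_to_sign(reference, candidate):
    """Largest entrywise difference after sign alignment."""
    return float(np.max(np.abs(reference - align_signs(reference, candidate))))

# First five entries of the first left singular vector, copied from the thread.
r_u1 = np.array([-0.050741660, -0.083985411, 0.078767108,
                 -0.044487425, -0.010380367])
ssvd_u1 = np.array([0.050741634, 0.083985464, -0.078767344,
                    0.044487660, 0.010380470])

# Naive comparison is dominated by the sign flip.
naive = float(np.max(np.abs(r_u1 - ssvd_u1)))
# Sign-aligned comparison shows the real discrepancy.
aligned = max_abs_diff_up_to_sign(r_u1[:, None], ssvd_u1[:, None])
```

On these five entries the naive difference is large only because of the sign
flip, while the aligned difference is a few times 1e-7.
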
 On Sep 1, 2012 1:03 AM, "Ted Dunning" <ted.dunning@gmail.com> wrote:

> Oho...
>
> If the uniform randoms have non-zero means, then this could be a
> significant effect that leads to some loss of significance in the results.
>  For small matrices the resulting difference shouldn't be huge but it might
> well be observable.
>
> On Sat, Sep 1, 2012 at 3:45 AM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> wrote:
>
> > sorry, i meant  "random trinary"
> >
> > On Sat, Sep 1, 2012 at 12:39 AM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> > wrote:
> > > Hm. there is slight error between R full rank SVD and Mahout MR SSVD
> > > for my unit test modified for 100x100 k= 3 p=10.
> > >
> > > First left vector (R/SSVD) :
> > >> s$u[,1]
> > >   [1] -0.050741660 -0.083985411  0.078767108 -0.044487425 -0.010380367
> > >   [6]  0.069635451  0.158337400  0.029102044 -0.168156173 -0.127921554
> > >  [11]  0.012698809 -0.027140724  0.069357925 -0.015605283  0.076614201
> > >  [16] -0.158582188  0.143656275  0.033886221 -0.055111330 -0.029299261
> > >  [21]  0.059667350  0.039205405  0.042027376  0.048541162  0.158267382
> > >  [26] -0.045441433  0.044529295 -0.038681358 -0.024035611 -0.054543123
> > >  [31]  0.027365365 -0.054029635 -0.021845631  0.053124795  0.050475680
> > >  [36] -0.093776477  0.094699229 -0.030911885 -0.169810667  0.149075410
> > >  [41]  0.102150407  0.165651229  0.175798233 -0.048390507  0.175243690
> > >  [46] -0.170793896  0.059918820 -0.132466003 -0.131783388 -0.178422266
> > >  [51]  0.079304233 -0.054428953  0.057820900  0.120791505  0.095287617
> > >  [56]  0.036671894 -0.081203386  0.153768112  0.014849405  0.027470798
> > >  [61] -0.064944829 -0.007538214  0.069034637 -0.133978151 -0.022290433
> > >  [66] -0.038094067  0.168947231 -0.100797474 -0.054253041 -0.040255069
> > >  [71]  0.124817481 -0.059689202  0.018821181 -0.131237426 -0.141223359
> > >  [76]  0.128026731 -0.170388319  0.080445852  0.071966615 -0.029745918
> > >  [81]  0.049479520 -0.121362268 -0.077338205 -0.061950828 -0.168851635
> > >  [86] -0.073192796  0.087453086 -0.085166577  0.160026655 -0.060816556
> > >  [91]  0.015420973  0.117780809  0.083415819 -0.160806975  0.171932591
> > >  [96]  0.170064367  0.001479280 -0.161878123  0.129685305 -0.104231610
> > >> U[,1]
> > >            1            2            3            4            5            6
> > >  0.050741634  0.083985464 -0.078767344  0.044487660  0.010380470 -0.069635561
> > >            7            8            9           10           11           12
> > > -0.158337117 -0.029102012  0.168156073  0.127921760 -0.012698756  0.027140487
> > >           13           14           15           16           17           18
> > > -0.069358074  0.015605295 -0.076614050  0.158582091 -0.143656127 -0.033886485
> > >           19           20           21           22           23           24
> > >  0.055111560  0.029299084 -0.059667201 -0.039205182 -0.042027356 -0.048541087
> > >           25           26           27           28           29           30
> > > -0.158267335  0.045441521 -0.044529241  0.038681577  0.024035604  0.054543106
> > >           31           32           33           34           35           36
> > > -0.027365256  0.054029674  0.021845620 -0.053124833 -0.050475677  0.093776656
> > >           37           38           39           40           41           42
> > > -0.094699463  0.030911730  0.169810791 -0.149075076 -0.102150266 -0.165651017
> > >           43           44           45           46           47           48
> > > -0.175798375  0.048390265 -0.175243708  0.170793758 -0.059918703  0.132465938
> > >           49           50           51           52           53           54
> > >  0.131783579  0.178422152 -0.079304282  0.054428751 -0.057820999 -0.120791565
> > >           55           56           57           58           59           60
> > > -0.095287586 -0.036671995  0.081203324 -0.153767938 -0.014849361 -0.027471027
> > >           61           62           63           64           65           66
> > >  0.064944979  0.007538413 -0.069034788  0.133978044  0.022290513  0.038094051
> > >           67           68           69           70           71           72
> > > -0.168947352  0.100797649  0.054253165  0.040255237 -0.124817480  0.059689502
> > >           73           74           75           76           77           78
> > > -0.018821295  0.131237429  0.141223597 -0.128027116  0.170388135 -0.080445760
> > >           79           80           81           82           83           84
> > > -0.071966482  0.029745819 -0.049479559  0.121362303  0.077338278  0.061950724
> > >           85           86           87           88           89           90
> > >  0.168851648  0.073193002 -0.087453189  0.085166809 -0.160026464  0.060816590
> > >           91           92           93           94           95           96
> > > -0.015421147 -0.117780975 -0.083415727  0.160806958 -0.171932343 -0.170064514
> > >           97           98           99          100
> > > -0.001479434  0.161878089 -0.129685379  0.104231530
> > >
> > > Same thing for the right singular vectors. The only thing is that they
> > > seem to change the sign between R and Mahout's version but otherwise
> > > they fit more or less exactly.
> > >
> > > So yeah i am seeing some stochastic effects in these for k and p being
> > > so low -- so are you saying your errors are greater than those? I did
> > > not test sequential version with similar parameters.
> > >
> > > One significant difference between MR and sequential version is that
> > > sequential version is using ternary random matrix (instead of uniform
> > > one), perhaps that may affect accuracy a little bit.
> > >
> > > > On Fri, Aug 31, 2012 at 10:55 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > >> Can you provide your test code?
> > >>
> > >> What difference did you observe?
> > >>
> > >> Did you account for the fact that your matrix is small enough that it
> > >> probably wasn't divided correctly?
> > >>
> > > >> On Sat, Sep 1, 2012 at 1:27 AM, Ahmed Elgohary <aagohary@gmail.com> wrote:
> > >>
> > >>> Hi,
> > >>>
> > > >>> I used mahout's stochastic svd implementation to find the singular
> > > >>> values and the singular vectors of a small matrix 99x100. Then, I
> > > >>> compared the results to the singular values and the singular vectors
> > > >>> obtained using the svd function in matlab and the single threaded
> > > >>> version of the ssvd. I got pretty much the same singular values using
> > > >>> the 3 implementations. However, the singular vectors of mahout's ssvd
> > > >>> were significantly different. I tried multiple values for the
> > > >>> parameters P and Q, but that does not seem to solve the problem. Does
> > > >>> the MR implementation of the SSVD do extra approximations over the
> > > >>> single threaded ssvd, so that their results might not be the same? Any
> > > >>> advice on how I can tune mahout's ssvd to get the same singular vectors
> > > >>> as the single threaded ssvd?
> > >>>
> > >>> thanks,
> > >>>
> > >>> --ahmed
> > >>>
> >
>
