mahout-user mailing list archives

From Koobas <koo...@gmail.com>
Subject Re: Consistent repeatable results for distributed ALS-WR recommender
Date Mon, 24 Jun 2013 20:07:31 GMT
I am guessing (comments welcome) that it is going to be difficult
to guarantee reproducibility under parallel execution.
MapReduce has "reduction" in its name, and reduction operations are the
main cause of irreproducibility in parallel codes: floating-point addition
is not associative, so changing the order of summation changes how
roundoff errors accumulate.
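
A minimal sketch of the effect, in plain Java with no Hadoop or Mahout involved. The class and method names are made up for illustration; the two methods stand in for two reducers that receive the same values in a different order.

```java
public class SummationOrder {
    // Sum the values front to back, as one reducer might receive them.
    static double sumForward(double[] xs) {
        double total = 0.0;
        for (int i = 0; i < xs.length; i++) {
            total += xs[i];
        }
        return total;
    }

    // Sum the same values back to front, as another run might deliver them.
    static double sumBackward(double[] xs) {
        double total = 0.0;
        for (int i = xs.length - 1; i >= 0; i--) {
            total += xs[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // The exact sum is 2.0, but a 1.0 added to a value near 1e16 is
        // lost to rounding, so each order loses a different amount.
        double[] xs = {1e16, 1.0, -1e16, 1.0};
        System.out.println("forward:  " + sumForward(xs));
        System.out.println("backward: " + sumBackward(xs));
    }
}
```

The two results differ from each other (and from the exact sum) even though the inputs are identical. In a real job the differences are usually far smaller, but they can still grow over many ALS iterations.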


On Mon, Jun 24, 2013 at 3:46 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> See org.apache.mahout.common.RandomUtils#useTestSeed
>
> It provides the ability to freeze the initial seed.  Normally this is only
> used during testing, but you could use it.
>
>
> On Mon, Jun 24, 2013 at 8:44 PM, Michael Kazekin <kazmikh@hotmail.com> wrote:
>
> > Thanks a lot!
> > Do you know by any chance what the underlying reasons are for making
> > such random seed initialization mandatory?
> > Do you see any sense in providing another option, such as filling the
> > matrices with zeroes, to ensure consistency and repeatability? (For
> > example, we might want to track and compare the generated recommendation
> > lists for different parameters, such as the number of features or the
> > number of iterations.)
> > M.
> >
> >
> > > Date: Mon, 24 Jun 2013 19:51:44 +0200
> > > Subject: Re: Consistent repeatable results for distributed ALS-WR recommender
> > > From: ssc@apache.org
> > > To: user@mahout.apache.org
> > >
> > > The matrices of the factorization are initialized randomly. If you fix
> > > the random seed (this would require modifying the code), you should get
> > > exactly the same results.
> > > On 24.06.2013 at 13:49, "Michael Kazekin" <kazmikh@hotmail.com> wrote:
> > >
> > > > Hi!
> > > > Should I assume that, given the same dataset and the same parameters
> > > > for the factorizer and recommender, I will get the same results for
> > > > any specific user?
> > > > My current understanding is that the ALS-WR algorithm could
> > > > theoretically guarantee this, but I was wondering whether there could
> > > > be any numerical issues and/or implementation-specific concerns.
> > > > I would appreciate any insight on this issue.
> > > > Mike.
> > > >
> > > >
> > > >
> >
>
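
For reference, the seed-fixing idea from the quoted messages can be sketched with plain java.util.Random. This is not Mahout's initialization code: the class, the helper randomFactors, and its parameters are hypothetical stand-ins. In Mahout itself the hook mentioned above is org.apache.mahout.common.RandomUtils#useTestSeed.

```java
import java.util.Arrays;
import java.util.Random;

public class SeededInit {
    // Hypothetical helper: fill a numRows x numFeatures factor matrix
    // with pseudo-random values drawn from a generator with a fixed seed.
    static double[][] randomFactors(int numRows, int numFeatures, long seed) {
        Random rng = new Random(seed);
        double[][] factors = new double[numRows][numFeatures];
        for (double[] row : factors) {
            for (int j = 0; j < numFeatures; j++) {
                row[j] = rng.nextDouble();
            }
        }
        return factors;
    }

    public static void main(String[] args) {
        double[][] a = randomFactors(3, 2, 42L);
        double[][] b = randomFactors(3, 2, 42L);
        double[][] c = randomFactors(3, 2, 43L);
        // Same seed: identical starting matrices.
        System.out.println("same seed equal:      " + Arrays.deepEquals(a, b));
        // Different seed: different starting matrices.
        System.out.println("different seed equal: " + Arrays.deepEquals(a, c));
    }
}
```

A fixed seed makes the starting point repeatable. It does not by itself fix the order in which a distributed job sums its partial results, so it addresses only the first of the two sources of variation discussed in this thread.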
