mahout-user mailing list archives

From Koobas <koo...@gmail.com>
Subject Re: Consistent repeatable results for distributed ALS-WR recommender
Date Mon, 24 Jun 2013 22:46:07 GMT
Well, you know, the issue is there, whether we like it or not.
Maybe replication is enough, maybe not.
If there is a workshop on that issue, it's on the radar.
http://beamtenherrschaft.blogspot.com/2013/06/acm-recsys-2013-workshop-on.html


On Mon, Jun 24, 2013 at 6:36 PM, Sean Owen <srowen@gmail.com> wrote:

> Yeah this has gone well off-road.
>
> ALS is not non-deterministic because of hardware errors or cosmic
> rays. It also has nothing to do with floating-point round-off, or
> at least round-off is smaller than the primary source of
> non-determinism by several orders of magnitude.
>
> ALS starts from a random solution, so each run can end at a different
> solution. The overall problem is non-convex and the process will not
> necessarily converge to the same solution.
>
> Randomness is a common feature of machine learning: centroid selection
> in k-means, the 'stochastic' in SGD, random forests, etc. I don't
> think the question is why randomness is useful, right?
>
> For ALS... I don't quite understand the question. What's the
> alternative? I have always seen it formulated in terms of a
> random initial solution. You don't want to always start from the same
> point because of local minima. Ideally you start from many points and
> take the best solution.
>
> On Mon, Jun 24, 2013 at 11:22 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > This is an old chestnut that gets trotted out often, but I doubt that
> > the effects that the OP was worried about were on the same scale.
> > Non-associativity of FP arithmetic on doubles rarely has a very large
> > effect.
> >
> >
> > On Mon, Jun 24, 2013 at 11:17 PM, Michael Kazekin <kazmikh@hotmail.com> wrote:
> >
> >> Any algorithm is non-deterministic because of non-deterministic behavior
> >> of the underlying hardware, of course :) But that's off-topic. I'm talking
> >> about a specific implementation of a specific algorithm, and in general I'd
> >> like to know that at least some very general properties of the algorithm
> >> implementation are preserved (and why the authors added an intentionally
> >> non-deterministic component to the implementation).
>
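
For anyone following the thread, Sean's suggestion (start from several random
points, take the best) and the original ask (repeatable results) can both be
sketched in a few lines. This is only a toy illustration in plain Python, not
Mahout's code: the names `als`, `update`, `solve` and `RATINGS`, the tiny data
set, and the choice of training squared error as the "best" criterion are all
assumptions made for the example. The regularization term is weighted by the
number of ratings per row, as in ALS-WR.

```python
import random


def solve(a, b):
    """Solve the small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update(fixed, obs_by_row, lam, k, current):
    """One ALS half-step: recompute every row factor with the other side held fixed."""
    out = []
    for row, obs in enumerate(obs_by_row):
        if not obs:  # no ratings for this row: keep its current factor
            out.append(current[row])
            continue
        # Normal equations with the regularizer scaled by the rating count.
        a = [[lam * len(obs) if i == j else 0.0 for j in range(k)] for i in range(k)]
        b = [0.0] * k
        for col, rating in obs:
            f = fixed[col]
            for i in range(k):
                b[i] += rating * f[i]
                for j in range(k):
                    a[i][j] += f[i] * f[j]
        out.append(solve(a, b))
    return out


def als(ratings, n_users, n_items, k=2, lam=0.05, iters=20, seed=0):
    """Run ALS from a seeded random start; return (training error, users, items)."""
    rng = random.Random(seed)  # private generator: same seed, same start
    users = [[rng.random() for _ in range(k)] for _ in range(n_users)]
    items = [[rng.random() for _ in range(k)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(iters):
        users = update(items, by_user, lam, k, users)
        items = update(users, by_item, lam, k, items)
    loss = sum((r - sum(x * y for x, y in zip(users[u], items[i]))) ** 2
               for u, i, r in ratings)
    return loss, users, items


# Hypothetical (user, item, rating) triples, only for illustration.
RATINGS = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0),
           (2, 1, 2.0), (2, 2, 5.0), (3, 0, 1.0), (3, 2, 4.0)]

# Several random starts, each one reproducible through its seed; keep the best.
runs = [als(RATINGS, 4, 3, seed=s) for s in range(5)]
best_loss, best_users, best_items = min(runs, key=lambda run: run[0])
```

Fixing the seed makes a single run repeatable in this single-process sketch;
varying it across restarts keeps the protection against poor local minima. In
a distributed job the order of partial sums can also vary between runs, which
this sketch does not model.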
