mahout-user mailing list archives

From Sean Owen <>
Subject Re: ALS, weighed vs. non-weighed regularization paper
Date Mon, 16 Jun 2014 20:28:56 GMT
Yeah I've turned that over in my head. I am not sure I have a great
answer. But I interpret the net effect to be that the model prefers
simple explanations for active users, at the cost of more error in the
approximation. One would rather pick a basis that more naturally
explains the data observed in active users. I think I can see that
this could be a useful assumption -- these users are less extremely

On Mon, Jun 16, 2014 at 8:50 PM, Dmitriy Lyubimov <> wrote:
> Probably a question for Sebastian.
> As we know, the two papers (Hu-Koren-Volinsky and Zhou et al.) use slightly
> different loss functions.
> Zhou et al. are fairly unique in that they additionally multiply the norm of
> the U, V vectors by the number of observed interactions.
> The paper doesn't explain why it works except saying along the lines of "we
> tried several regularization matrices, and this one worked better in our
> case".
> I tried to figure out why that is, and I am still not sure why it would be
> better. So basically we say that, by giving smaller observation sets
> smaller regularization values, it is OK for smaller observation sets to
> overfit slightly more than larger observation sets.
> This seems counterintuitive. Intuition tells us that smaller sets would
> actually tend to overfit more, not less, and therefore might need a larger
> regularization rate, not a smaller one. Sebastian, what's your
> take on weighting regularization in ALS-WR?
> thanks.
> -d
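
For reference, here is a minimal sketch of the two penalties being compared, for a single user's least-squares update. It assumes the weighted form is lam * n_u * ||u||^2, with n_u the number of observed interactions, as the quoted message describes, and the non-weighted form is lam * ||u||^2. The function name, the random data, and the values of k and lam are illustrative only and do not come from either paper or from Mahout.

```python
import numpy as np

def solve_user_factor(item_factors, ratings, lam, weighted):
    """Ridge solve for one user's factor vector in a single ALS step.

    item_factors: (n, k) factors of the n items this user interacted with
    ratings:      (n,) the observed values for those items
    weighted:     True  -> penalty lam * n * ||u||^2 (weighted, as in the quote)
                  False -> penalty lam * ||u||^2     (plain)
    """
    n, k = item_factors.shape
    reg = lam * n if weighted else lam
    gram = item_factors.T @ item_factors + reg * np.eye(k)
    return np.linalg.solve(gram, item_factors.T @ ratings)

# Hypothetical data: one sparse user (2 observations), one active user (200).
rng = np.random.default_rng(0)
k, lam = 5, 0.1
norms = {}
for n in (2, 200):
    V = rng.normal(size=(n, k))   # stand-in item factors
    r = rng.normal(size=n)        # stand-in observed ratings
    u_plain = solve_user_factor(V, r, lam, weighted=False)
    u_wr = solve_user_factor(V, r, lam, weighted=True)
    norms[n] = (np.linalg.norm(u_plain), np.linalg.norm(u_wr))
    print(n, norms[n])
```

Under these assumptions the weighted penalty is n times the plain one for the same lam, so the weighted solution never has a larger norm than the plain one, and the absolute penalty grows with the user's activity.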
