spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: implicit ALS dataSet
Date Thu, 05 Jun 2014 21:47:57 GMT
On Thu, Jun 5, 2014 at 10:38 PM, redocpot <julien19890118@gmail.com> wrote:
> can be simplified by taking advantage of its algebraic structure, so
> negative observations are not needed. This is what I think at the first time
> I read the paper.

Correct, a big part of the reason it is efficient is the sparsity of
the input.
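
To make the sparsity point concrete, here is a minimal sketch of the algebraic identity I believe the paper relies on: Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y, where C^u - I is non-zero only for the items a user actually interacted with. The function names, the list-of-rows layout, and the default alpha are my own assumptions for illustration, not MLlib's API.

```python
def gram(Y):
    """Y^T Y for an (n_items x f) factor matrix given as a list of rows."""
    f = len(Y[0])
    return [[sum(y[a] * y[b] for y in Y) for b in range(f)] for a in range(f)]


def weighted_gram_sparse(Y, YtY, observed, alpha=40.0):
    """Y^T C^u Y computed by touching only the user's observed items.

    observed maps item index -> raw count r_ui > 0 (hypothetical layout),
    and the confidence is assumed to be c_ui = 1 + alpha * r_ui.
    """
    f = len(Y[0])
    A = [row[:] for row in YtY]            # start from the shared Y^T Y
    for i, r in observed.items():          # only non-zero cells are visited
        for a in range(f):
            for b in range(f):
                A[a][b] += alpha * r * Y[i][a] * Y[i][b]
    return A


def weighted_gram_dense(Y, r_row, alpha=40.0):
    """Same quantity, computed naively over every item, zeros included."""
    f = len(Y[0])
    return [[sum((1.0 + alpha * r) * y[a] * y[b] for y, r in zip(Y, r_row))
             for b in range(f)] for a in range(f)]
```

Y^T Y is computed once and shared by all users, so the per-user work scales with the number of observed items rather than with the total number of items.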

> What makes me confused is, after that, the paper (in Discussion section)
> says
>
> "Unlike explicit datasets, here *the model should take all user-item
> preferences as an input, including those which are not related to any input

It is not saying that these non-observations (I would not call them
negative) should explicitly appear in the input. But their implicit
existence can and should be used in the math.

In particular, the loss function being minimized also penalizes error
in the implicit "0" cells of the input, just with much less weight.
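
As a rough sketch of what that means, assuming the paper's formulation with preference p_ui = 1 if r_ui > 0 else 0 and confidence c_ui = 1 + alpha * r_ui, the loss sums over every cell, and the unobserved ones simply get weight 1. The function name, the dense list-of-lists input, and the default alpha and lam values are assumptions for illustration only; a real implementation would never materialize the zeros like this.

```python
def implicit_loss(R, X, Y, alpha=40.0, lam=0.1):
    """Weighted squared error over ALL cells of R, observed or not.

    R: dense user x item matrix of raw counts (0 means no observation).
    X, Y: user and item factor vectors, as lists of rows.
    """
    total = 0.0
    for u, row in enumerate(R):
        for i, r in enumerate(row):
            p = 1.0 if r > 0 else 0.0       # binary preference
            c = 1.0 + alpha * r             # confidence; 1 for implicit "0" cells
            pred = sum(xf * yf for xf, yf in zip(X[u], Y[i]))
            total += c * (p - pred) ** 2
    reg = sum(v * v for vec in X + Y for v in vec)  # L2 penalty on all factors
    return total + lam * reg
```

A "0" cell with a non-zero prediction still adds to the loss, so the model is pushed toward predicting 0 there, but far less strongly than it is pushed toward 1 on a cell with many observations.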
