The confidence here is 1 + alpha * rating (so c1 in the code is
confidence minus 1), so alpha = 1 doesn't particularly mean high
confidence. The loss function is computed over the whole input matrix,
including all missing "0" entries, which get the minimal confidence of
1 under this formula. alpha controls how much more confidence you
place in the entries that do exist in the input. So alpha = 1 is
lowish: it means you don't think the presence of a rating says much
more than its absence.
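
To make the weighting concrete, here's a minimal sketch of the loss a
single cell contributes in the implicit case. The names
implicitCellLoss, userFactor and itemFactor are made up for
illustration, and regularization is omitted; this is not the actual
MLlib code.

// Sketch of the per-cell implicit-feedback loss: every cell of the 0/1
// preference matrix contributes, weighted by its confidence.
def implicitCellLoss(rating: Double, alpha: Double,
                     userFactor: Array[Double], itemFactor: Array[Double]): Double = {
  val p = if (rating > 0) 1.0 else 0.0      // preference: did the interaction happen at all?
  val c = 1.0 + alpha * math.abs(rating)    // confidence: 1 for missing entries, grows with the rating
  val pred = userFactor.zip(itemFactor).map { case (u, i) => u * i }.sum
  c * (p - pred) * (p - pred)               // confidence-weighted squared error on the 0/1 preference
}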
I think the explicit case is similar, but not identical, here. The
more substantial difference between the two is that the cost function
for the explicit case is not the same. There, ratings aren't inputs to
a confidence value that becomes a weight in the loss function during a
factorization of a 0/1 matrix; instead, the rating matrix itself is
what gets factorized directly.
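
For contrast, a sketch of the per-entry loss in the explicit case,
using the same illustrative names as above: only observed ratings
enter the sum, and the rating itself is the regression target rather
than a 0/1 preference.

// Sketch of the per-cell explicit-feedback loss: missing entries
// contribute nothing, and there is no confidence weight.
def explicitCellLoss(rating: Double,
                     userFactor: Array[Double], itemFactor: Array[Double]): Double = {
  val pred = userFactor.zip(itemFactor).map { case (u, i) => u * i }.sum
  (rating - pred) * (rating - pred)         // plain squared error on the observed rating
}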
On Sun, Jul 26, 2015 at 6:45 AM, Debasish Das <debasish.das83@gmail.com> wrote:
> Hi,
>
> Implicit factorization is important for us since it drives recommendation
> when modeling user click/noclick and also topic modeling to handle 0 counts
> in document x word matrices through NMF and Sparse Coding.
>
> I am a bit confused on this code:
>
> val c1 = alpha * math.abs(rating)
> if (rating > 0) ls.add(srcFactor, (c1 + 1.0)/c1, c1)
>
> When alpha = 1.0 (high confidence) and rating > 0 (true for word
> counts), why doesn't this formula become the same as the explicit formula:
>
> ls.add(srcFactor, rating, 1.0)
>
> For modeling document, I believe implicit Y'Y needs to stay but we need
> explicit ls.add(srcFactor, rating, 1.0)
>
> I am understanding confidence code further. Please let me know if the idea
> of mapping implicit to handle 0 counts in document word matrix makes sense.
>
> Thanks.
> Deb
>

