It sounds like you're describing the explicit case, or any matrix
decomposition. Are you sure that's best for count-like data? "It
depends," but my experience is that the implicit formulation is
better. In a way, the difference between 10,000 and 1,000 count is
less significant than the difference between 1 and 10. However if your
loss function penalizes the square of the error, then the former case
not only matters more for the same relative error, it matters about
10^6 times more than the latter (the absolute error is 1,000x larger,
and it is squared). It's very heavily skewed to pay attention to the
high-count instances.
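
To make the skew concrete, here is a rough sketch, not the ALS code itself. The object and function names are made up for illustration, and the 10% relative error and alpha = 1.0 are arbitrary assumptions. It compares a plain squared error on the raw count with a confidence-weighted squared error on a 0/1 preference, using confidence = 1 + alpha * count as discussed below in this thread.

```scala
// Illustrative only: LossSkew, explicitLoss and implicitLoss are hypothetical
// names, not part of Spark's ALS implementation.
object LossSkew {
  // Explicit formulation: squared error on the raw count itself.
  def explicitLoss(count: Double, predicted: Double): Double = {
    val err = count - predicted
    err * err
  }

  // Implicit formulation: the target is a 0/1 preference, and the count only
  // enters through the confidence weight 1 + alpha * count.
  def implicitLoss(count: Double, predictedPref: Double, alpha: Double): Double = {
    val pref = if (count > 0) 1.0 else 0.0
    val confidence = 1.0 + alpha * count
    val err = pref - predictedPref
    confidence * err * err
  }

  def main(args: Array[String]): Unit = {
    val alpha = 1.0 // assumed value, chosen only for the comparison
    for (count <- Seq(10.0, 10000.0)) {
      val e = explicitLoss(count, 0.9 * count) // prediction 10% too low
      val i = implicitLoss(count, 0.9, alpha)  // preference 10% too low
      println(f"count=$count%.0f explicit=$e%.1f implicit=$i%.3f")
    }
  }
}
```

For the same relative error, the explicit loss grows with the square of the count, while the implicit loss grows only linearly, through the confidence weight.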
On Sun, Jul 26, 2015 at 9:19 AM, Debasish Das <debasish.das83@gmail.com> wrote:
> Yeah, I think the idea of confidence is a bit different than what I am
> looking for using implicit factorization to do document clustering.
>
> I basically need (r_ij - w_ih_j)^2 for all observed ratings and (0 -
> w_ih_j)^2 for all the unobserved ratings...Think about the document x word
> matrix where r_ij is the count that's observed, and 0 is the count for words
> that are not in a particular document.
>
> The broadcasted value of gram matrix w_i'w_i or h_j'h_j will also count the
> r_ij that are observed...So I might be fine using the broadcasted gram
> matrix and use the linear term as \sum (r_ij w_i) or \sum (r_ij h_j)...
>
> I will think further but in the current implicit formulation with
> confidence, it looks like I am really factorizing a 0/1 matrix with weights 1 +
> alpha*rating. It's a bit different from the LSA model.
>
> On Sun, Jul 26, 2015 at 12:34 AM, Sean Owen <sowen@cloudera.com> wrote:
>>
>> confidence = 1 + alpha * rating here (so, c1 means confidence - 1),
>> so alpha = 1 doesn't specially mean high confidence. The loss function
>> is computed over the whole input matrix, including all missing "0"
>> entries. These have a minimal confidence of 1 according to this
>> formula. alpha controls how much more confident you are in what the
>> entries that do exist in the input mean. So alpha = 1 is lowish and
>> means you don't think the existence of ratings means a lot more than
>> their absence.
>>
>> I think the explicit case is similar, but not identical here. The
>> cost function for the explicit case is not the same, which is the more
>> substantial difference between the two. There, ratings aren't inputs
>> to a confidence value that becomes a weight in the loss function,
>> during this factorization of a 0/1 matrix. Instead the rating matrix
>> is the thing being factorized directly.
>>
>> On Sun, Jul 26, 2015 at 6:45 AM, Debasish Das <debasish.das83@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > Implicit factorization is important for us since it drives
>> > recommendation
>> > when modeling user click/no-click and also topic modeling to handle 0
>> > counts
>> > in document x word matrices through NMF and Sparse Coding.
>> >
>> > I am a bit confused on this code:
>> >
>> > val c1 = alpha * math.abs(rating)
>> > if (rating > 0) ls.add(srcFactor, (c1 + 1.0)/c1, c1)
>> >
>> > When alpha = 1.0 (high confidence) and rating is > 0 (true for word
>> > counts), why does this formula not become the same as the explicit formula:
>> >
>> > ls.add(srcFactor, rating, 1.0)
>> >
>> > For modeling documents, I believe the implicit Y'Y needs to stay, but we
>> > need the explicit ls.add(srcFactor, rating, 1.0)
>> >
>> > I am trying to understand the confidence code further. Please let me know if the
>> > idea
>> > of mapping implicit to handle 0 counts in document word matrix makes
>> > sense.
>> >
>> > Thanks.
>> > Deb
>> >
>
>

To unsubscribe, email: dev-unsubscribe@spark.apache.org
For additional commands, email: dev-help@spark.apache.org
