mahout-user mailing list archives

From mario.al...@gmail.com
Subject Re: Discrete Rating Scale
Date Mon, 14 Jul 2014 15:52:56 GMT
If you are using the
distributed org.apache.mahout.cf.taste.hadoop.item.RecommenderJob, you
should never use "0". When the co-occurrence matrix is multiplied by the
user's rating vector, a zero rating cancels that item's whole column of the
matrix, which is the same as if the user had never interacted with the item.

For the same reason, "-1" should work, because it actually subtracts score
from any book that is similar to the one with the negative rating.
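
To make the two paragraphs above concrete, here is a minimal sketch of the matrix-times-vector step. It is a simplification and not Mahout's actual code: the co-occurrence counts, the three-book catalogue, and the helper name score_items are all made up for illustration, and any normalization RecommenderJob applies is left out.

```python
# Toy item-item co-occurrence matrix for three books (made-up counts).
# cooccurrence[i][j] = number of users who interacted with both book i and book j.
cooccurrence = [
    [0, 3, 1],
    [3, 0, 2],
    [1, 2, 0],
]


def score_items(matrix, ratings):
    """Multiply the co-occurrence matrix by one user's rating vector."""
    return [
        sum(row[j] * ratings[j] for j in range(len(ratings)))
        for row in matrix
    ]


# The user liked book 2 and never saw books 0 and 1.
never_rated_book_0 = score_items(cooccurrence, [0, 0, 1])

# The user liked book 2 and disliked book 0, with dislike encoded as 0.
# The rating vector is identical to the one above, so the dislike is lost.
dislike_as_zero = score_items(cooccurrence, [0, 0, 1])

# Same user, with dislike encoded as -1.
# Book 1 co-occurs strongly with the disliked book 0, so its score drops.
dislike_as_minus_one = score_items(cooccurrence, [-1, 0, 1])

print(never_rated_book_0)    # scores without any opinion on book 0
print(dislike_as_zero)       # same scores: the 0 contributed nothing
print(dislike_as_minus_one)  # book 1 is now penalized
```

With 0 the dislike is indistinguishable from a missing rating, while with -1 the unseen book 1 loses score because it co-occurs with the disliked book.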

For CosineSimilarity, 0 has to be avoided for obvious reasons (the cosine is
not defined at the origin of the axes), and 1 and 2 are probably the values
I'd go for.
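
A small sketch of why the origin is a problem for the cosine. This is plain Python and not Mahout's similarity implementation; the rating vectors are invented, and returning None for the undefined case is just my choice for the illustration.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two rating vectors, or None if undefined."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A vector of all zeros has no direction, so there is no angle to measure.
        return None
    return dot / (norm_a * norm_b)


# Ratings of two books by the same three users (made-up data).
# Everyone disliked book A; users 1 and 3 liked book B.

# Encoding 0 = dislike, 1 = like: book A collapses onto the origin.
book_a_zero_one = [0, 0, 0]
book_b_zero_one = [1, 0, 1]

# Encoding 1 = dislike, 2 = like: both vectors have a non-zero length.
book_a_one_two = [1, 1, 1]
book_b_one_two = [2, 1, 2]

print(cosine(book_a_zero_one, book_b_zero_one))  # undefined case
print(cosine(book_a_one_two, book_b_one_two))    # a regular similarity value
```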

Tanimoto and LogLikelihood treat preferences as True/False, but False means
"not interacted". Encoding "dislike = False" would be extremely misleading.
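
As a rough illustration of the True/False point, here is a sketch of the Tanimoto coefficient over sets of users. The user IDs, the two books, and the helper names are hypothetical, and this is not how Mahout stores preferences internally.

```python
def users_with_true(events):
    """Keep only the users whose preference is True; False leaves no trace."""
    return {user for user, preference in events.items() if preference}


def tanimoto(users_a, users_b):
    """Size of the intersection divided by size of the union of two user sets."""
    union = users_a | users_b
    if not union:
        return 0.0
    return len(users_a & users_b) / len(union)


# Everyone liked book Y.
book_y = users_with_true({"u1": True, "u2": True, "u3": True})

# Scenario 1: u3 disliked book X, encoded as False.
book_x_disliked = users_with_true({"u1": True, "u2": True, "u3": False})

# Scenario 2: u3 never interacted with book X at all.
book_x_unseen = users_with_true({"u1": True, "u2": True})

# Both scenarios produce the same user set, hence the same similarity.
print(tanimoto(book_x_disliked, book_y))
print(tanimoto(book_x_unseen, book_y))
```

The dislike from u3 and the missing interaction end up as the same input, which is why "dislike = False" is misleading here.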

For all the other algorithms, I'd say one should make similar
considerations.

Cheers
Mario


On Mon, Jul 14, 2014 at 4:21 PM, Floris Devriendt
<florisdevriendt@gmail.com> wrote:

> Hey all,
>
> When using a discrete rating scale (e.g. likes / dislikes), what are the
> things that I should consider when using Mahout for Collaborative
> Filtering?
>
> If I'm not mistaken, I read a mail a week or two ago on this mailing
> list stating that one should avoid using 0 (dislike) and 1 (like) as
> scores, because Mahout would not be able to take the dislikes into account
> properly.
> If this is true, what scores should I give to my like/dislike scale? (e.g.
> is -1/1 better than 0/1, or should I use 1/2 with 1 = dislike and 2 =
> like?)
>
> Best regards,
> Floris Devriendt
>
