mahout-user mailing list archives

From Floris Devriendt
Subject "Binary" Data
Date Thu, 15 May 2014 15:53:51 GMT
Hello everybody,

I'm a new Mahout user and I was hoping some of you could point me in the
right direction.

My data consists of exercise results from different users, and I want to
recommend exercises to users using the collaborative filtering techniques
available in Mahout. The 'items' in my data are the exercises, and the
relation between a user and an item can take one of three values:

   - A user has correctly completed the exercise.
   - A user has incorrectly completed the exercise.
   - A user has not made an attempt at the exercise.

In essence, this is comparable to like/dislike/unknown data.

Now I know more or less how to build a recommender in Mahout, but I'm
having some difficulty designing it. A lot depends on the similarity
measure used, but most similarity measures assume rating-style
preferences (e.g. ratings of movies or music). The exceptions, if I
interpret it correctly, are the Tanimoto coefficient and the
log-likelihood similarity. But those seem to focus on boolean data, where
a user either has a relation with an item or has none.
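
To make the concern concrete, here is a minimal sketch in plain Java. It
does not use Mahout's own classes; the class name, the helper methods and
the sample data are all made up for illustration. It computes a Tanimoto
coefficient twice: once over the exercises each user attempted, and once
over only the exercises each user got right.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: plain Java, not Mahout's similarity implementation.
class TanimotoSketch {

    // Tanimoto (Jaccard) coefficient: size of intersection / size of union.
    // Defined here as 0 when both sets are empty (an assumption of this sketch).
    static double tanimoto(Set<Integer> a, Set<Integer> b) {
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return unionSize == 0 ? 0.0 : (double) intersection.size() / unionSize;
    }

    // Keep only the exercises a user answered correctly (value == true).
    static Set<Integer> correctOnly(Map<Integer, Boolean> results) {
        Set<Integer> correct = new HashSet<>();
        for (Map.Entry<Integer, Boolean> e : results.entrySet()) {
            if (e.getValue()) {
                correct.add(e.getKey());
            }
        }
        return correct;
    }

    public static void main(String[] args) {
        // Hypothetical data: exercise ID -> true (correct) / false (incorrect).
        // An exercise missing from the map means "not attempted".
        Map<Integer, Boolean> userA = Map.of(1, true, 2, true, 3, false);
        Map<Integer, Boolean> userB = Map.of(1, false, 2, false, 3, true);

        // Boolean view: "has the user attempted the exercise at all?"
        System.out.println(tanimoto(userA.keySet(), userB.keySet()));

        // Alternative view: "has the user completed the exercise correctly?"
        System.out.println(tanimoto(correctOnly(userA), correctOnly(userB)));
    }
}
```

With this sample data the attempted-exercise view gives 1.0 (both users
tried the same three exercises), while the correct-only view gives 0.0,
even though the underlying results are the same. A purely boolean
similarity therefore has to pick one of the two meanings and loses the
other.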

What are the key aspects to take into account when working with this kind
of data (with three distinct values)? Does it all depend on the similarity
measure I use? Or are there other aspects I need to consider to make the
recommendations worthwhile for this kind of data?

I also have more questions about some of the similarity measures
implemented in Mahout, but I don't want to ask too much at once. If
somebody can point me in the right direction on the questions above, I
would appreciate it.

Kind regards,
Floris Devriendt
