mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: how to implement item-based recommender on movie genre data?
Date Thu, 10 May 2012 07:51:36 GMT
If you just need a similarity metric, you don't need a recommender;
similarity is only one part of a recommender. If you treat each movie as a
'user' and each genre as an 'item', then you can just use a UserSimilarity
implementation to compute the similarity between any two movies. You don't
need anything more than that.
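
As a rough sketch of what that similarity works out to for 0/1 genre vectors, here is the Tanimoto coefficient (suggested earlier in this thread) written in plain Java. This is not Mahout code: the class and method names (GenreSimilarity, toGenreSet, tanimoto) are made up for illustration, and returning 0 for two movies with no genres at all is an assumption. In Mahout itself, TanimotoCoefficientSimilarity implements UserSimilarity and would be applied to a DataModel in which movie IDs play the role of user IDs and genre indices play the role of item IDs.

```java
import java.util.BitSet;

// Hypothetical helper, not part of Mahout: Tanimoto similarity on binary genre vectors.
final class GenreSimilarity {

    // Converts a 0/1 genre vector into the set of genre indices that are present.
    static BitSet toGenreSet(int[] genreVector) {
        BitSet genres = new BitSet(genreVector.length);
        for (int i = 0; i < genreVector.length; i++) {
            if (genreVector[i] != 0) {
                genres.set(i);
            }
        }
        return genres;
    }

    // Tanimoto coefficient: size of the intersection divided by size of the union.
    static double tanimoto(BitSet a, BitSet b) {
        BitSet intersection = (BitSet) a.clone();
        intersection.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        if (union.isEmpty()) {
            // Assumption: two movies with no genres at all are treated as dissimilar.
            return 0.0;
        }
        return (double) intersection.cardinality() / union.cardinality();
    }

    public static void main(String[] args) {
        int[] movieA = {1, 0, 0, 1, 0}; // the vector from the question
        int[] movieB = {1, 0, 1, 1, 0}; // a second, made-up movie
        // The movies share 2 genres out of 3 distinct genres, so the similarity is 2/3.
        System.out.println(tanimoto(toGenreSet(movieA), toGenreSet(movieB)));
    }
}
```

Only the genres that are present matter here; the shared zeros do not raise the score, which is usually what you want for sparse genre data.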

On Thu, May 10, 2012 at 7:29 AM, Daniel Quach <danquach@cs.ucla.edu> wrote:

> Well, actually, I wanted to represent each movie with a vector
>
> [1, 0, 0, 1, 0]
>
> where each column represents an explicit genre, a 1 indicating that the
> movie has that genre and a 0 indicating that it does not (a crude
> representation, I'm sure).
>
> I wanted to implement an item-based recommender that uses these vectors to
> compute similarity between items.
>
> I think I figured it out: I could represent the vector data as preferences
> where, instead of user IDs, there would be column indices, and then load
> that into a DataModel for use with the ItemSimilarity object. The
> ItemBasedRecommender could load the DataModel with user IDs while using
> this ItemSimilarity object for calculating similarities.
>
> This could be a poor choice from an efficiency, accuracy, and machine
> learning standpoint; I am not an expert on the subject at all.
>
> On May 8, 2012, at 12:58 AM, Sean Owen wrote:
>
> > So you have already decided, for each movie, whether it's in or not in
> > each genre? And then you want to create a "profile" -- assuming you mean
> > some kind of meta-genre?
> >
> > This isn't a recommender problem; it's just a clustering problem. I'd use
> > the Tanimoto similarity.
> > You could run the clustering-based recommender just to build the
> > clusters. You wouldn't use it for recommendations.
> >
> > On Tue, May 8, 2012 at 8:53 AM, Daniel Quach <danquach@cs.ucla.edu> wrote:
> >
> >> Suppose that I want to give each movie a profile based on the genres
> >> each contains.
> >>
> >> For naive and simplistic purposes, let's pretend that each movie has a
> >> vector where each column is a genre, a 1 in that column indicates that
> >> the movie contains that genre, 0 otherwise.
> >>
> >> How would I feed such data into an Item-based Recommender? I want this
> >> recommender to use these vectors for calculating similarity for
> >> recommendations, which in turn is used for preference estimation (just
> >> as described in section 4.4.1 of the Mahout in Action book)
> >>
> >> The example in the book is not immediately clear to me. The sample code
> >> does not mention the format of the data being used in creating the
> >> ItemSimilarity object.
>
>
