mahout-user mailing list archives

From Jonathan Seale <jonat...@samegrain.com>
Subject RowSimilarity
Date Wed, 08 Apr 2015 19:03:56 GMT
Hi all,
 
I'm new to the community and Mahout. Happy to be here. :-)
 
I'm having difficulty with the following problem. I've set up an instance on Amazon
with Mahout and can run some basic machine learning tasks (just testing). Now I'm trying to
do a specific task and am unsure how to proceed.
 
Imagine I have a data file containing the following columns: user_id, item_id, and rating,
where rating is how each user rated the item on a scale of -1 to 1 (the necessity of negative
ratings will become apparent in a minute). Ultimately, what I'm trying to do is create a similarity
matrix that measures the similarity between all pairs of USERS. To do this, I would like to
transform the users' ratings into a matrix (rows are users, columns are items) and then run
RowSimilarity to find the dot product / cosine between all rows.
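
For illustration, the computation described above can be sketched in plain Python. This is not Mahout code; the function names and the sample ratings are hypothetical, and it assumes that an item a user has not rated counts as 0 in that user's row.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_user_rows(triples):
    """Group (user_id, item_id, rating) triples into sparse rows: user -> {item: rating}."""
    rows = defaultdict(dict)
    for user_id, item_id, rating in triples:
        rows[user_id][item_id] = rating
    return dict(rows)


def cosine(a, b):
    """Cosine between two sparse rows; unrated items are treated as 0 (an assumption)."""
    dot = sum(r * b[i] for i, r in a.items() if i in b)
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with no non-zero ratings has no defined direction
    return dot / (norm_a * norm_b)


def user_similarities(triples):
    """Return {(user_a, user_b): cosine} for every unordered pair of users."""
    rows = build_user_rows(triples)
    return {
        (u, v): cosine(rows[u], rows[v])
        for u, v in combinations(sorted(rows), 2)
    }


# Hypothetical sample data on the -1..1 scale described above.
ratings = [
    ("u1", "i1", 1.0), ("u1", "i2", -1.0),
    ("u2", "i1", 1.0), ("u2", "i2", -1.0),
    ("u3", "i1", -1.0), ("u3", "i2", 1.0),
]
sims = user_similarities(ratings)
```

With negative ratings allowed, users who rate the same items in opposite directions get a cosine near -1, while users who agree get a value near 1.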
 
I feel like my problem is simple and has probably been done 1000 times, but I can't seem to
find any documentation directly on the subject. The best I've been able to do so far is use
the similaritem function (where I've swapped item for user). While it works and gives decent
results, it's mathematically not quite what I want. Help!
 
Thanks!
Jonathan



