mahout-user mailing list archives

From Alessandro Suglia <alessandro.sug...@yahoo.com>
Subject Re: Mahout recommendation in implicit feedback situation
Date Mon, 05 May 2014 16:26:08 GMT



-------- Original Message --------
Subject: 	Re: Mahout recommendation in implicit feedback situation
Date: 	Mon, 05 May 2014 18:25:00 +0200
From: 	Alessandro Suglia <alessandro.suglia@yahoo.com>
To: 	Ted Dunning <ted.dunning@gmail.com>



The standard MovieLens 100k comes with five splits, each divided into a 
training and a test set.
For each split I train the recommender on the training set and then 
test it on the test set.
The test phase is conducted with an external tool, but to run it 
correctly I need to predict, for each user, a specific number of items 
(it is a top-n recommendation task) that are present in the test set, 
and then rank them by the predicted value for each one.

Now my problem is that there seems to be no way in Mahout (at least in 
my experience) to get a value between 0 and 1 for a specific item.
Is that true?
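
For reference, here is a minimal sketch of what I'm trying to do (the 
file name, the neighborhood size of 50, and LogLikelihoodSimilarity are 
placeholder choices of mine; the per-user min-max normalization at the 
end is just one way to squeeze the unbounded score into [0, 1]):

import java.io.File;

import org.apache.mahout.cf.taste.impl.model.GenericBooleanPrefDataModel;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class ScoreTestItems {
  public static void main(String[] args) throws Exception {
    // treat the MovieLens ratings as boolean "seen" events: drop the values
    DataModel model = new GenericBooleanPrefDataModel(
        GenericBooleanPrefDataModel.toDataMap(new FileDataModel(new File("u1.base"))));
    UserSimilarity similarity = new LogLikelihoodSimilarity(model);
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(50, similarity, model);
    Recommender recommender =
        new GenericBooleanPrefUserBasedRecommender(model, neighborhood, similarity);

    long userID = 1L;
    long[] testItems = {6L, 10L, 12L};  // item IDs read from u1.test for this user

    // The boolean recommender's estimate is a sum of neighbor similarities,
    // so it is not bounded by 1; min-max normalize per user to get [0, 1].
    float min = Float.POSITIVE_INFINITY;
    float max = Float.NEGATIVE_INFINITY;
    float[] raw = new float[testItems.length];
    for (int i = 0; i < testItems.length; i++) {
      raw[i] = recommender.estimatePreference(userID, testItems[i]);
      if (!Float.isNaN(raw[i])) {
        min = Math.min(min, raw[i]);
        max = Math.max(max, raw[i]);
      }
    }
    for (int i = 0; i < testItems.length; i++) {
      float norm = (Float.isNaN(raw[i]) || max == min) ? 0f : (raw[i] - min) / (max - min);
      System.out.println(userID + "\t" + testItems[i] + "\t" + norm);
    }
  }
}
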
On 05/05/14 01:36, Ted Dunning wrote:
>
> I would second all of what Pat said.
>
> I would add that off-line evaluation of recommenders is pretty tricky 
> because, in practice, recommenders generate their own training data. 
> This means that off-line evaluations, or even performance on the first 
> day, are not the entire story.
>
>
>
> On Sun, May 4, 2014 at 6:16 PM, Pat Ferrel <pat.ferrel@gmail.com> wrote:
>
>     First, are you doing an offline precision test? With training set
>     and probe or test set?
>
>     You can remove some data from the dataset, i.e. remove certain
>     preferences. Then train and obtain recommendations for the users
>     who had some data withheld. The held-out data has not been used to
>     train or to get recs, so you then compare what users actually
>     preferred to the predictions made by the recommender. If all of
>     them match you have 100% precision. Note that you are comparing
>     recommendations to actual but held-out preferences.
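>
>     A minimal sketch of that comparison (precisionAtN and heldOut are
>     names I'm making up; heldOut would hold the preferences removed
>     from one user before training):
>
>     import java.util.List;
>     import org.apache.mahout.cf.taste.impl.common.FastIDSet;
>     import org.apache.mahout.cf.taste.recommender.RecommendedItem;
>
>     // precision@N for one user: the fraction of the N recommendations
>     // that appear in that user's held-out preferences
>     static double precisionAtN(List<RecommendedItem> recs, FastIDSet heldOut) {
>       int hits = 0;
>       for (RecommendedItem rec : recs) {
>         if (heldOut.contains(rec.getItemID())) {
>           hits++;
>         }
>       }
>       return recs.isEmpty() ? 0.0 : (double) hits / recs.size();
>     }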
>
>     If you are using some special tools you may be doing this to
>     compare algorithms, which is not an exact thing at all, no matter
>     what the Netflix prize may have led us to believe. If you are
>     using offline tests to tune a specific recommender you may have
>     better luck with the results.
>
>     In one installation we had real data and split it into test and
>     training sets by date. The oldest 90% of the data was used to train;
>     the most recent 10% was used to test. This mimics the way data comes
>     in. We compared the recommendations from the training data against
>     the actual preferences in the held-out data and used
>     MAP@some-number-of-recs as the score. This allows you to measure
>     ranking quality, where RMSE does not. The MAP score led us to several
>     useful conclusions about tuning that were data dependent.
>
>     http://en.wikipedia.org/wiki/Information_retrieval#Mean_average_precision
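>
>     A sketch of AP@k for one user (averagePrecisionAtK is a made-up
>     name; MAP@k is then just the mean of this over all test users):
>
>     import java.util.List;
>     import org.apache.mahout.cf.taste.impl.common.FastIDSet;
>
>     // average precision at k: precision@i averaged over the positions i
>     // at which a held-out item appears in the ranked recommendation list
>     static double averagePrecisionAtK(List<Long> rankedItemIDs,
>                                       FastIDSet heldOut, int k) {
>       double sum = 0.0;
>       int hits = 0;
>       for (int i = 0; i < Math.min(k, rankedItemIDs.size()); i++) {
>         if (heldOut.contains(rankedItemIDs.get(i))) {
>           hits++;
>           sum += (double) hits / (i + 1);
>         }
>       }
>       int possible = Math.min(k, heldOut.size());
>       return possible == 0 ? 0.0 : sum / possible;
>     }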
>
>     On May 4, 2014, at 12:17 AM, Alessandro Suglia
>     <alessandro.suglia@yahoo.com> wrote:
>
>     Unfortunately that is not what I need: I'm using an external
>     tool to compute the metrics, so I simply need to produce a list
>     of recommendations, ranked by an estimated preference that I have
>     to compute for a specific user and for specific items (the items
>     in the test set).
>
>     How is it possible that Mahout doesn't offer this?
>     Am I doing something wrong?
>     On 05/04/14 01:20, Pat Ferrel wrote:
>     > Are you doing this as an offline performance test? There is a
>     test framework for the in-memory recommenders (non-Hadoop) that
>     will hold out random preferences and then use the held-out ones to
>     compute various quality metrics. Is this what you need?
>     >
>     > See this wiki page under Evaluation
>     https://mahout.apache.org/users/recommender/userbased-5-minutes.html
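>     >
>     > Something like this, perhaps (a sketch; the similarity, the
>     > neighborhood size, and at=10 are arbitrary choices of mine, and
>     > evaluate() does the per-user hold-out itself):
>     >
>     > // given a DataModel named model:
>     > RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();
>     > RecommenderBuilder builder = new RecommenderBuilder() {
>     >   public Recommender buildRecommender(DataModel model) throws TasteException {
>     >     UserSimilarity sim = new LogLikelihoodSimilarity(model);
>     >     UserNeighborhood hood = new NearestNUserNeighborhood(50, sim, model);
>     >     return new GenericBooleanPrefUserBasedRecommender(model, hood, sim);
>     >   }
>     > };
>     > IRStatistics stats = evaluator.evaluate(builder, null, model, null, 10,
>     >     GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
>     > System.out.println(stats.getPrecision() + " / " + stats.getRecall());
>     >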
>     >  On May 3, 2014, at 3:46 PM, Alessandro Suglia
>     <alessandro.suglia@yahoo.com> wrote:
>     >
>     > This is the procedure that I adopted at first (incorrectly).
>     > But what I need to do is estimate the preference for items
>     that aren't in the training set. In particular I'm working with
>     the MovieLens 100k splits, so for each split I should train my
>     recommender on the training set and test it (using some
>     classification metrics) on the test set.
>     > I'm not using the default Mahout evaluator, so I need to
>     predict the preferences and then put all the results in a
>     specific file.
>     > Can you give an example of how to do this properly?
>     >
>     > Thank you in advance.
>     > Alessandro Suglia
>     >
>     > On 3 May 2014 at 23:06, Pat Ferrel <pat.ferrel@gmail.com> wrote:
>     >> Actually the regular cooccurrence recommender should work too.
>     Your example on Stack Overflow is calling the wrong method to get
>     recs; call .recommend(userId) to get an ordered list of ids with
>     strengths.
>     >>
>     >> It looks to me like you are getting the stored preference data
>     for the user, which in this case is 1 or 0, not recommendations.
>     >>
>     >> On May 3, 2014, at 7:42 AM, Sebastian Schelter <ssc@apache.org> wrote:
>     >>
>     >> You should try the
>     >>
>     >>
>     org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender
>     >>
>     >> which has been built to handle such data.
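>     >>
>     >> For example (a sketch; TanimotoCoefficientSimilarity and the
>     >> neighborhood size of 25 are just one reasonable setup for
>     >> boolean data, not the only one):
>     >>
>     >> // given a boolean DataModel named model:
>     >> UserSimilarity sim = new TanimotoCoefficientSimilarity(model);
>     >> UserNeighborhood hood = new NearestNUserNeighborhood(25, sim, model);
>     >> Recommender rec = new GenericBooleanPrefUserBasedRecommender(model, hood, sim);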
>     >>
>     >> Best,
>     >> Sebastian
>     >>
>     >>
>     >> On 05/03/2014 04:34 PM, Alessandro Suglia wrote:
>     >>> I have described it in the SO's post:
>     >>> "When I execute this code, the result is a list of 0.0 or 1.0
>     which are
>     >>> not useful in the context of top-n recommendation in implicit
>     feedback
>     >>> context. Simply because I have to obtain, for each item, an
>     estimated
>     >>> rate which stays in the range [0, 1] in order to rank the list in
>     >>> decreasing order and construct the top-n recommendation
>     appropriately."
>     >>> On 05/03/14 16:25, Sebastian Schelter wrote:
>     >>>> Hi Alessandro,
>     >>>>
>     >>>> what result do you expect and what do you get? Can you give a
>     concrete
>     >>>> example?
>     >>>>
>     >>>> --sebastian
>     >>>>
>     >>>> On 05/03/2014 12:11 PM, Alessandro Suglia wrote:
>     >>>>> Good morning,
>     >>>>> I've tried to create a recommender system using Mahout in an
>     implicit
>     >>>>> feedback situation. What I'm trying to do is explained
>     exactly in this
>     >>>>> post on Stack Overflow:
>     >>>>>
>     http://stackoverflow.com/questions/23077735/mahout-recommendation-in-implicit-feedback-situation.
>     >>>>>
>     >>>>>
>     >>>>> As you can see, I'm having a problem with it, simply
>     because I cannot
>     >>>>> get the result that I expect (a value between 0 and 1) when
>     I try to
>     >>>>> predict a score for a specific item.
>     >>>>>
>     >>>>> Someone here can help me, please?
>     >>>>>
>     >>>>> Thank you in advance.
>     >>>>>
>     >>>>> Alessandro Suglia
>     >>>>>
>     >>
>
>
>



