mahout-user mailing list archives

From: Ted Dunning <ted.dunn...@gmail.com>
Subject: Re: Mahout recommendation in implicit feedback situation
Date: Sun, 04 May 2014 23:36:05 GMT
I would second all of what Pat said.

I would add that off-line evaluation of recommenders is pretty tricky
because, in practice, recommenders generate their own training data.  This
means that off-line evaluations, and even performance on the first day, are
not the entire story.



On Sun, May 4, 2014 at 6:16 PM, Pat Ferrel <pat.ferrel@gmail.com> wrote:

> First, are you doing an offline precision test? With training set and
> probe or test set?
>
> You can remove some data from the dataset, i.e. withhold certain preferences.
> Then train and obtain recommendations for the users who had some data
> withheld. The test data has not been used to train and get recs, so you then
> compare what users actually preferred to the predictions made by the
> recommender. If all of them match, you have 100% precision. Note that you
> are comparing recommendations to actual but held-out preferences.
>
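> Here is a minimal sketch of that comparison step, assuming Mahout's
> in-memory (Taste) API and that the withheld preferences are kept per user
> in a plain Map (the Map and the class name are illustrative, not Mahout API):
>
> import java.util.List;
> import java.util.Map;
> import java.util.Set;
>
> import org.apache.mahout.cf.taste.common.TasteException;
> import org.apache.mahout.cf.taste.recommender.RecommendedItem;
> import org.apache.mahout.cf.taste.recommender.Recommender;
>
> public class HoldoutPrecision {
>   // precision@n averaged over users who had preferences withheld;
>   // "heldOut" maps each user id to the item ids removed before training
>   static double precisionAtN(Recommender rec, Map<Long, Set<Long>> heldOut, int n)
>       throws TasteException {
>     double sum = 0.0;
>     int users = 0;
>     for (Map.Entry<Long, Set<Long>> e : heldOut.entrySet()) {
>       List<RecommendedItem> recs = rec.recommend(e.getKey(), n);
>       if (recs.isEmpty()) {
>         continue;
>       }
>       int hits = 0;
>       for (RecommendedItem item : recs) {
>         if (e.getValue().contains(item.getItemID())) {
>           hits++;
>         }
>       }
>       sum += (double) hits / n;
>       users++;
>     }
>     return users == 0 ? 0.0 : sum / users;
>   }
> }
>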
> If you are using some special tools you may be doing this to compare
> algorithms, which is not an exact thing at all, no matter what the Netflix
> prize may have led us to believe. If you are using offline tests to tune a
> specific recommender you may have better luck with the results.
>
> In one installation we had real data and split it into test and training
> sets by date. The older 90% of the data was used to train; the most recent 10%
> was used to test. This mimics the way data comes in. We compared the
> recommendations from the training data against the actual preferences in
> the held-out data and used MAP@some-number-of-recs as the score. This
> allows you to measure ranking, where RMSE does not. The MAP score led us to
> several useful conclusions about tuning that were data dependent.
>
> http://en.wikipedia.org/wiki/Information_retrieval#Mean_average_precision
>
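> The MAP@k computation itself is small. A self-contained sketch in plain
> Java (nothing Mahout-specific; "ranked" is one user's ordered
> recommendations, "relevant" is that user's held-out items):
>
> import java.util.List;
> import java.util.Set;
>
> public class MeanAveragePrecision {
>   // average precision at k for one user; MAP@k is the mean of this
>   // value over all test users
>   static double averagePrecisionAtK(List<Long> ranked, Set<Long> relevant, int k) {
>     double sum = 0.0;
>     int hits = 0;
>     for (int i = 0; i < Math.min(k, ranked.size()); i++) {
>       if (relevant.contains(ranked.get(i))) {
>         hits++;
>         sum += (double) hits / (i + 1); // precision at this cut-off
>       }
>     }
>     int denom = Math.min(k, relevant.size());
>     return denom == 0 ? 0.0 : sum / denom;
>   }
> }
>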
> On May 4, 2014, at 12:17 AM, Alessandro Suglia <alessandro.suglia@yahoo.com> wrote:
>
> Unfortunately that is not what I need, because I'm using a separate tool
> to compute the metrics, so I simply need to produce a list of
> recommendations according to some estimated preference that I have to
> compute for a specific user and for specific items (the items in the test
> set).
>
> How is it possible that Mahout doesn't offer this? Am I
> doing something wrong?
> On 05/04/14 01:20, Pat Ferrel wrote:
> > Are you doing this as an offline performance test? There is a test
> > framework for the in-memory recommenders (non-Hadoop) that will hold out
> > random preferences and then use the held-out ones to compute various
> > quality metrics. Is this what you need?
> >
> > See this wiki page under Evaluation:
> > https://mahout.apache.org/users/recommender/userbased-5-minutes.html
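> >
> > A minimal sketch of that framework, assuming the Mahout 0.9 in-memory
> > (Taste) API and a hypothetical input file data.csv (one userID,itemID
> > line per interaction):
> >
> > import java.io.File;
> >
> > import org.apache.mahout.cf.taste.common.TasteException;
> > import org.apache.mahout.cf.taste.eval.IRStatistics;
> > import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
> > import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
> > import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
> > import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
> > import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
> > import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;
> > import org.apache.mahout.cf.taste.model.DataModel;
> > import org.apache.mahout.cf.taste.recommender.Recommender;
> > import org.apache.mahout.cf.taste.similarity.UserSimilarity;
> >
> > public class IREval {
> >   public static void main(String[] args) throws Exception {
> >     DataModel model = new FileDataModel(new File("data.csv"));
> >     RecommenderBuilder builder = new RecommenderBuilder() {
> >       public Recommender buildRecommender(DataModel m) throws TasteException {
> >         UserSimilarity sim = new TanimotoCoefficientSimilarity(m);
> >         return new GenericBooleanPrefUserBasedRecommender(
> >             m, new NearestNUserNeighborhood(50, sim, m), sim);
> >       }
> >     };
> >     // hold out part of each user's preferences, recommend 10 items, and
> >     // report precision and recall at 10 across the whole data model
> >     IRStatistics stats = new GenericRecommenderIRStatsEvaluator().evaluate(
> >         builder, null, model, null, 10,
> >         GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
> >     System.out.println("precision: " + stats.getPrecision()
> >         + ", recall: " + stats.getRecall());
> >   }
> > }
> >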
> > On May 3, 2014, at 3:46 PM, Alessandro Suglia <alessandro.suglia@yahoo.com> wrote:
> >
> > This is the procedure that I adopted at first (incorrectly).
> > But what I need to do is estimate the preference for items that
> > aren't in the training set. In particular, I'm working with the MovieLens
> > 10k dataset, so for each split I should train my recommender on the
> > training set and test it (using some classification metrics) on the test set.
> > I'm not using the default Mahout evaluator, so I need to predict the
> > preferences and then put all the results in a specific file.
> > Can you give an example of how to do this properly?
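> >
> > In other words, what I'm after is roughly the following (a sketch only;
> > the class and helper names are illustrative, not Mahout API, and the test
> > pairs are assumed to be loaded elsewhere):
> >
> > import java.io.PrintWriter;
> >
> > import org.apache.mahout.cf.taste.common.TasteException;
> > import org.apache.mahout.cf.taste.recommender.Recommender;
> >
> > public class ScoreTestPairs {
> >   // writes one "userID,itemID,score" line per (user, item) pair in the
> >   // test split; each pair is a {userID, itemID} array loaded elsewhere
> >   static void score(Recommender rec, Iterable<long[]> testPairs, PrintWriter out) {
> >     for (long[] p : testPairs) {
> >       try {
> >         float est = rec.estimatePreference(p[0], p[1]);
> >         out.printf("%d,%d,%f%n", p[0], p[1], est);
> >       } catch (TasteException e) {
> >         // users or items absent from the training split cannot be scored
> >       }
> >     }
> >   }
> > }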
> >
> > Thank you in advance.
> > Alessandro Suglia
> >
> > On 3 May 2014, at 23:06, Pat Ferrel <pat.ferrel@gmail.com> wrote:
> >> Actually the regular cooccurrence recommender should work too. Your
> >> example on Stack Overflow is calling the wrong method to get recs; call
> >> .recommend(userId) to get an ordered list of item ids with strengths.
> >>
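> >> For example, a minimal sketch assuming an already-built Recommender
> >> (the class and method names here are illustrative):
> >>
> >> import java.util.List;
> >>
> >> import org.apache.mahout.cf.taste.common.TasteException;
> >> import org.apache.mahout.cf.taste.recommender.RecommendedItem;
> >> import org.apache.mahout.cf.taste.recommender.Recommender;
> >>
> >> public class TopN {
> >>   // prints the top-10 item ids with their strengths, already ranked
> >>   static void printTopN(Recommender recommender, long userId) throws TasteException {
> >>     List<RecommendedItem> recs = recommender.recommend(userId, 10);
> >>     for (RecommendedItem item : recs) {
> >>       System.out.println(item.getItemID() + "\t" + item.getValue());
> >>     }
> >>   }
> >> }
> >>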
> >> It looks to me like you are getting preference data from the user,
> which in this case is 1 or 0—not recommendations.
> >>
> >> On May 3, 2014, at 7:42 AM, Sebastian Schelter <ssc@apache.org> wrote:
> >>
> >> You should try the
> >>
> >> org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender
> >>
> >> which has been built to handle such data.
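> >>
> >> A minimal sketch of how to wire it up, assuming one userID,itemID line
> >> per observed interaction in a hypothetical file interactions.csv:
> >>
> >> import java.io.File;
> >>
> >> import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
> >> import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
> >> import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
> >> import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
> >> import org.apache.mahout.cf.taste.model.DataModel;
> >> import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
> >> import org.apache.mahout.cf.taste.recommender.Recommender;
> >> import org.apache.mahout.cf.taste.similarity.UserSimilarity;
> >>
> >> public class BooleanRec {
> >>   public static void main(String[] args) throws Exception {
> >>     DataModel model = new FileDataModel(new File("interactions.csv"));
> >>     UserSimilarity sim = new LogLikelihoodSimilarity(model);
> >>     UserNeighborhood hood = new NearestNUserNeighborhood(50, sim, model);
> >>     Recommender rec = new GenericBooleanPrefUserBasedRecommender(model, hood, sim);
> >>     System.out.println(rec.recommend(1L, 10)); // top 10 for user 1
> >>   }
> >> }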
> >>
> >> Best,
> >> Sebastian
> >>
> >>
> >> On 05/03/2014 04:34 PM, Alessandro Suglia wrote:
> >>> I described it in the Stack Overflow post:
> >>> "When I execute this code, the result is a list of 0.0 or 1.0 which are
> >>> not useful in the context of top-n recommendation in implicit feedback
> >>> context. Simply because I have to obtain, for each item, an estimated
> >>> rate which stays in the range [0, 1] in order to rank the list in
> >>> decreasing order and construct the top-n recommendation appropriately."
> >>> On 05/03/14 16:25, Sebastian Schelter wrote:
> >>>> Hi Alessandro,
> >>>>
> >>>> what result do you expect and what do you get? Can you give a concrete
> >>>> example?
> >>>>
> >>>> --sebastian
> >>>>
> >>>> On 05/03/2014 12:11 PM, Alessandro Suglia wrote:
> >>>>> Good morning,
> >>>>> I've tried to create a recommender system using Mahout in an implicit
> >>>>> feedback situation. What I'm trying to do is explained exactly in this
> >>>>> post on Stack Overflow:
> >>>>>
> >>>>> http://stackoverflow.com/questions/23077735/mahout-recommendation-in-implicit-feedback-situation
> >>>>>
> >>>>> As you can see, I'm having some problems with it, simply because I
> >>>>> cannot get the result that I expect (a value between 0 and 1) when I
> >>>>> try to predict a score for a specific item.
> >>>>>
> >>>>> Can someone here help me, please?
> >>>>>
> >>>>> Thank you in advance.
> >>>>>
> >>>>> Alessandro Suglia
> >>>>>
> >>
>
>
>
