mahout-user mailing list archives

From: Sean Owen
Subject: Re: Recommender system implementations
Date: Fri, 22 Oct 2010 07:28:06 GMT
Yeah, I still think held-out data is the best thing, if you want to use this
built-in evaluation mechanism. Hold out the same data from both models and
run the same test.
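
One way to make sure both models hold out the same data is to pick the
held-out (user, item) pairs with a deterministic rule instead of a random
split. A minimal sketch, assuming the derived model keeps the original user
and item IDs; the class HeldOutSplit and the method isHeldOut are made-up
names, not part of Mahout:

```java
final class HeldOutSplit {

    private HeldOutSplit() {}

    /**
     * Decides, from the IDs alone, whether a (user, item) pair belongs to the
     * held-out test set. The same pair always gets the same answer, so the
     * original and derived models can be trained without exactly the same
     * preferences. percent is the rough share of pairs to hold out (0..100).
     */
    static boolean isHeldOut(long userID, long itemID, int percent) {
        // Mix the two IDs into one 64-bit value. The constants are arbitrary
        // odd multipliers used only to scramble the bits (an assumption of
        // this sketch, not something Mahout prescribes).
        long h = userID * 0x9E3779B97F4A7C15L + itemID;
        h ^= (h >>> 32);
        h *= 0xBF58476D1CE4E5B9L;
        h ^= (h >>> 29);
        // Map the mixed value to a bucket in 0..99 and compare to the cutoff.
        return Long.remainderUnsigned(h, 100) < percent;
    }
}
```

Preferences for which isHeldOut(...) is true would be left out of both
training models and used only for scoring.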

There is another approach which doesn't necessarily require held-out data.
On the original, full model, just compute recommendations for any users you
like. Assume these are "correct". Then do the same for the derived model.

It will return to you estimated preferences in both cases. You could use the
deltas as a measure of "error" (unless your derived model has quite a
different rating space).

Or simply use the difference in rankings -- compute some metric that
penalizes having recommendations in different places in the ordering.
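
One simple metric of this kind is the average distance between an item's
position in the two lists. The sketch below is only one possibility and makes
its own assumptions: the lists hold item IDs in rank order, an item missing
from the derived list counts as sitting one place past its end, and
RankDisplacement is a made-up class name:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RankDisplacement {

    private RankDisplacement() {}

    /**
     * Average absolute difference between each item's position in the
     * original list and its position in the derived list. 0.0 means the
     * derived list starts with the same items in the same order; larger
     * values mean more reshuffling. Returns NaN for an empty original list.
     */
    static double meanDisplacement(List<Long> original, List<Long> derived) {
        if (original.isEmpty()) {
            return Double.NaN;
        }
        // Position of each item in the derived list (first occurrence wins).
        Map<Long, Integer> derivedPos = new HashMap<>();
        for (int j = 0; j < derived.size(); j++) {
            derivedPos.putIfAbsent(derived.get(j), j);
        }
        long total = 0;
        for (int i = 0; i < original.size(); i++) {
            // Missing items are treated as ranked just past the end.
            int j = derivedPos.getOrDefault(original.get(i), derived.size());
            total += Math.abs(i - j);
        }
        return (double) total / original.size();
    }
}
```

For example, comparing [10, 20, 30] with [20, 10, 40] moves every item by one
place (30 is missing, so it counts as position 3), giving a score of 1.0.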

I'll say I don't know which of these is most mathematically sound.
Interpreting the results may be hard. But, any of these should give a notion
of "better" and "worse".

Assuming the original model's recommendations are "correct" is a reasonably
big assumption. For example, the whole point of an SVD recommender is to modify the
model (reduce its dimension really) in order to be able to recommend items
that should be recommended, but weren't before due to model sparseness.
There, transforming the data in theory gives better results. That it's
different doesn't mean worse necessarily.

But maybe that's not an issue for your use case; I don't know.

On Fri, Oct 22, 2010 at 5:39 AM, Lance Norskog wrote:

> Here is my use case: I have two data models.
> 1) the original data, for example GroupLens
> 2) the derivative. This is a second data model which is derived from
> the original. It is made with a one-way function from the master.
> I wish to measure how much information is lost in the derivation
> function. There is some entropy, so therefore the derived data model
> cannot supply recommendations as good as the original data. But how
> much worse?
> My naive method is to make recommendations using the master model, and
> the derived model, and compare them. If the recommendations from the
> derived model are, say, 90% as good as from the original data, then
> the derivation function is ok.
> Now, obviously, the gold standard for recommendations is the data in
> the original model. So, I make recommendations from the original, and
> the derived, from the user/item prefs given in the original data. I
> don't really care about what the user gave as preferences: I want to
> know what the recommender algorithm itself thinks. But the
> recommenders just parrot back the data model instead of giving me
> their own opinion. Thus, the point of this whole thread. But how
> recommender algorithms work is a side issue. I'm trying to use them as
> an indirect measurement of something else.
> What is another way to test what I'm trying to test? What is another
> way to evaluate the quality of my derivation function?
> On Wed, Oct 20, 2010 at 11:41 PM, Sebastian Schelter wrote:
> > Hi Lance,
> >
> > When evaluating a recommender you should split your dataset in a training
> > and test part. Only data from the training part should be included in your
> > DataModel and you only measure the accuracy of predicting ratings that are
> > included in the test part (which is not known by your recommender). If you
> > structure things this way, the current implementation should work fine for
> > you.
> >
> > --sebastian
> >
> > On 21.10.2010 04:56, Lance Norskog wrote:
> >>
> >> Since this is Recommender day, here is another kvetch:
> >>
> >> The recommender implementations with algorithms all do this in
> >> Recommender.estimatePreference():
> >>   public float estimatePreference(long userID, long itemID)
> >>       throws TasteException {
> >>     DataModel model = getDataModel();
> >>     Float actualPref = model.getPreferenceValue(userID, itemID);
> >>     if (actualPref != null) {
> >>       return actualPref;
> >>     }
> >>     return doEstimatePreference(userID, itemID);
> >>   }
> >>
> >> Meaning: "if I told you something, just parrot it back to me."
> >> Otherwise, make a guess.
> >>
> >> I am doing head-to-head comparisons of the dataModel preferences vs.
> >> the Recommender. This code makes it impossible to directly compare
> >> what the recommender thinks vs. the actual preference. If I wanted to
> >> know what I told it, I already know that. I want to know what the
> >> recommender thinks.
> >>
> >> If this design decision is something y'all have argued about and
> >> settled on, never mind. If it is just something that seemed like a
> >> good idea at the time, can we change the recommenders, and the
> >> Recommender "contract", to always use their own algorithm?
> >>
> >>
> >
> >
> --
> Lance Norskog
