mahout-user mailing list archives

From Cassio Melo <melo.cas...@gmail.com>
Subject Re: Decaying score for old preferences when using the .refresh()
Date Wed, 20 Nov 2013 12:28:22 GMT
Hi guys, thanks for sharing your experiences on this subject, really
appreciated. To summarize the discussion:

- Decaying old preference values might lose important historical data
in cases where the user has no recent activity (Gokhan)
- When decaying (or truncating) preferences, the precision of rating
prediction may be lower (Pat, Gokhan, Ted), but conversion rates might
increase (Gokhan, Pat) since recommendations reflect recent user intent.
- Tweaking the score estimation may be a better approach (Gokhan)
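
To make the first point concrete, here is a minimal sketch of an exponential
decay applied to preference values. The half-life form, the 30-day value, and
the sample data are assumptions for illustration only; nobody in the thread
specified a decay function, and this is not Mahout code.

```python
import math

def decay_preference(value, age_days, half_life_days=30.0):
    """Halve a preference value for every half_life_days of age (assumed form)."""
    return value * math.pow(0.5, age_days / half_life_days)

# (user, item, value, age in days): made-up data for illustration
prefs = [
    ("u1", "iphone", 1.0, 0.0),
    ("u1", "case", 1.0, 30.0),
    ("u2", "iphone", 1.0, 365.0),  # u2 has no recent activity
]

decayed = [(u, i, decay_preference(v, age)) for u, i, v, age in prefs]
for row in decayed:
    print(row)
```

With these numbers, u2's only preference shrinks to well under 0.001, which is
the loss of historical signal described above for users without recent
activity.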

I'm doing some experiments with e-commerce data and will post the results
later.

Best regards,
Cassio


On Fri, Nov 8, 2013 at 5:08 PM, Pat Ferrel <pat.ferrel@gmail.com> wrote:

> > I think the intuition here is, when making an item neighborhood based
> > recommendation, to penalize the contribution of the items that the user has
> > rated a long time ago. I didn't test this in a production recommender
> > system, but I believe this might result in recommendation lists with better
> > conversion rates in certain use cases.
>
> It’s only one data point but it was a real ecom recommender with real user
> data. We did not come to the conclusion above, though there is some truth
> in it.
>
> There are two phenomena at play, similarity of users and items, and recent
> user intent. Similarity of users decays very slowly if at all. The fact
> that you and I bought an iPhone 1 makes us similar even though the iPhone 1
> is no longer for sale. However you don’t really want to rely on user
> activity that old to judge recent shopping intent. Mahout conflates these
> unfortunately.
>
> Back to the canonical R = [B’B]H; [B’B] is actually calculated using some
> similarity metric like log-likelihood and RowSimilarityJob.
> B = preference matrix; user = row, item = column, value = strength perhaps
> 1 for a purchase.
> H = user history of preferences in columns, rows = items
>
> If you did nothing to decay preferences B’=H
>
> If you truncate to use only recent preferences in H then B’ != H
>
> Out of the box Mahout requires B’=H, and we got significantly lower
> precision scores by decaying BOTH B and H. Our conclusion was that this was
> not really a good idea given our data.
>
> If you truncate user preferences to some number of the most recent in H
> you probably get a lower precision score (as Ted mentions) but our
> intuition was that the recommendations reflect the most recent user intent.
> Unfortunately we haven’t A/B tested this conclusion but the candidate for
> best recommender was using most recent prefs in H and all prefs in B.
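
The setup Pat describes (all prefs in B, only the most recent prefs in H) can
be sketched in plain Python as below. This is an illustration under stated
assumptions, not Mahout's implementation: raw co-occurrence counts stand in
for the log-likelihood similarity, and the timestamp matrix T, the helper
names, and the sample data are hypothetical.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def recent_history(B, T, k):
    """Build H (rows = items, columns = users) from each user's k newest prefs."""
    n_users, n_items = len(B), len(B[0])
    H = [[0.0] * n_users for _ in range(n_items)]
    for u in range(n_users):
        items = [i for i in range(n_items) if B[u][i] != 0]
        newest = sorted(items, key=lambda i: T[u][i], reverse=True)[:k]
        for i in newest:
            H[i][u] = B[u][i]
    return H

# B: rows = users, columns = items, 1 = purchase (full history).
B = [[1.0, 1.0, 0.0, 0.0],
     [1.0, 1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0, 1.0]]
# T: hypothetical timestamps (larger = more recent, 0 = no preference).
T = [[1, 5, 0, 0],
     [2, 3, 9, 0],
     [0, 0, 4, 8]]

S = matmul(transpose(B), B)      # item-item co-occurrence, standing in for [B'B]
for i in range(len(S)):
    S[i][i] = 0.0                # ignore self-similarity

H_full = transpose(B)            # no truncation: H is exactly B'
H_recent = recent_history(B, T, k=1)

R_full = matmul(S, H_full)       # column u holds item scores for user u
R_recent = matmul(S, H_recent)   # similarity from all prefs, intent from recent ones
```

Comparing column 1 of R_full and R_recent shows the effect for the second
user: with the full history, items 0 and 1 score highest, while with only the
most recent preference the scores follow that single item's neighbors.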
>
> > On Nov 7, 2013, at 11:36 PM, Gokhan Capan <gkhncpn@gmail.com> wrote:
>
> On Fri, Nov 8, 2013 at 6:24 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
>
> > On Thu, Nov 7, 2013 at 12:50 AM, Gokhan Capan <gkhncpn@gmail.com> wrote:
> >
> >> This particular approach is discussed, and proven to increase the accuracy
> >> in "Collaborative filtering with Temporal Dynamics" by Yehuda Koren. The
> >> decay function is parameterized per user, keeping track of how consistent
> >> the user behavior is.
> >>
> >
> > Note that user-level temporal dynamics does not actually improve the
> > accuracy of ranking. It improves the accuracy of ratings.
>
>
> Yes, the accuracy of rating prediction.
>
> > Since recommendation quality is primarily a precision@20 sort of activity,
> > improving ratings does no good at all.
>
>
> > Item-level temporal dynamics is a different beast.
> >
>
> I think the intuition here is, when making an item neighborhood based
> recommendation, to penalize the contribution of the items that the user has
> rated a long time ago. I didn't test this in a production recommender
> system, but I believe this might result in recommendation lists with better
> conversion rates in certain use cases.
>
> Best
>
>
