mahout-user mailing list archives

From Sean Owen <>
Subject Re: User and Item based recommender questions - real time updates & weighting similar items
Date Fri, 28 Sep 2012 10:29:11 GMT
The general model is that new input is read (from a file, DB, etc.)
only periodically. And that is accompanied by clearing out caches and
recomputing a lot of intermediate results. Updating in real-time is
possible, but comes at a big performance cost. So you have this
quasi-batch update process.
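
A minimal sketch of the quasi-batch pattern described above, assuming the expensive reload is wrapped in a Runnable. The names PeriodicRefresher and refreshIfDue are hypothetical and not part of Mahout. With a Mahout recommender, the action would presumably be a call such as recommender.refresh(null), which is what clears caches and rereads the input.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: lets an expensive reload run at most once per interval. */
final class PeriodicRefresher {
    // Assumption: with Mahout this could be () -> recommender.refresh(null)
    private final Runnable refreshAction;
    private final long minIntervalMillis;
    private final AtomicLong lastRefreshMillis;

    PeriodicRefresher(Runnable refreshAction, long minIntervalMillis, long startMillis) {
        this.refreshAction = refreshAction;
        this.minIntervalMillis = minIntervalMillis;
        this.lastRefreshMillis = new AtomicLong(startMillis);
    }

    /** Runs the refresh only if the interval has elapsed; returns whether it ran. */
    boolean refreshIfDue(long nowMillis) {
        long last = lastRefreshMillis.get();
        if (nowMillis - last < minIntervalMillis) {
            return false; // too soon: keep serving from the current model and caches
        }
        if (!lastRefreshMillis.compareAndSet(last, nowMillis)) {
            return false; // another thread claimed this refresh
        }
        refreshAction.run();
        return true;
    }

    public static void main(String[] args) {
        PeriodicRefresher refresher = new PeriodicRefresher(
                () -> System.out.println("reloading input and clearing caches"),
                60_000L, System.currentTimeMillis() - 60_000L);
        // Called on each request; only the first call in any minute triggers a reload.
        refresher.refreshIfDue(System.currentTimeMillis());
        refresher.refreshIfDue(System.currentTimeMillis());
    }
}
```

Requests that arrive between refreshes are answered from the existing model, which is the trade-off the quasi-batch approach accepts: new preferences are not reflected until the next reload.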

This model is true of all the recommenders. I am not sure I'd use
TreeClusteringRecommender; it is a naive implementation that is going
to be way slow in the model building phase -- exactly what you are
hoping to do frequently.

(I think this is an OK point for an ad break, since it's exactly on
topic: the problem of real-time updates is quite common and there's
not a great solution in the framework. Cold start and data fold-in are
two problems I wanted to explicitly solve in Myrrix, a recommender
product / project with the same Mahout APIs and a different
architecture and algorithm that make this easy. The piece you need is
free and open source, and I would suggest you take a look if this is
your primary requirement.)

You can't explicitly weight prefs in mostSimilarItems, but you can
pass several values, and you can pass some values multiple times, as a
crude way to effect a weighting.
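
A small sketch of that repetition trick, assuming integer weights. The class WeightedItemIds and the method repeatByWeight are hypothetical helpers, not Mahout code. They only build the long[] that would then be passed to mostSimilarItems(long[] itemIDs, int howMany) on an ItemBasedRecommender.

```java
import java.util.Arrays;

/** Hypothetical helper: expresses integer weights by repeating item IDs. */
final class WeightedItemIds {

    /** Returns an array in which itemIDs[i] appears weights[i] times. */
    static long[] repeatByWeight(long[] itemIDs, int[] weights) {
        if (itemIDs.length != weights.length) {
            throw new IllegalArgumentException("itemIDs and weights must have the same length");
        }
        int total = 0;
        for (int w : weights) {
            if (w < 0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            total += w;
        }
        long[] out = new long[total];
        int pos = 0;
        for (int i = 0; i < itemIDs.length; i++) {
            // Fill the next weights[i] slots with the same item ID.
            Arrays.fill(out, pos, pos + weights[i], itemIDs[i]);
            pos += weights[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // Two clicked items (weight 2 each) and three viewed items (weight 1 each).
        long[] ids = repeatByWeight(
                new long[] {101L, 102L, 201L, 202L, 203L},
                new int[] {2, 2, 1, 1, 1});
        System.out.println(Arrays.toString(ids));
        // prints [101, 101, 102, 102, 201, 202, 203]

        // With a Mahout ItemBasedRecommender in hand, the array would be used as:
        // List<RecommendedItem> similar = recommender.mostSimilarItems(ids, 10);
    }
}
```

The weighting is crude because only whole-number ratios can be expressed, and the array grows with the weights.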

On Fri, Sep 28, 2012 at 11:20 AM, David Parks <> wrote:
> I have two questions concerning User Based Recommenders and Item Based
> Recommenders:
> In a User Based Recommender (in production after the model is computed), I
> will receive a query for a User Based Recommendation that is based on newly
> generated User Preference Data.
> From my study so far it seems like I should try TreeClusteringRecommender to
> start with, but how do I use the user's most recent preference data to
> generate a result in real time (within a single web transaction)? I need to
> update the model in each query right? E.g.
> myTreeClusteringRecommender.refresh()?
> In an Item Based Recommender I can call recommender.mostSimilar(itemIDs)
> with a set of items that the user has expressed preference for (most recent
> preference data).
> Is there a way I can weight these preferences? For example a user might have
> already clicked on 2 items, and just looked at 3 others. If this is my
> itemIDs set, the first two should affect the recommendation more than the
> other 3.
