mahout-user mailing list archives

From Pat Ferrel <>
Subject Re: TreeBasedRecommenders(Deprecated?)
Date Tue, 10 Jun 2014 23:07:36 GMT
There are simple ways to do this without maintaining a separate recommender.

First, you can simply cluster the input matrix of users by items, then recommend the items
closest to the centroid of the cluster that the user’s couple of items fall in. But this seems
dubious for several reasons.

Better yet (maybe controversial, since I don’t know the mathematical justification for this),
you could cluster the indicator matrix of items by similar items. This is at least clustering
“important” similar items.

But there is something even easier than clustering: if you know a couple of items the user has
preferred, just get the items most similar to those directly from the indicator matrix. The
indicator matrix has one row per item, and each row holds that item's similar items with their
strength of similarity. Add up the rows for all the items the user has interacted with (using
the strength values), sort, and recommend the top n. The in-memory item-based recommender will
give you the similar items for each item the user preferred; all you need to do is add and sort.
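
The add-and-sort step described above can be sketched roughly as follows. This is a minimal illustration, not Mahout code: the function name `recommend_from_indicators`, the dict-of-dicts layout for the indicator matrix, and the sample items and strengths are all assumptions made up for the example. Skipping items the user already preferred is also an assumption; the message does not say whether to filter them.

```python
from collections import defaultdict


def recommend_from_indicators(indicator_rows, preferred_items, top_n=3):
    """Sum the indicator rows of the user's preferred items and return the top n.

    indicator_rows: hypothetical layout, {item: {similar_item: strength}}.
    preferred_items: set of items the user has interacted with.
    """
    scores = defaultdict(float)
    for item in preferred_items:
        # An item with no row (no recorded similarities) contributes nothing.
        for similar_item, strength in indicator_rows.get(item, {}).items():
            # Assumption: do not recommend what the user already has.
            if similar_item not in preferred_items:
                scores[similar_item] += strength
    # Sort by summed strength, highest first; break ties by item name.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up indicator matrix: one row per item, similar items with strengths.
indicator_rows = {
    "tent": {"sleeping_bag": 0.9, "stove": 0.6, "lantern": 0.4},
    "stove": {"tent": 0.6, "fuel": 0.8, "lantern": 0.7},
}
recommendations = recommend_from_indicators(indicator_rows, {"tent", "stove"})
print(recommendations)
```

Here "lantern" appears in both rows, so its summed strength puts it ahead of items that are strongly similar to only one of the preferred items.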

The true cold start problem is when you have items and/or users with no interactions at all.
This calls for a metadata recommender and some context. If a user is on a page of a product with
no interactions, the metadata must tell which items are similar. In the case where you have
a user with no interactions and no context, you have to rely on things like the time-worn
popular and trending items.

You are certainly welcome here but questions like this usually go to the

On Jun 10, 2014, at 4:50 AM, Sahil Sharma <> wrote:


One place where tree-based recommenders (that is, ones using hierarchical
clustering) might be useful is the cold start problem. Suppose a
user has bought only a few items (say 2 or 3). It's kind of hard to
capture that user's interests using user-based collaborative filtering.
Also, using an item-based collaborative filtering recommender turns out to
be time-consuming.
In such a setting it makes sense to cluster the items together (using some
clustering algorithm) and then use the user's purchased items to
recommend (based on which cluster those purchased items belong to).
On Jun 10, 2014 4:41 PM, "Sebastian Schelter" <> wrote:

> Hi Sahil,
> don't worry, you're not breaking any rules. We removed the tree-based
> recommenders because we have never heard of anyone using them over the
> years.
> --sebastian
> On 06/10/2014 09:01 AM, Sahil Sharma wrote:
>> Hi,
>> Firstly I apologize if I'm breaking certain rules by mailing this way, I'm
>> new to this and would appreciate any help I could get.
>> I was just playing around with the tree-based Recommender (which seems to
>> be deprecated in the current version "for the lack of use").
>> Why was it deprecated?
>> Also, I just looked at the code, and it seems to be doing a lot of
>> redundant computations. For example, we could store a matrix of
>> cluster-cluster distances (and hence avoid recomputing the closest
>> clusters every time, by updating the matrix whenever we merge two clusters).
>> Also, when trying to determine the farthest-distance-based similarity
>> between two clusters, the pair which realizes this could be stored
>> and updated upon merging, so that this computation need not be repeated
>> again and again.
>> Just wondering if this repeated computation was not a reason for
>> deprecating the class (since people might have found a slow recommender
>> "lacking use").
>> Would be glad to hear the thoughts of others on this, and also implement
>> an efficient version if the community agrees.
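
The caching idea in the original question can be illustrated with a small sketch. It does not reproduce the removed Mahout class; the function `complete_linkage`, its parameters, and the one-dimensional sample points are assumptions for illustration only. It uses the standard complete-linkage property that the farthest distance from a merged cluster to any other cluster is the larger of the two pre-merge distances, so cached values are updated on each merge instead of being recomputed from the raw points.

```python
def complete_linkage(points, distance, num_clusters):
    """Agglomerative clustering with a cached cluster-to-cluster distance table.

    Assumes num_clusters >= 1. Each point starts as its own cluster.
    """
    clusters = {i: [p] for i, p in enumerate(points)}
    # Compute every pairwise distance once, keyed by (smaller id, larger id).
    cache = {
        (i, j): distance(points[i], points[j])
        for i in clusters
        for j in clusters
        if i < j
    }
    while len(clusters) > num_clusters:
        # The closest pair is read from the cache; no point distances are recomputed.
        a, b = min(cache, key=cache.get)
        # Merge cluster b into cluster a.
        clusters[a].extend(clusters.pop(b))
        for k in clusters:
            if k == a:
                continue
            key_ak = (min(a, k), max(a, k))
            key_bk = (min(b, k), max(b, k))
            # Complete linkage: the farthest distance to the merged cluster is
            # the maximum of the two cached farthest distances.
            cache[key_ak] = max(cache[key_ak], cache.pop(key_bk))
        del cache[(a, b)]
    return list(clusters.values())


# Made-up one-dimensional data with absolute difference as the distance.
result = complete_linkage(
    [0.0, 1.0, 10.0, 11.0, 25.0], lambda x, y: abs(x - y), num_clusters=3
)
print(result)
```

Finding the closest pair here is still a linear scan over the cached table on each merge; a priority queue could reduce that further, but the sketch only aims to show that distances themselves need not be recomputed.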
