mahout-user mailing list archives

From Gökhan Çapan <>
Subject Re: Extracting association rules
Date Thu, 15 Apr 2010 12:48:57 GMT
Hi Sebastian,
We have a system similar to what you are asking about.

Using GenericItemBasedRecommender's mostSimilarItems method with the Jaccard
index as the similarity measure, you can compute support-like scores for item
pairs and list the top n items that co-occur with a given item.
The set of all such item pairs is then the set of frequent pairs in terms of
support.
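
To make the Jaccard idea concrete, here is a minimal plain-Java sketch. It
does not use Mahout; the class JaccardSketch, its method names, and the
item-to-users map are hypothetical and only illustrate how a Jaccard score
ranks the items that co-occur with a given item.

```java
import java.util.*;

final class JaccardSketch {
    /** Jaccard index |A ∩ B| / |A ∪ B| of the user sets of two items. */
    static double jaccard(Set<Long> usersOfA, Set<Long> usersOfB) {
        if (usersOfA.isEmpty() && usersOfB.isEmpty()) {
            return 0.0; // no evidence at all, treat as dissimilar
        }
        Set<Long> intersection = new HashSet<>(usersOfA);
        intersection.retainAll(usersOfB);
        int union = usersOfA.size() + usersOfB.size() - intersection.size();
        return (double) intersection.size() / union;
    }

    /** Returns up to n item IDs with a non-zero score against target, best first. */
    static List<Long> mostSimilar(long target, Map<Long, Set<Long>> usersByItem, int n) {
        Set<Long> targetUsers = usersByItem.getOrDefault(target, Collections.emptySet());
        Map<Long, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> e : usersByItem.entrySet()) {
            if (e.getKey() == target) {
                continue; // skip the item itself
            }
            double score = jaccard(targetUsers, e.getValue());
            if (score > 0.0) {
                scores.put(e.getKey(), score);
            }
        }
        List<Long> ids = new ArrayList<>(scores.keySet());
        // Highest score first; ties broken by item ID for a stable order.
        ids.sort(Comparator.comparingDouble((Long id) -> scores.get(id)).reversed()
                .thenComparing(Comparator.<Long>naturalOrder()));
        return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
    }
}
```

Calling JaccardSketch.mostSimilar(item, usersByItem, n) plays the role that
mostSimilarItems plays above, under the assumption that an item is described
only by the set of users who touched it.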

Or you can use log-likelihood similarity, which I am using for a query
recommendation engine; it gives pretty good results when discovering similar
queries. If the query comes from an anonymous user, we recommend similar
queries with mostSimilarItems(query); if we know the user, we compute
recommendations based on queries that are similar to those in the user's
history.
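
For reference, the log-likelihood ratio behind this kind of similarity can be
sketched in plain Java as below. This is an illustration of the standard
2x2 contingency-table formula, not Mahout's code; the class LlrSketch and the
parameter names k11..k22 are assumptions: k11 counts users with both items,
k12 users with only the first, k21 users with only the second, and k22 users
with neither.

```java
final class LlrSketch {
    /** x * ln(x), with the usual convention that 0 * ln(0) = 0. */
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    /** Unnormalized entropy: N ln N minus the sum of k ln k over the counts. */
    static double entropy(long... counts) {
        long total = 0;
        double sumXLogX = 0.0;
        for (long count : counts) {
            total += count;
            sumXLogX += xLogX(count);
        }
        return xLogX(total) - sumXLogX;
    }

    /** Log-likelihood ratio of a 2x2 co-occurrence table; 0 means independent. */
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Guard against tiny negative values caused by floating-point rounding.
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }
}
```

A higher value means the two items (or queries) co-occur more often than
independence would predict, which is why it tends to rank pairs more robustly
than raw co-occurrence counts.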

If you only want to discover similar item pairs (just pairs), mostSimilarItems
may be a good choice. (I am using it.)

Please correct me if I am wrong, but I guess FP-Growth returns the set of all
frequent itemsets at once.
*If it does:*
if you want to compute the items similar to an item at runtime (and don't want
to store all similar items), I suggest using the mostSimilarItems function.
But if you want to compute frequent itemsets and store them for later use,
FP-Growth is implemented in parallel and seems more suitable for your

On Thu, Apr 15, 2010 at 3:23 PM, Sean Owen <> wrote:

> The framework is pretty general, so yeah you can get it to do most
> anything, though some things might need more custom code than others.
> Viewed generally, a recommender takes as input associations from As to
> Bs, and then given an A, predicts new associations to Bs. Usually we
> think of As as users and Bs as items. But you could let As be browsed
> items, and Bs be items that were ultimately purchased by users who
> browsed A.
> Then this is a recommender problem, not merely a simpler
> most-similar-items problem. Given an item being browsed, you can
> recommend items that are most likely to be purchased.
> The work you'd have to do is simply assembling these associations in
> the first place. You'd dig through your purchase and browsing data,
> and output all item-item pairs where item 1 is a browsed item and item
> 2 is an item that was ultimately purchased by one or more users who
> browsed the first item. The value might be the number of users who fit
> this description.
> Once you have that input you can throw any of the recommenders at it
> to produce the output. You'd have more choice, including distributed
> recommenders, and have access to evaluators as well. No custom code
> ought to be needed unless you want to.
> On Thu, Apr 15, 2010 at 1:10 PM, Sebastian Feher <>
> wrote:
> > There are a few questions that I'm not able to answer:
> > - do you support cross-type frequent item sets? for example - people who
> > Browsed this item - ended up purchasing these items. In this case the item
> > pairs are generated by taking one item from the Browse space and the other
> > from the Purchase space. Is this something that can be achieved with the
> > current algorithms (GenericItemBasedRecommender.mostSimilarItems(),
> > FP-Growth) in their existing form, and if not, is there an extension
> > mechanism that allows me to do that in a clean fashion, or do I have to
> > modify the algorithm code?
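
The assembly step Sean describes in the quoted message (browsed item,
purchased item, number of users who did both) could be sketched roughly as
follows. This is plain Java with hypothetical names (BrowseToPurchaseSketch,
countPairs, the two per-user maps); it assumes the browsing and purchase
history already fit in memory and emits comma-separated triples that a
recommender could take as input.

```java
import java.util.*;

final class BrowseToPurchaseSketch {
    /**
     * For each (browsed, purchased) item pair, counts the users who browsed
     * the first item and purchased the second. Keys are "browsed,purchased".
     */
    static Map<String, Integer> countPairs(Map<Long, Set<Long>> browsedByUser,
                                           Map<Long, Set<Long>> purchasedByUser) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted for stable output
        for (Map.Entry<Long, Set<Long>> entry : browsedByUser.entrySet()) {
            Set<Long> purchased =
                purchasedByUser.getOrDefault(entry.getKey(), Collections.emptySet());
            for (long browsedItem : entry.getValue()) {
                for (long purchasedItem : purchased) {
                    // Each user contributes at most once per pair (inputs are sets).
                    counts.merge(browsedItem + "," + purchasedItem, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** Renders the counts as "browsedItem,purchasedItem,userCount" lines. */
    static List<String> toLines(Map<String, Integer> counts) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            lines.add(entry.getKey() + "," + entry.getValue());
        }
        return lines;
    }
}
```

In this framing the browsed item takes the place of the "user" and the
purchased item the place of the "item", with the user count as the
association value.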

Gökhan Çapan
