mahout-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Is Taste appropriate for...
Date Tue, 09 Sep 2008 21:57:19 GMT
Hi,

Taste is appropriate for scenarios where there are users and where these users have item preferences.
 But is it appropriate for scenarios where there are no users in the game?  For example, instead
of looking at item preferences of authenticated users, could you look at, say, Amazon shopping
carts to figure out which books people buy together to arrive at the "People who bought Lucene
in Action also bought Managing Gigabytes" recommendation?  In other words, could you simply
look at item-item correlation without paying any attention to users?
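To make the scenario concrete, here is a minimal sketch of the kind of user-free computation I have in mind: counting how often two items land in the same cart. Everything in it is an assumption for illustration. The class `CartCooccurrence`, its methods, and the sample cart contents are made up, and none of it is Taste API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of Taste: counts how often two items share a cart.
class CartCooccurrence {

    // Builds an order-independent key for an item pair, e.g. "A|B".
    // Assumes item names do not themselves contain '|'.
    static String pairKey(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    // Returns, for every unordered item pair, the number of carts containing both.
    static Map<String, Integer> count(List<List<String>> carts) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> cart : carts) {
            // De-duplicate and sort so each pair is counted once per cart.
            List<String> items = new ArrayList<>(new TreeSet<>(cart));
            for (int i = 0; i < items.size(); i++) {
                for (int j = i + 1; j < items.size(); j++) {
                    counts.merge(pairKey(items.get(i), items.get(j)), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up carts; no user identity is involved anywhere.
        List<List<String>> carts = Arrays.asList(
            Arrays.asList("Lucene in Action", "Managing Gigabytes"),
            Arrays.asList("Lucene in Action", "Managing Gigabytes", "Book C"),
            Arrays.asList("Lucene in Action", "Book C"));
        System.out.println(count(carts));
    }
}
```

Raw co-occurrence counts like these favor popular items, so a real application would probably normalize them somehow before treating them as a correlation.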

Based on http://lucene.apache.org/mahout/taste.html#Item-based+Recommender I'd think this
is possible, but I looked at some of the classes mentioned in the example there, and they
all have references to User objects.  Does that mean item-item recommendations like in my
example above are not possible with Taste?

I do see GenericItemSimilarity.ItemItemCorrelation, but even there I see references to the DataModel
class, which references the User class.  Perhaps for item-item recommendations one can simply
avoid the ctors that take a DataModel argument?  Is that the idea?

Also, is the idea that something (e.g. my app) calculates "correlatedness" (that float) of
2 Items and feeds that to GenericItemSimilarity.ItemItemCorrelation's ctor?  If so, what exactly
does GenericItemSimilarity.ItemItemCorrelation do?  Doesn't it then simply serve the purpose
of finding top N most correlated items for any given item?  Actually, I see only this:

  public double itemCorrelation(Item item1, Item item2)

So it's not really a structure that gives top N items for a given Item.  Is there a way to
get Taste to do that?  If my app has to calculate how related 2 items are, then is there
a need for Taste in this purely item-item scenario?
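For reference, the "top N items for a given Item" structure I am asking about could be sketched as below if the app had to build it itself from precomputed pair scores. The names `TopCorrelatedItems`, `PairScore`, and `topN` are hypothetical and not taken from Taste; this is only meant to show the shape of the operation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Taste API: picks the N items most correlated with one item.
class TopCorrelatedItems {

    // One precomputed pair score; field names are made up for this sketch.
    static final class PairScore {
        final String item1;
        final String item2;
        final double score;

        PairScore(String item1, String item2, double score) {
            this.item1 = item1;
            this.item2 = item2;
            this.score = score;
        }
    }

    // Returns up to n items paired with the given item, highest score first.
    static List<String> topN(String item, int n, List<PairScore> scores) {
        List<PairScore> matches = new ArrayList<>();
        for (PairScore s : scores) {
            boolean involvesItem = s.item1.equals(item) || s.item2.equals(item);
            boolean selfPair = s.item1.equals(s.item2);
            if (involvesItem && !selfPair) {
                matches.add(s);
            }
        }
        // Sort descending by score; the sort is stable, so ties keep input order.
        matches.sort((a, b) -> Double.compare(b.score, a.score));

        List<String> result = new ArrayList<>();
        for (PairScore s : matches) {
            if (result.size() >= n) {
                break;
            }
            // Report the other member of the pair.
            result.add(s.item1.equals(item) ? s.item2 : s.item1);
        }
        return result;
    }
}
```

A linear scan over all pairs is fine for a sketch; with many items one would presumably index the scores by item first.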

Thanks,
Otis
P.S.
A few
The doc on the site mentions ItemCorrelation, but there is no such class.  There may be other
missing classes, I didn't check closely.
I saw this in the PearsonCorrelationSimilarity:

 * <p><code>sumXY / sqrt(sumX2 * sumY2)</code></p>
 *
 * <p>where <code>size</code> is the number of {@link Item}s in the {@link DataModel}.</p>

Is that really "size"?  Or should it be sumSomething?
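For comparison, this is how I would expect the quoted formula to be computed if sumXY, sumX2 and sumY2 are sums over mean-centered values, which is the standard Pearson correlation. That reading is my assumption, not something I verified against the Taste source, and `PearsonSketch` is a made-up class. In this form the number of values only enters through the means, not through the final ratio.

```java
// Hypothetical sketch of standard Pearson correlation; not the Taste implementation.
class PearsonSketch {

    // Computes sumXY / sqrt(sumX2 * sumY2) over mean-centered values.
    // Returns NaN when either series has zero variance.
    static double correlation(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal length");
        }
        int n = x.length;

        // The count n is used only here, to compute the means.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sumXY = 0.0;
        double sumX2 = 0.0;
        double sumY2 = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sumXY += dx * dy;
            sumX2 += dx * dx;
            sumY2 += dy * dy;
        }

        double denominator = Math.sqrt(sumX2 * sumY2);
        return denominator == 0.0 ? Double.NaN : sumXY / denominator;
    }
}
```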
