mahout-user mailing list archives

From "Sean Owen" <>
Subject Re: Recommending when working with binary data sets
Date Tue, 30 Sep 2008 17:28:54 GMT
Agree. I have not seen this patent. I base my work on public research
and academic papers. I have no reason to believe there are any patent
issues here.

On 9/30/08, Grant Ingersoll <> wrote:
> I adhere to Doug's philosophy on patents, and refuse to look at them
> or do searches for them.  I am not a lawyer, nor a judge, nor a patent
> officer, and am thus completely unqualified to even begin to venture
> an opinion.
> Sorry,
> Grant
> On Sep 30, 2008, at 1:09 PM, Otis Gospodnetic wrote:
>> Hello,
>> Thanks for the pointers, Grant.  Regarding that Amazon item-item
>> recommendation.  It looks like that's patented:
>> Does that mean one cannot implement this in Taste (or any other
>> piece of software)?  Even if used for non-shopping purposes?
>> Thanks,
>> Otis
>> ----- Original Message ----
>>> From: Grant Ingersoll <>
>>> To:
>>> Sent: Monday, September 29, 2008 9:43:58 AM
>>> Subject: Re: Recommending when working with binary data sets
>>> Not sure I know the answer in terms of Taste, but did a little bit of
>>> digging (mind you, I'm no CF expert, but I'm learning thanks to Taste
>>> and Sean).
>>> At any rate, came across:
>>> Started at Wikipedia's page:
>>> which led to, which then has
>>> an interesting comment about Amazon's item-item approach, which, via
>>> Google Scholar leads to:
>>> In particular, see the "How it Works" section.  Essentially, it
>>> describes how they build the item to item similarity matrix, which I
>>> believe is also what you need to do.
>>> HTH,
>>> Grant
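The item-to-item similarity matrix Grant mentions can be sketched for seen/not-seen data roughly as follows. This is a minimal illustration, not the method from the paper he refers to and not Taste's implementation: it assumes the Tanimoto coefficient as the similarity measure and a simple "sum of similarities to items already seen" score, and the names `build_item_similarity`, `recommend_item_based`, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_item_similarity(seen):
    """Build an item-item Tanimoto similarity table from user -> set of seen items."""
    # Invert the data: item -> set of users who have seen it.
    users_of = defaultdict(set)
    for user, items in seen.items():
        for item in items:
            users_of[item].add(user)

    # Count co-occurrences, visiting only item pairs that share at least one user.
    co_counts = defaultdict(int)
    for items in seen.values():
        for i, j in combinations(sorted(items), 2):
            co_counts[(i, j)] += 1

    # Tanimoto: users who saw both / users who saw either.
    sim = defaultdict(dict)
    for (i, j), both in co_counts.items():
        either = len(users_of[i]) + len(users_of[j]) - both
        sim[i][j] = sim[j][i] = both / either
    return sim


def recommend_item_based(user, seen, sim, n_items=3):
    """Score each unseen item by summing its similarity to the user's seen items."""
    scores = defaultdict(float)
    for item in seen[user]:
        for other, s in sim[item].items():
            if other not in seen[user]:
                scores[other] += s
    # Highest score first; ties broken by item ID so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n_items]]


# Hypothetical seen/not-seen data.
seen = {
    "U1": {"a", "b", "c"},
    "U2": {"a", "b", "c", "d"},
    "U3": {"a", "b", "e"},
    "U4": {"x", "y"},
}
sim = build_item_similarity(seen)
print(recommend_item_based("U1", seen, sim))
```

In this sketch the similarity table is computed offline over items, so a recommendation for one user only needs lookups for the items that user has seen, with no user-user comparison at request time.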
>>> On Sep 26, 2008, at 1:52 PM, Otis Gospodnetic wrote:
>>>> Hi,
>>>> I've been reading the chapter on recommendations in Programming
>>>> Collective Intelligence and looking at Taste.  The examples in PCI
>>>> all assume scenarios where items to recommend have been rated by
>>>> users on some scale.  I understand how items can be recommended to
>>>> users using item-based filtering and user-item ratings and why this
>>>> is preferred over user-based filtering when the number of users is
>>>> larger than the number of items.
>>>> But what if all I've got is item-item similarity (content-based) and
>>>> there are no user-item ratings?  Say I have a situation where people
>>>> simply either consume content (e.g. read an article, watch a
>>>> video...) or not consume it (don't read an article, don't watch the
>>>> video...).  In other words, I really have only yes/no or 1/0 or
>>>> seen/not seen type "rating".
>>>> I can't really use Euclidean distance or Pearson correlation
>>>> coefficient, can I?
>>>> What do people use in such scenarios?  Would it make sense to use
>>>> for such cases?
>>>> ... Ah, I do see javadoc in TanimotoCoefficientSimilarity saying
>>>> exactly that, good.
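For seen/not-seen data, the Jaccard/Tanimoto coefficient is just the size of the intersection of two sets divided by the size of their union. A minimal sketch follows; it is not the code of `TanimotoCoefficientSimilarity`, the function name and sample data are hypothetical, and returning 0.0 for two empty sets is an assumption made here.

```python
def tanimoto(a, b):
    """Jaccard/Tanimoto coefficient of two sets: |A & B| / |A | B|."""
    union = len(a | b)
    if union == 0:
        return 0.0  # assumption: two empty sets are treated as having no similarity
    return len(a & b) / union


# Hypothetical data: the set of item IDs each user has seen.
alice = {"a1", "a2", "a3"}
bob = {"a2", "a3", "a4"}
print(tanimoto(alice, bob))  # 2 shared items out of 4 distinct items
```

The same function works whether the sets are "items seen by a user" (user-user similarity) or "users who saw an item" (item-item similarity).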
>>>> But then my question is:
>>>> Doesn't the use of Jaccard/Tanimoto mean going back to the expensive
>>>> user-user similarity computation?
>>>> That is, if I need to recommend items for user U1 don't I need to:
>>>> 1) have user-user similarity pre-computed (and recomputed
>>>> periodically)
>>>> 2) find top N users U{2,3,4,...N} who are the most similar to U1
>>>> 3) then for these top N most similar users find their "seen" items
>>>> that U1 has not seen (possibly limit this to only recently seen
>>>> items)
>>>> 4) select top N items from 3) and recommend those to U1.
>>>> If so, then 1) is again expensive.
>>>> And how would one go about selecting top N items from the list
>>>> in this case other than ordering them by user-user similarity?
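Steps 1) to 4) above can be sketched as follows, again with the Tanimoto coefficient as an assumed similarity measure. One possible answer to the ranking question, shown here as an assumption rather than as what Taste does, is to score each candidate item by the summed similarity of the neighbors who have seen it, so that an item seen by several similar users outranks one seen by a single neighbor. The names `recommend_user_based`, `n_neighbors`, `n_items`, and the sample data are hypothetical.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Jaccard/Tanimoto coefficient of two sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recommend_user_based(target, seen, n_neighbors=2, n_items=3):
    target_items = seen[target]

    # Steps 1-2: compare the target with every other user, keep the most similar.
    # (Computed on the fly here; in practice this is the expensive part to precompute.)
    neighbors = sorted(
        ((tanimoto(target_items, items), user)
         for user, items in seen.items() if user != target),
        reverse=True,
    )[:n_neighbors]

    # Step 3: collect items the neighbors have seen and the target has not,
    # adding each neighbor's similarity to the item's score.
    scores = defaultdict(float)
    for similarity, user in neighbors:
        for item in seen[user] - target_items:
            scores[item] += similarity

    # Step 4: highest score first; ties broken by item ID so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n_items]]


# Hypothetical seen/not-seen data.
seen = {
    "U1": {"a", "b", "c"},
    "U2": {"a", "b", "c", "d"},
    "U3": {"a", "b", "e"},
    "U4": {"x", "y"},
}
print(recommend_user_based("U1", seen))
```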
>>>> Of course, something is telling me I'm demonstrating that I don't
>>>> yet have the full grasp of item-based filtering.  I hope that's the
>>>> case! :)
>>>> Thanks,
>>>> Otis
