mahout-user mailing list archives

From "Sean Owen" <sro...@gmail.com>
Subject Re: Recommending when working with binary data sets
Date Tue, 30 Sep 2008 17:28:54 GMT
Agree. I have not seen this patent. I base my work on public research
and academic papers. I have no reason to believe there are any patent
issues here.

On 9/30/08, Grant Ingersoll <gsingers@apache.org> wrote:
> I adhere to Doug's philosophy on patents, and refuse to look at them
> or do searches for them.  I am not a lawyer, nor a judge, nor a patent
> officer, and am thus completely unqualified to even begin to venture
> an opinion.
>
> Sorry,
> Grant
>
>
> On Sep 30, 2008, at 1:09 PM, Otis Gospodnetic wrote:
>
>> Hello,
>>
>> Thanks for the pointers, Grant.  Regarding that Amazon item-item
>> recommendation.  It looks like that's patented:
>>
>> http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=2&p=1&f=G&l=50&d=PTXT&S1=amazon.ASNM.&OS=AN/(amazon)&RS=AN/amazon
>>
>> Does that mean one cannot implement this in Taste (or any other
>> piece of software)?  Even if used for non-shopping purposes?
>>
>>
>> Thanks,
>> Otis
>>
>>
>> ----- Original Message ----
>>> From: Grant Ingersoll <gsingers@apache.org>
>>> To: mahout-user@lucene.apache.org
>>> Sent: Monday, September 29, 2008 9:43:58 AM
>>> Subject: Re: Recommending when working with binary data sets
>>>
>>> Not sure I know the answer in terms of Taste, but did a little bit of
>>> digging (mind you, I'm no CF expert, but I'm learning thanks to Taste
>>> and Sean).
>>>
>>> At any rate, I started at Wikipedia's page:
>>> http://en.wikipedia.org/wiki/Collaborative_filtering
>>>
>>> which led to http://en.wikipedia.org/wiki/Slope_One, which then has
>>> an interesting comment about Amazon's item-item approach, which, via
>>> Google Scholar leads to:
>>>
>>> http://dsonline.computer.org/portal/site/dsonline/menuitem.9ed3d9924aeb0dcd82ccc6716bbe36ec/index.jsp?&pName=dso_level1&path=dsonline/2003_Archives/0301/d&file=wp1lind.xml&xsl=article.xsl&;jsessionid=LghY1grHgYJpBTLpWjX5NtvQwhH1Bkv9rpfXT4VnpVtDNVpfZ8n0!-1404507079
>>>
>>> In particular, see the "How it Works" section.  Essentially, it
>>> describes how they build the item to item similarity matrix, which I
>>> believe is also what you need to do.
>>>
>>> HTH,
>>> Grant
>>>
>>> On Sep 26, 2008, at 1:52 PM, Otis Gospodnetic wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been reading the chapter on recommendations in Programming
>>>> Collective Intelligence and looking at Taste.  The examples in PCI
>>>> all assume scenarios where items to recommend have been rated by
>>>> users on some scale.  I understand how items can be recommended to
>>>> users using item-based filtering and user-item ratings and why this
>>>> is preferred over user-based filtering when the number of users is
>>>> larger than the number of items.
>>>> But what if all I've got is item-item similarity (content-based) and
>>>> there are no user-item ratings?  Say I have a situation where people
>>>> simply either consume content (e.g. read an article, watch a
>>>> video...) or not consume it (don't read an article, don't watch the
>>>> video...).  In other words, I really have only yes/no or 1/0 or
>>>> seen/not seen type "rating".
>>>>
>>>> I can't really use Euclidean distance or Pearson correlation
>>>> coefficient, can I?
>>>>
>>>> What do people use in such scenarios?  Would it make sense to use
>>>> http://en.wikipedia.org/wiki/Jaccard_index
>>>> for such cases?
>>>> ... Ah, I do see javadoc in TanimotoCoefficientSimilarity saying
>>>> exactly that, good.
>>>>
>>>> But then my question is:
>>>> Doesn't the use of Jaccard/Tanimoto mean going back to the expensive
>>>> user-user similarity computation?
>>>>
>>>> That is, if I need to recommend items for user U1 don't I need to:
>>>> 1) have user-user similarity pre-computed (and recomputed periodically)
>>>> 2) find top N users U{2,3,4,...N} who are the most similar to U1
>>>> 3) then for these top N most similar users find their "seen" items
>>>>    that U1 has not seen (possibly limit this to only recently seen items)
>>>> 4) select top N items from 3) and recommend those to U1.
>>>>
>>>> If so, then 1) is again expensive.
>>>> And how would one go about selecting top N items from the list
>>>> in this case other than ordering them by user-user similarity?
>>>>
>>>> Of course, something is telling me I'm demonstrating that I don't
>>>> yet have the full grasp of item-based filtering.  I hope that's the
>>>> case! :)
>>>>
>>>> Thanks,
>>>> Otis
>>>
>>>
>
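
As a rough illustration of the Tanimoto (Jaccard) coefficient raised in the
thread, here is a minimal Python sketch that builds item-item similarities
from seen/not-seen data alone, with no ratings. The sample data and the names
`seen`, `tanimoto`, and `item_similarities` are made up for illustration; this
is not Taste's API, and nothing here is claimed to match how Taste or Amazon
compute similarity.

```python
from itertools import combinations

# Hypothetical binary data: user -> set of items that user has seen.
seen = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"b", "c", "d"},
    "u4": {"d"},
}

def tanimoto(x, y):
    """Tanimoto (Jaccard) coefficient of two sets: |x & y| / |x | y|."""
    union = len(x | y)
    return len(x & y) / union if union else 0.0

def item_similarities(seen):
    """Invert user -> items into item -> users, then compare every item pair."""
    users_by_item = {}
    for user, items in seen.items():
        for item in items:
            users_by_item.setdefault(item, set()).add(user)
    sims = {}
    for i, j in combinations(sorted(users_by_item), 2):
        sims[(i, j)] = tanimoto(users_by_item[i], users_by_item[j])
    return sims

sims = item_similarities(seen)
```

Each item is represented by the set of users who consumed it, so two items
are similar when largely the same users saw both. Comparing every pair is
quadratic in the number of items, which is why such a table would normally be
computed offline.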
