mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Need to reduce execution time of RowSimilarityJob
Date Mon, 01 Oct 2012 09:28:33 GMT
Similar items, right? You should look at the vectors that have 1.0
similarity and check whether they are in fact collinear. That is still by
far the most likely explanation. Remember that the vector similarity is
computed only over elements that exist in both vectors. Two vectors just
need 2 co-occurring elements with identical values for this to happen.
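
To make the point above concrete, here is a minimal sketch of a cosine
similarity restricted to the elements present in both vectors. It is not
Mahout's code: the function name cosine_on_shared, the dict-based sparse
vectors, and the sample ratings are all hypothetical and only illustrate
the behavior described in this message.

```python
import math

def cosine_on_shared(a, b):
    """Cosine similarity over the keys present in both sparse vectors.

    Hypothetical helper for illustration; vectors are dicts of key -> value.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(a[k] ** 2 for k in shared))
    norm_b = math.sqrt(sum(b[k] ** 2 for k in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up item vectors keyed by user id.
item_a = {"u1": 4.0, "u2": 4.0, "u3": 1.0}
item_b = {"u1": 2.0, "u2": 2.0, "u4": 5.0}  # overlaps item_a on u1, u2 only
item_c = {"u1": 4.0, "u2": 1.0}             # overlap is not proportional

sim_ab = cosine_on_shared(item_a, item_b)
sim_ac = cosine_on_shared(item_a, item_c)
print(sim_ab, sim_ac)
```

On the overlap (u1, u2), item_a and item_b are scalar multiples of each
other, so their similarity comes out at the maximum even though the full
vectors differ elsewhere. item_c is not proportional to item_a on the
overlap, so it scores lower.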

On Mon, Oct 1, 2012 at 10:25 AM, yamo93 <yamo93@gmail.com> wrote:
> For each item, I have 10 recommended items with a value of 1.0.
> It sounds like a bug somewhere.
>
>
> On 10/01/2012 11:06 AM, Sean Owen wrote:
>>
>> It's possible this is correct. 1.0 is the maximum similarity and
>> occurs when two vectors are just scalar multiples of each other (0
>> angle between them). It's possible there are several of these, and so
>> their 1.0 similarities dominate the result.
>>
>> On Mon, Oct 1, 2012 at 10:03 AM, yamo93 <yamo93@gmail.com> wrote:
>>>
>>> I saw something strange: all recommended items returned by
>>> mostSimilarItems() have a value of 1.0.
>>> Is that normal?
