mahout-user mailing list archives

From Gruszowska Natalia <Natalia.Gruszow...@grupaonet.pl>
Subject Collaborative filtering item-based in mahout - without isolating users
Date Wed, 10 Dec 2014 15:40:43 GMT
Hi All,

Mahout implements item-based collaborative filtering in a job called itemsimilarity,
which returns the "similarity" between each pair of items.
In theory, the similarity between two items should be calculated only over the users who rated
both items. During testing I found that Mahout works differently.
Two examples follow.

Example 1. Items 11 and 12.
In the example below, the similarity between items 11 and 12 should equal 1, but Mahout outputs
0.36. It looks like Mahout treats null as 0.
Similarity between items:
11      12      0.36602540378443865

Matrix with preferences:
user        11       12
1                     1
2                     1
3           1         1
4                     1
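
The reported value is numerically consistent with a similarity of the form 1 / (1 + d), where d is the Euclidean distance between the two item columns after missing ratings are filled with 0. Which measure Mahout actually applied in this run is an assumption here; the sketch below only reproduces the arithmetic for Example 1 and contrasts it with the co-rated-only calculation. The names `prefs`, `sim_zero_filled` and `sim_co_rated` are illustrative and are not Mahout APIs.

```python
import math

# Example 1 preferences as item -> {user: rating}; users not listed gave no rating.
prefs = {
    11: {3: 1.0},
    12: {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0},
}


def sim_zero_filled(a, b):
    """1 / (1 + Euclidean distance) over all users, missing ratings counted as 0."""
    users = set(a) | set(b)
    d = math.sqrt(sum((a.get(u, 0.0) - b.get(u, 0.0)) ** 2 for u in users))
    return 1.0 / (1.0 + d)


def sim_co_rated(a, b):
    """1 / (1 + Euclidean distance) over users who rated both items only."""
    users = set(a) & set(b)
    if not users:
        return None  # no overlap: similarity is undefined
    d = math.sqrt(sum((a[u] - b[u]) ** 2 for u in users))
    return 1.0 / (1.0 + d)


print(sim_zero_filled(prefs[11], prefs[12]))  # roughly 0.366, as in the output above
print(sim_co_rated(prefs[11], prefs[12]))     # 1.0, the value expected from theory
```

With zero filling, users 1, 2 and 4 each contribute a squared difference of 1, so d = sqrt(3); restricted to user 3, the only co-rating user, d = 0.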

Example 2. Items 101 to 103.
According to the theory, the similarity between items 101 and 102 should be calculated using
only the ratings from users 4 and 5, and the same holds for items 101 and 103. In Mahout's
output, (101,103) is more similar than (101,102), which it should not be.
Similarity between items:
101     102     0.2612038749637414
101     103     0.4340578302732228
102     103     0.2600070276638468

Matrix with preferences:
user        101      102        103
1                     1         0.1
2                     1         0.1
3                     1         0.1
4           1         1         0.1
5           1         1         0.1
6                     1         0.1
7                     1         0.1
8                     1         0.1
9                     1         0.1
10                    1         0.1
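
The three values reported for Example 2 can likewise be reproduced with 1 / (1 + Euclidean distance) when missing ratings are filled with 0. This is a sketch under that assumption, not a statement of what Mahout does internally; `euclidean_similarity` and its `co_rated_only` flag are hypothetical names used only for illustration. Restricting the calculation to co-rating users reverses the ordering of (101,102) and (101,103).

```python
import math

users = range(1, 11)
# Example 2 preferences as item -> {user: rating}; item 101 is rated by users 4 and 5 only.
prefs = {
    101: {4: 1.0, 5: 1.0},
    102: {u: 1.0 for u in users},
    103: {u: 0.1 for u in users},
}


def euclidean_similarity(a, b, co_rated_only=False):
    """Return 1 / (1 + Euclidean distance) between two item rating dicts.

    co_rated_only=False counts a missing rating as 0 (union of users);
    co_rated_only=True uses only users who rated both items.
    """
    keys = (set(a) & set(b)) if co_rated_only else (set(a) | set(b))
    if not keys:
        return None  # no common users: similarity is undefined
    d = math.sqrt(sum((a.get(u, 0.0) - b.get(u, 0.0)) ** 2 for u in keys))
    return 1.0 / (1.0 + d)


for i, j in [(101, 102), (101, 103), (102, 103)]:
    zero_filled = euclidean_similarity(prefs[i], prefs[j])
    co_rated = euclidean_similarity(prefs[i], prefs[j], co_rated_only=True)
    print(i, j, zero_filled, co_rated)
```

Under zero filling, the eight users who did not rate item 101 each add 1 to the squared distance from item 102 but only 0.01 to the squared distance from item 103, which is why (101,103) comes out closer than (101,102).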


Both examples were run without any additional parameters.
Has this problem already been solved somewhere? Any ideas? Why is null treated as 0?
Source: http://files.grouplens.org/papers/www10_sarwar.pdf



Kind regards,
Natalia Gruszowska


