mahout-user mailing list archives

From Davide Pozza <davide.po...@gmail.com>
Subject difference between precomputed and on-the-fly processed data
Date Thu, 20 Sep 2012 13:04:47 GMT
Hello

I'm trying to understand how to develop an item-based recommendation module
for an e-commerce website.

Here's my input data.csv file format:

USER_ID,ITEM_ID

(The data comes from the order history, so I have no ratings to use.)

If I understand the documentation correctly, the following two implementations
should be equivalent (the first one just uses the precomputed similarities),
but they return different results.
Could anyone help me understand why?

FIRST IMPLEMENTATION
====================
// data.csv format: user_id,item_id
DataModel dataModel = new FileDataModel(new File("data.csv"));

// precomputed data generated by ItemSimilarityJob with SIMILARITY_LOGLIKELIHOOD
ItemSimilarity similarity =
    new FileItemSimilarity(new File("precomputed_data"));

GenericItemBasedRecommender recommender =
    new GenericItemBasedRecommender(dataModel, similarity);

long userId = 8500003;
List<RecommendedItem> recommendations =
    recommender.recommend(userId, 5);
for (RecommendedItem recommendation : recommendations) {
    System.out.println(recommendation);
}

==RESULT==
RecommendedItem[item:1653, value:1.0]
RecommendedItem[item:14, value:1.0]
RecommendedItem[item:1592, value:1.0]
RecommendedItem[item:25, value:1.0]
RecommendedItem[item:43, value:1.0]

SECOND IMPLEMENTATION
======================
// data.csv format: user_id,item_id
DataModel dataModel = new FileDataModel(new File("data.csv"));

// similarities computed on the fly from the data model
ItemSimilarity similarity = new LogLikelihoodSimilarity(dataModel);

GenericItemBasedRecommender recommender =
    new GenericItemBasedRecommender(dataModel, similarity);

long userId = 8500003;
List<RecommendedItem> recommendations =
    recommender.recommend(userId, 5);
for (RecommendedItem recommendation : recommendations) {
    System.out.println(recommendation);
}

==RESULT==
RecommendedItem[item:28, value:1.0]
RecommendedItem[item:14, value:1.0]
RecommendedItem[item:20, value:1.0]
RecommendedItem[item:21, value:1.0]
RecommendedItem[item:25, value:1.0]

-- 
Davide Pozza
