mahout-user mailing list archives

From Serega Sheypak <serega.shey...@gmail.com>
Subject recommenditembased returns 0 records from last map-reduce job
Date Sun, 20 Jul 2014 18:57:35 GMT
Hi, I'm trying to compute item similarity.
I collect the items that users visit while shopping and build an input file:
user_id, item_id, weight (the weight is one of [1.0, 1.6, 1.9], depending on
the user action type and data source)
UNION
-item_id, item_id, 1.0 (one row per item, from the items dictionary)

I also provide a usersFile in which user_id = -item_id.

The idea is to get item similarity. If a user visits an item named "A", I want
to show them items "B", "c", "xxx" based on the preferences of other users.
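
To make the input layout concrete, here is a minimal sketch of how the two files could be put together. The function names and the sample IDs are made up for illustration; the only things taken from my setup are the row shapes (user_id,item_id,weight plus one -item_id,item_id,1.0 row per item) and the usersFile holding the -item_id values.

```python
def build_input_rows(visits, item_ids):
    """Return the lines of the combined input file.

    visits   -- iterable of (user_id, item_id, weight) tuples from real users
    item_ids -- iterable of integer item IDs from the items dictionary
    """
    # Real user preferences: user_id,item_id,weight
    rows = [f"{user_id},{item_id},{weight}" for user_id, item_id, weight in visits]
    # One pseudo-user per item: -item_id,item_id,1.0
    rows += [f"{-item_id},{item_id},1.0" for item_id in item_ids]
    return rows


def build_users_file_rows(item_ids):
    """Return the lines of the usersFile: one pseudo-user ID (-item_id) per item."""
    return [str(-item_id) for item_id in item_ids]


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real dataset.
    for line in build_input_rows([(7, 42, 1.6)], [42, 43]):
        print(line)
    for line in build_users_file_rows([42, 43]):
        print(line)
```

Note that in this layout every pseudo-user has exactly one preference, the item it stands for.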

The problem is that what appears to be the last MapReduce job returns 0 rows.

Here are my settings:


sudo -u oozie mahout recommenditembased \
                    --input visited_items_with_inverted_items \
                    --output result \
                    --similarityClassname SIMILARITY_LOGLIKELIHOOD \
                    --usersFile inverted_items \
                    --numRecommendations 500 \
                    --booleanData false \
                    --maxPrefsPerUser 100 \
                    --maxSimilaritiesPerItem 500 \
                    --minPrefsPerUser 0 \
                    --maxPrefsPerUserInItemSimilarity 30 \
                    --threshold 0.91 \
                    --tempDir temp

Here are some of the counters. I don't understand what they mean:

14/07/20 22:43:08 INFO mapred.JobClient:   org.apache.mahout.cf.taste.hadoop.item.ToUserVectorsReducer$Counters

14/07/20 22:43:08 INFO mapred.JobClient:     USERS=7528530

14/07/20 22:43:43 INFO mapred.JobClient:   org.apache.mahout.cf.taste.hadoop.preparation.ToItemVectorsMapper$Elements

14/07/20 22:43:43 INFO mapred.JobClient:     USER_RATINGS_NEGLECTED=1,798,738

14/07/20 22:43:43 INFO mapred.JobClient:     USER_RATINGS_USED=12,429,693


14/07/20 22:44:24 INFO mapred.JobClient:   org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters

14/07/20 22:44:24 INFO mapred.JobClient:     ROWS=3312879

14/07/20 22:45:18 INFO mapred.JobClient:   org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters

14/07/20 22:45:18 INFO mapred.JobClient:     COOCCURRENCES=35882374

14/07/20 22:45:18 INFO mapred.JobClient:     PRUNED_COOCCURRENCES=0

14/07/20 22:46:00 INFO mapred.JobClient:     Map input records=3312879

14/07/20 22:46:00 INFO mapred.JobClient:     Map output records=17570268

14/07/20 22:46:00 INFO mapred.JobClient:     Reduce input records=5221907

14/07/20 22:46:00 INFO mapred.JobClient:     Reduce output records=3312879


14/07/20 22:46:34 INFO mapred.JobClient:     Reduce input records=3312879

14/07/20 22:46:34 INFO mapred.JobClient:     Reduce output records=3312879

14/07/20 22:46:34 INFO mapred.JobClient:     Reduce input records=3312879

14/07/20 22:46:34 INFO mapred.JobClient:     Reduce output records=3312879

14/07/20 22:47:06 INFO mapred.JobClient:     Map input records=7528530

14/07/20 22:47:06 INFO mapred.JobClient:     Map output records=3313251

14/07/20 22:47:06 INFO mapred.JobClient:     Reduce input records=3313251

14/07/20 22:47:06 INFO mapred.JobClient:     Reduce output records=3313251

14/07/20 22:47:40 INFO mapred.JobClient:     Map input records=6626130

14/07/20 22:47:40 INFO mapred.JobClient:     Map output records=6626130

14/07/20 22:47:40 INFO mapred.JobClient:     Reduce input records=6626130

14/07/20 22:47:40 INFO mapred.JobClient:     Reduce output records=3312879


14/07/20 22:48:26 INFO mapred.JobClient:     Map input records=3312879

14/07/20 22:48:26 INFO mapred.JobClient:     Map output records=3313251

14/07/20 22:48:26 INFO mapred.JobClient:     Reduce input records=3313251

--------
14/07/20 22:48:26 INFO mapred.JobClient:     Reduce output records=0
--------

Why does the last job output 0 records?
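
In case it helps anyone compare the jobs, this is a rough sketch of a script that pulls the name=value counters out of JobClient output like the above and reports the timestamps where "Reduce output records" is 0. The regular expression is only an assumption based on the log lines pasted here; it skips any counter whose name and value are not on the same line as the timestamp.

```python
import re

# Matches lines such as:
#   14/07/20 22:48:26 INFO mapred.JobClient:     Reduce output records=0
COUNTER_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO mapred\.JobClient:\s+"
    r"([A-Za-z_ ]+?)=([\d,]+)\s*$"
)


def parse_counters(log_text):
    """Return a list of (timestamp, counter_name, value) tuples."""
    counters = []
    for line in log_text.splitlines():
        match = COUNTER_RE.match(line.strip())
        if match:
            timestamp, name, value = match.groups()
            # Some counters are printed with thousands separators.
            counters.append((timestamp, name, int(value.replace(",", ""))))
    return counters


def zero_reduce_outputs(counters):
    """Return the timestamps of log lines reporting 0 reduce output records."""
    return [
        timestamp
        for timestamp, name, value in counters
        if name == "Reduce output records" and value == 0
    ]


if __name__ == "__main__":
    sample = (
        "14/07/20 22:48:26 INFO mapred.JobClient:     Reduce input records=3313251\n"
        "14/07/20 22:48:26 INFO mapred.JobClient:     Reduce output records=0\n"
    )
    print(zero_reduce_outputs(parse_counters(sample)))
```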
