mahout-user mailing list archives

From Emilio Suarez <Emilio.Sua...@intela.com>
Subject Re: Recommender with ratings takes a long time to process
Date Fri, 11 May 2012 17:47:54 GMT
Thanks Sean,

So, do you suggest something like this?

        LogLikelihoodSimilarity similarity = new LogLikelihoodSimilarity(fileDataModel);
        PreferredItemsNeighborhoodCandidateItemsStrategy candidateStrategy = new PreferredItemsNeighborhoodCandidateItemsStrategy();
        recommender = new GenericItemBasedRecommender(fileDataModel, similarity, candidateStrategy,
                candidateStrategy);

or this?

        LogLikelihoodSimilarity similarity = new LogLikelihoodSimilarity(fileDataModel);
        SamplingCandidateItemsStrategy candidateStrategy = new SamplingCandidateItemsStrategy();
        recommender = new GenericItemBasedRecommender(fileDataModel, similarity, candidateStrategy,
                candidateStrategy);


-emilio

You need to apply a CandidateItemsStrategy to reduce the number of
elements you consider, or else it will take a very long time because
almost the entire model is a candidate for recommendation.
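
To make the point above concrete, here is a minimal plain-Java sketch that does not use Mahout at all. It only illustrates the general idea of narrowing the candidate set: keep items preferred by users who share at least one item with the target user, and drop everything else. The class name CandidateSketch, the method neighborhoodCandidates and the in-memory map are hypothetical, and the rule is an approximation of a neighborhood-style strategy, not Mahout's actual implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class CandidateSketch {

    // prefs maps a user ID to the set of item IDs that user has expressed a preference for.
    static Set<Long> neighborhoodCandidates(Map<Long, Set<Long>> prefs, long userId) {
        Set<Long> own = prefs.getOrDefault(userId, Collections.<Long>emptySet());
        Set<Long> candidates = new TreeSet<>();
        for (Map.Entry<Long, Set<Long>> entry : prefs.entrySet()) {
            if (entry.getKey() == userId) {
                continue; // skip the target user
            }
            if (Collections.disjoint(own, entry.getValue())) {
                continue; // no shared item, so this user contributes no candidates
            }
            candidates.addAll(entry.getValue());
        }
        candidates.removeAll(own); // never propose items the user already has
        return candidates;
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> prefs = new HashMap<>();
        prefs.put(1L, new HashSet<>(Arrays.asList(10L, 11L)));
        prefs.put(2L, new HashSet<>(Arrays.asList(11L, 12L)));
        prefs.put(3L, new HashSet<>(Arrays.asList(20L, 21L)));
        // User 3 shares nothing with user 1, so items 20 and 21 are never scored.
        System.out.println(neighborhoodCandidates(prefs, 1L));
    }
}
```

The fewer candidates survive this step, the fewer similarity lookups and preference estimates the recommender has to make per request.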

On Fri, May 11, 2012 at 6:18 PM, Emilio Suarez <Emilio.Suarez@intela.com> wrote:
Hi there,

The usual setting for the Mahout recommendation input file is:
user, item, rating

Now, for the purposes of my application, what I really wanted was a recommendation of users
for a specific item, so my input files are:
item, user, rating
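
As an illustration of the column swap described above, a small Java sketch follows. The class name SwapColumns and the method swap are hypothetical, and it assumes simple comma-separated lines with no quoting or embedded commas.

```java
public class SwapColumns {

    // Turns "user,item,rating" into "item,user,rating"; any trailing fields are kept as-is.
    static String swap(String line) {
        String[] fields = line.split(",", 3);
        if (fields.length < 2) {
            return line; // nothing to swap
        }
        StringBuilder out = new StringBuilder();
        out.append(fields[1].trim()).append(',').append(fields[0].trim());
        if (fields.length == 3) {
            out.append(',').append(fields[2].trim());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(swap("1001, 24441, 3"));
    }
}
```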

My input CSV file contains the following stats:

model file: 560,901 records
item "24441": 31,585 records
rating contains one of 3 values: 1, 2 or 3

When I ask for a recommendation of users for item "24441", these are the results:

total recommended "users": 50,162
Elapsed time: 3h 13m

As you can see... this is a very long time processing... and this all started when I added
"ratings" to the input files.
Before I was using the recommender with GenericBooleanPrefItemBasedRecommender, and the process
would run in minutes.
Now with the ratings, I am using the following:

       LogLikelihoodSimilarity similarity = new LogLikelihoodSimilarity(fileDataModel);
       AllSimilarItemsCandidateItemsStrategy candidateStrategy = new AllSimilarItemsCandidateItemsStrategy(similarity);
       recommender = new GenericItemBasedRecommender(fileDataModel, similarity, candidateStrategy,
               candidateStrategy);

I have another input file with the following stats:

model file: 276,543 records
item "11205": 5,968 records
rating contains one of 3 values: 1, 2 or 3

and when I ask for a recommendation of users for item "11205", these are the results:

total recommended "users": 26,083
Elapsed time: 23m

As you can see, the difference in size is just 2x, but the time difference is 8x!
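
The ratios can be checked from the figures quoted above with a few lines of Java. The class name ScalingCheck and the method ratio are made up for illustration. The model files differ by about 2.0x and the elapsed times by about 8.4x, while the record counts for the two queried items differ by about 5.3x. That last ratio may be part of the explanation, though nothing here proves it.

```java
public class ScalingCheck {

    static double ratio(double larger, double smaller) {
        return larger / smaller;
    }

    public static void main(String[] args) {
        // All figures are taken from the two runs described in this message.
        double modelRatio = ratio(560_901, 276_543);   // total records in each model file
        double itemRatio = ratio(31_585, 5_968);       // records for item 24441 vs item 11205
        double timeRatio = ratio(3 * 60 + 13, 23);     // elapsed minutes: 3h 13m vs 23m

        System.out.printf("model size ratio:   %.2f%n", modelRatio);
        System.out.printf("item records ratio: %.2f%n", itemRatio);
        System.out.printf("elapsed time ratio: %.2f%n", timeRatio);
    }
}
```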

Is this the expected behavior for the recommender to take this long?
Is there anything I can do to speed up the process?

Thanks

-emilio

