mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Recommender with ratings takes a long time to process
Date Fri, 11 May 2012 17:20:33 GMT
You need to apply a CandidateItemsStrategy that reduces the number of
items you consider, or else it will take a very long time because
almost the entire model is a candidate for recommendation.
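
To illustrate the idea, here is a minimal, library-independent sketch in plain Java. It does not use Mahout's classes or reproduce its API; the class name CandidatePruning, both method names, and the maxCandidates and seed parameters are hypothetical, chosen only for this example. It contrasts treating every unrated item as a candidate with keeping only items co-rated by users who share at least one item with the target, capped by random sampling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

// Hypothetical illustration only; not part of Mahout.
public class CandidatePruning {

    /** Every item the user has not rated: the "almost the entire model" case. */
    public static Set<Long> allUnratedItems(long userId, Map<Long, Set<Long>> itemsByUser) {
        Set<Long> own = itemsByUser.getOrDefault(userId, Collections.emptySet());
        Set<Long> candidates = new HashSet<>();
        for (Set<Long> items : itemsByUser.values()) {
            candidates.addAll(items);
        }
        candidates.removeAll(own);
        return candidates;
    }

    /**
     * Only items rated by users who share at least one item with the target user,
     * reduced to at most maxCandidates by random sampling (an assumed policy).
     */
    public static Set<Long> prunedCandidates(long userId, Map<Long, Set<Long>> itemsByUser,
                                             int maxCandidates, long seed) {
        Set<Long> own = itemsByUser.getOrDefault(userId, Collections.emptySet());
        Set<Long> candidates = new HashSet<>();
        for (Map.Entry<Long, Set<Long>> entry : itemsByUser.entrySet()) {
            if (entry.getKey() == userId) {
                continue; // skip the target user
            }
            if (Collections.disjoint(own, entry.getValue())) {
                continue; // no shared item, so this user contributes no candidates
            }
            candidates.addAll(entry.getValue());
        }
        candidates.removeAll(own);
        if (candidates.size() <= maxCandidates) {
            return candidates;
        }
        // Too many candidates: keep a random sample of the requested size.
        List<Long> shuffled = new ArrayList<>(candidates);
        Collections.shuffle(shuffled, new Random(seed));
        return new HashSet<>(shuffled.subList(0, maxCandidates));
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> itemsByUser = new HashMap<>();
        itemsByUser.put(1L, new HashSet<>(Arrays.asList(10L, 11L)));
        itemsByUser.put(2L, new HashSet<>(Arrays.asList(11L, 12L, 13L)));
        itemsByUser.put(3L, new HashSet<>(Arrays.asList(20L, 21L)));

        System.out.println("all unrated: " + allUnratedItems(1L, itemsByUser).size());
        System.out.println("pruned:      " + prunedCandidates(1L, itemsByUser, 10, 42L).size());
    }
}
```

The work done per recommendation grows with the number of candidates that must be scored, so capping that set bounds the cost, at the price of possibly missing some items.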

On Fri, May 11, 2012 at 6:18 PM, Emilio Suarez <Emilio.Suarez@intela.com> wrote:
> Hi there,
>
> The usual setting for the Mahout recommendation input file is:
> user, item, rating
>
> Now, for the purposes of my application, what I really wanted was a recommendation of users for a specific item, so my input files are:
> item, user, rating
>
> My input CSV file contains the following stats:
>
> model file: 560,901 records
> item "24441": 31,585 records
> rating contains one of 3 values: 1, 2 or 3
>
> When I ask for a recommendation of users for item "24441", these are the results:
>
> total recommended "users": 50,162
> Elapsed time: 3h 13m
>
> As you can see, this is a very long processing time, and it all started when I added "ratings" to the input files.
> Before that, I was using GenericBooleanPrefItemBasedRecommender, and the process would run in minutes.
> Now with the ratings, I am using the following:
>
>        LogLikelihoodSimilarity similarity = new LogLikelihoodSimilarity(fileDataModel);
>        AllSimilarItemsCandidateItemsStrategy candidateStrategy = new AllSimilarItemsCandidateItemsStrategy(similarity);
>        recommender = new GenericItemBasedRecommender(fileDataModel, similarity, candidateStrategy, candidateStrategy);
>
> I have another input file with the following stats:
>
> model file: 276,543 records
> item "11205": 5,968 records
> rating contains one of 3 values: 1, 2 or 3
>
> and when I ask for a recommendation of users for item "11205", these are the results:
>
> total recommended "users": 26,083
> Elapsed time: 23m
>
> As you can see, the difference in size is just 2x, but the time difference is 8x!
>
> Is this the expected behavior for the recommender to take this long?
> Is there anything I can do to speed up the process?
>
> Thanks
>
> -emilio
