mahout-user mailing list archives

From kostas_new <>
Subject How create a recommended system?
Date Fri, 21 Sep 2012 12:48:16 GMT

I am building a recommendation system as part of a course project, in
order to propose activities to a specific person.
I have installed Mahout and Hadoop for this purpose.

The attributes that play an important role in the recommendation system are
the following:
1) all the attributes of each person (e.g. age, gender, his/her
preferences for different types of activity)
2) The core activities (type of activities, target_group(numerical
The numeric attribute is not a problem because it is only a number. For the
non-numeric attributes, I would like to define a "distance function" between
the different activities. For example, the relationship between football and
basketball should be strong (a small distance), because both belong to the
sports category. By contrast, the distance between basketball and opera
should be larger. Besides giving each activity a single type, I would prefer
to characterize each activity by several types, for example
activity X -> an opera with an educational character. That makes each
activity a multi-dimensional set of nominal attributes.
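One possible way to express such a distance, sketched here under assumptions: treat each activity as a set of type labels and use the Jaccard distance between the sets. The labels and the `activity_types` table below are hypothetical and only illustrate the idea; this is plain Python, not a Mahout API.

```python
def jaccard_distance(types_a, types_b):
    """Return 1 - |A & B| / |A | B| for two sets of nominal type labels."""
    a, b = set(types_a), set(types_b)
    union = a | b
    if not union:
        return 0.0  # two activities without any type are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical type labels, for illustration only.
activity_types = {
    "football": {"sports", "team", "outdoor"},
    "basketball": {"sports", "team", "indoor"},
    "opera": {"culture", "education", "indoor"},
}

d_sport = jaccard_distance(activity_types["football"], activity_types["basketball"])
d_mixed = jaccard_distance(activity_types["basketball"], activity_types["opera"])
print(d_sport, d_mixed)
```

With these labels, football and basketball share two of four distinct types, while basketball and opera share one of five, so the second distance comes out larger.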


How would you recommend that I implement the recommendation process?
On the one hand, I am thinking of clustering the users. The clustering
must take into account, for example, the age (e.g. 33 years old) and the
preferences (for example, user 1023 prefers to go to B1, activity
type = B2). At that point, creating the vector is a headache for me,
because I have to measure the distance between the different activities
while also taking the user's preferences into account. (*Q2*)
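A minimal sketch of how such a user vector might be built, assuming the age is scaled to [0, 1] and each activity type contributes one preference dimension. The type list, the age bound, and the sample users are all hypothetical.

```python
import math

TYPE_ORDER = ["sports", "culture", "education"]  # hypothetical activity types
MAX_AGE = 100.0  # assumed upper bound used to scale age into [0, 1]


def user_vector(age, type_prefs):
    """Build [scaled_age, pref_sports, pref_culture, pref_education]."""
    scaled_age = min(age, MAX_AGE) / MAX_AGE
    return [scaled_age] + [type_prefs.get(t, 0.0) for t in TYPE_ORDER]


def euclidean(u, v):
    """Plain Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Hypothetical users: two sports fans of similar age and one older opera-goer.
u1 = user_vector(33, {"sports": 0.9, "culture": 0.1})
u2 = user_vector(35, {"sports": 0.8})
u3 = user_vector(60, {"culture": 0.9, "education": 0.7})
print(euclidean(u1, u2), euclidean(u1, u3))
```

Scaling every dimension to the same range keeps the age from dominating the distance; whether all dimensions should weigh equally is a design choice that depends on the data.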

On the other hand, because I know the exact number and features of the
activities, I don't think it is necessary to implement a clustering for

As the last step, I want to use collaborative filtering for my
recommendations. For example, the input table will have the following format:
User_id    activity_id    Preference (0..1)
100        500            0.5
200        300            0.9
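To make the format concrete, here is a small sketch of user-based collaborative filtering over (user, activity, preference) triples. The first two rows come from the table above; the remaining rows are invented so that users overlap. The function names are hypothetical and this is plain Python, not Mahout code.

```python
import math
from collections import defaultdict

# First two rows are from the table above; the rest are made-up sample data.
RAW = "100\t500\t0.5\n200\t300\t0.9\n100\t300\t0.8\n200\t400\t0.7\n300\t500\t0.6\n300\t400\t0.4\n"


def load_prefs(text):
    """Parse whitespace-separated triples into {user: {activity: preference}}."""
    prefs = defaultdict(dict)
    for line in text.splitlines():
        if not line.strip():
            continue
        user_id, activity_id, value = line.split()
        prefs[int(user_id)][int(activity_id)] = float(value)
    return prefs


def cosine(u, v):
    """Cosine similarity between two sparse preference dicts."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(prefs, user, activity):
    """Similarity-weighted average of other users' preferences, or None."""
    num = den = 0.0
    for other, ratings in prefs.items():
        if other == user or activity not in ratings:
            continue
        sim = cosine(prefs[user], ratings)
        num += sim * ratings[activity]
        den += abs(sim)
    return num / den if den else None


prefs = load_prefs(RAW)
estimate = predict(prefs, 100, 400)  # user 100 has not rated activity 400
print(estimate)
```

The estimate for user 100 and activity 400 is a weighted average of the preferences of users 200 and 300, so it falls between their two values.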

I know that I could use this table alone for my recommendations, but I want
to take advantage of the user preferences and of the dependencies between
the different types of activities, which hypothetically could be siblings.
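One way this combination could look, as a sketch under assumptions: compute a content-based score from the similarity between activity types, then blend it with a collaborative filtering score. The `TYPES` table, the weight `alpha`, and the sample collaborative filtering score of 0.58 are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of type labels."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_score(user_prefs, candidate, types):
    """Average the user's known preferences, weighted by type similarity."""
    num = den = 0.0
    for activity, pref in user_prefs.items():
        sim = jaccard(types[activity], types[candidate])
        num += sim * pref
        den += sim
    return num / den if den else 0.0


def hybrid_score(cf_score, content, alpha=0.5):
    """Blend both scores; fall back to content alone when CF has no answer."""
    if cf_score is None:
        return content
    return alpha * cf_score + (1 - alpha) * content


# Hypothetical activity types keyed by activity_id.
TYPES = {300: {"sports"}, 400: {"sports", "education"}, 500: {"culture", "education"}}
user_prefs = {300: 0.8, 500: 0.5}  # what one user already rated

content = content_score(user_prefs, 400, TYPES)
blended = hybrid_score(0.58, content)  # 0.58 stands in for a CF estimate
print(content, blended)
```

The fallback to the content score is useful for activities that nobody has rated yet, where the preference table alone gives no estimate.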

Thank you very much for your time. Unfortunately, I have spent many months
trying to find a solution.
