mahout-user mailing list archives

From Pat Ferrel <>
Subject Re: recommendation based on user preference
Date Wed, 02 Jul 2014 23:06:48 GMT
If you are looking to recommend a neighborhood that is similar to some other neighborhood
(the user’s current one) based on its characteristics, you wouldn’t use collaborative filtering.
This is a metadata recommender based on similarity of neighborhoods not a collection of user

The easiest and fastest approach would be to use a search engine, but I’ll leave that aside for
now since it doesn’t account for feature weights as well.

Create a table like this:

Neighborhood  Gym  Cafe  Bookstore
Downtown       15    50          0
Midtown        30   100         10

You will need to convert the row IDs into sequential ints, which Mahout uses for IDs. Then
read the rows into a SequenceFile to create a Distributed Row Matrix (DRM), which holds key-value
pairs: each key is an integer neighborhood ID, and each value is a Vector (a sort of list) of
column integer IDs with their counts.
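
As a rough illustration of that ID mapping (plain Python, not Mahout code), the sketch below turns the (neighborhood, amenity, count) triples from the original question into sequential int IDs and sparse rows. The function name `build_rows` and the dict-of-dicts layout are assumptions made for illustration; writing the rows out as an actual SequenceFile is not shown.

```python
from collections import defaultdict

# (neighborhood, amenity, count) triples taken from the original question.
triples = [
    ("Downtown", "Gym", 15),
    ("Downtown", "Cafe", 50),
    ("Midtown", "Gym", 30),
    ("Midtown", "Cafe", 100),
    ("Midtown", "Bookstore", 10),
]

def build_rows(triples):
    """Assign sequential int IDs and build sparse rows {row_id: {col_id: count}}."""
    row_ids, col_ids = {}, {}
    rows = defaultdict(dict)
    for neighborhood, amenity, count in triples:
        # setdefault hands out the next unused int the first time a name is seen.
        r = row_ids.setdefault(neighborhood, len(row_ids))
        c = col_ids.setdefault(amenity, len(col_ids))
        rows[r][c] = count
    return row_ids, col_ids, dict(rows)

row_ids, col_ids, rows = build_rows(triples)
```

Keep the `row_ids` and `col_ids` dictionaries around, since you will need them to translate the integer IDs in the output back into neighborhood names.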

Then run rowsimilarity on the DRM. That is the name of the CLI job, but there is also a Driver
you can call from your code.

There are some data prep issues you will have since larger neighborhoods will have higher
counts. An easy thing to do would be to normalize the counts by something like population
or physical size so you get cafes per resident or per sq mile or some other ratio.
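
A minimal sketch of that normalization, in plain Python rather than Mahout: each count in a row is divided by a per-neighborhood denominator. The helper name `normalize_rows` is an assumption, and the population figures are made up purely for illustration; they are not real data.

```python
def normalize_rows(rows, denominators):
    """Divide every count in a row by that row's denominator (e.g. population)."""
    return {
        r: {c: count / denominators[r] for c, count in cols.items()}
        for r, cols in rows.items()
    }

# Sparse rows keyed by integer neighborhood ID: {row_id: {col_id: count}}.
rows = {0: {0: 15, 1: 50}, 1: {0: 30, 1: 100, 2: 10}}

# Hypothetical populations, invented for this example only.
population = {0: 10_000, 1: 40_000}

per_resident = normalize_rows(rows, population)
```

The same function works for physical size or any other denominator, as long as every row ID has an entry.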

The result of the rowsimilarity job will be another DRM with key = neighborhood ID and value
= Vector of similar neighborhoods (by integer ID), each with a strength of similarity. Sort the
vector by strength and you’ll have an ordered list of similar neighborhoods for each neighborhood.
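
To make the shape of that output concrete, here is a small plain-Python sketch that compares every row with every other row and sorts the results by strength. It is not Mahout's implementation. Cosine similarity is used only as an assumption, since the measure is not specified above, and the names `cosine` and `similar_rows` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {col_id: value}."""
    dot = sum(v * b[c] for c, v in a.items() if c in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similar_rows(rows):
    """For each row ID, list (other_id, strength) pairs, strongest first."""
    result = {}
    for r, vec in rows.items():
        scored = [(o, cosine(vec, other)) for o, other in rows.items() if o != r]
        result[r] = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return result

# Sparse rows keyed by integer neighborhood ID: {row_id: {col_id: count}}.
rows = {0: {0: 15, 1: 50}, 1: {0: 30, 1: 100, 2: 10}}
ranked = similar_rows(rows)
```

One caveat on the choice of measure: cosine similarity ignores the overall length of each row, so dividing a whole row by population or area does not change it. That kind of normalization matters for measures that are sensitive to magnitude.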

On Jun 30, 2014, at 12:48 PM, Edith Au <> wrote:


I am a newbie and am looking for some guidance to implement my
recommender.  Any help would be greatly appreciated.  I have a small
data set of location information with the following fields:
neighborhood, amenities, and counts.  For example:

Downtown    Gym        15
Downtown    Cafe       50
Midtown     Gym        30
Midtown     Cafe       100
Midtown     Bookstore  10
Financial Dist

and so on and so forth.  I want to recommend a neighborhood for a user to
reside in, based on the amenities (and some other metrics) in his/her
current neighborhood.  My understanding is that model-based
recommendation would be a good fit for the job.  If I am on the right
track, is there an experimental/beta recommender I can try?

If there is no such recommender yet, can I still use Mahout for my
project?  For example, can I implement my own Similarity which only
computes the similarity between one user's preference and a set of
neighborhoods?  If I understand Mahout correctly, User/Item Similarity
would do N x (N-1) pairwise comparisons as opposed to 1 x N comparisons.
In my example, User/Item Similarity would compare Downtown, Midtown,
and Fin Dist with one another, which would be a waste of computation
resources since those comparisons are not needed.

Thanks in advance for your help.

