mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Clustering boolean vectors
Date Tue, 10 May 2011 16:22:33 GMT
(Back to user@ for the benefit of the list.)

I see, so you wish to cluster movies -- by attributes, by ratings, or
both? Cosine similarity would only make sense in the context of
ratings.

I just want to make sure you don't mean you're producing recommendations.

On Tue, May 10, 2011 at 5:14 PM, Abin Varghese <mail2abin@gmail.com> wrote:
> Hi Sean,
>
> I have ordered the book (Mahout in Action) today, but it will take
> another 2-3 days to arrive, and until then I cannot look up the right API.
> Let me be specific.
>
> I have a set of item vectors.
>
> Movie1 - [0,1,1,0,0,0,0,0,0,1]
> Movie2 - [1,0,0,0,0,0,0,0,0,0]
> Movie3 - [0,0,0,1,0,0,0,0,0,1]
> Movie4 - [1,0,0,0,0,0,0,1,0,] ..etc
>
> where each movie has an attribute vector denoting the categories to
> which it belongs.
> I am looking for the right out-of-the-box clustering API, rather than
> writing my own distance measure / cosine similarity.
> Or should I write one?
>
>
> Abin
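
For readers of the archive, here is a rough sketch of what the two kinds of
similarity look like on the boolean vectors quoted above. It is plain Python,
not Mahout's API. The function names are made up for illustration, and the
Tanimoto (Jaccard) coefficient is an addition that the thread itself does not
mention; it is a common choice for 0/1 attribute data. Movie4 is left out
because its vector is truncated in the original message.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

def tanimoto_similarity(a, b):
    """Shared attributes divided by attributes present in either vector."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

# Attribute vectors copied from the quoted message (Movie4 omitted: truncated).
movies = {
    "Movie1": [0, 1, 1, 0, 0, 0, 0, 0, 0, 1],
    "Movie2": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "Movie3": [0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
}

if __name__ == "__main__":
    names = sorted(movies)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            cos = cosine_similarity(movies[first], movies[second])
            tan = tanimoto_similarity(movies[first], movies[second])
            print(f"{first} vs {second}: cosine={cos:.3f} tanimoto={tan:.3f}")
```

Movie1 and Movie3 share exactly one category, so they come out as somewhat
similar under both measures, while Movie2 shares nothing with either and scores
zero. A distance for clustering can then be taken as one minus the similarity.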
