mahout-user mailing list archives

From Bikash Gupta <bikash.gupt...@gmail.com>
Subject Re: [Edit] Approach for Clustering Data
Date Mon, 17 Feb 2014 20:25:08 GMT
OK, so UserId is not a good field for this combination. But if I want
user clustering, what should the combination of fields be (just for my
understanding)?

On Tue, Feb 18, 2014 at 1:44 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> On Mon, Feb 17, 2014 at 9:00 AM, Bikash Gupta <bikash.gupta11@gmail.com> wrote:
>
>> Let's say I am clustering users, and I am providing their profile data to
>> discover similarity between two users.
>>
>> So my input would be [UserId, Location, Age, Gender, Time Created ]
>>
>> Now if my UserId is at least 10 characters long, that is a comparatively
>> very large number next to the other categorical data.
>>
>
> User id is not a good field for clustering.
>
> Location is fine if you want geographical clustering.
>
> Location + age + gender is fine for geo-demographic clustering.
>
> Adding time created might give a tiny bit of insight.
>
> But these fields are not going to lead to great insights.
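
To make the field choice above concrete, here is a minimal sketch of how such profiles might be turned into feature vectors before clustering. It is plain Python rather than any Mahout API, and everything in it is an assumption for illustration: the record layout, the field names (lat, lon, age, gender), the sample values, and the choice of min-max scaling and one-hot encoding. The point it shows is that UserId is kept only as a label for each vector and never enters the distance computation.

```python
# Hypothetical profile records; field names and values are made up for illustration.
profiles = [
    {"user_id": "U000000001", "lat": 12.97, "lon": 77.59, "age": 25, "gender": "M"},
    {"user_id": "U000000002", "lat": 28.61, "lon": 77.21, "age": 40, "gender": "F"},
    {"user_id": "U000000003", "lat": 19.08, "lon": 72.88, "age": 31, "gender": "M"},
]


def min_max(values):
    """Scale a list of numbers into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def to_feature_vectors(profiles):
    """Build one numeric vector per user from location, age and gender.

    The user id is used only as the dictionary key (a label), so its
    length or magnitude cannot influence any distance measure.
    """
    lat = min_max([p["lat"] for p in profiles])
    lon = min_max([p["lon"] for p in profiles])
    age = min_max([p["age"] for p in profiles])
    genders = sorted({p["gender"] for p in profiles})  # fixed column order

    vectors = {}
    for i, p in enumerate(profiles):
        one_hot = [1.0 if p["gender"] == g else 0.0 for g in genders]
        vectors[p["user_id"]] = [lat[i], lon[i], age[i]] + one_hot
    return vectors


vectors = to_feature_vectors(profiles)
for user_id, vec in vectors.items():
    print(user_id, vec)
```

Scaling each numeric field to the same range keeps any single field from dominating a Euclidean-style distance. Treating latitude and longitude as plain scaled numbers is a rough stand-in for location; a real setup might use a region code or a proper geographic distance instead.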



-- 
Thanks & Regards
Bikash Kumar Gupta
