mahout-user mailing list archives

From Prabhakar Srinivasan <prabhakar.sriniva...@gmail.com>
Subject Re: Outlier detection/Pruning
Date Wed, 04 Dec 2013 04:44:13 GMT
Hello
I am currently using Mahout 0.7, and this question pertains to that
version. I first run Canopy clustering (the CanopyDriver class) to
estimate the number of clusters that best fits the dataset, and then pass
that result as a parameter to k-means clustering (the KMeansDriver
class).

Regards
Prabhakar


On Tue, Dec 3, 2013 at 6:00 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> Can you be more specific about which code you are asking about?
>
> The ball k-means implementation provides a capability somewhat like this,
> but perhaps in a more clearly defined way.
>
>
> On Tue, Dec 3, 2013 at 9:34 AM, Prabhakar Srinivasan <
> prabhakar.srinivasan@gmail.com> wrote:
>
> > Hello!
> > Can someone point me to some explanatory documentation for outlier
> > detection and removal in clustering in Mahout? I am unable to
> > understand the internal mechanism of outlier detection just by reading
> > the Javadoc:
> >
> > clusterClassificationThreshold: Is a clustering strictness / outlier
> > removal parameter. Its value should be between 0 and 1. Vectors having
> > pdf below this value will not be clustered.
> >
> > What does the pdf represent?
> >
> > Thanks
> > Prabhakar
> >
>
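
For the pdf question quoted above, here is a minimal, self-contained sketch of
how a threshold of this kind can prune outliers. It is not Mahout code and the
thread does not confirm Mahout's internals. It assumes that "pdf" means a
per-cluster probability-like score, that for a distance-based cluster the score
is 1 / (1 + distance to the center), and that the scores are normalized to sum
to 1 across clusters before being compared with the threshold. The class and
method names (PdfThresholdSketch, score, normalizedScores, assign) are
hypothetical.

```java
public class PdfThresholdSketch {

    // Assumed score for one cluster: 1 / (1 + Euclidean distance to its center).
    static double score(double[] point, double[] center) {
        double sumSq = 0.0;
        for (int i = 0; i < point.length; i++) {
            double d = point[i] - center[i];
            sumSq += d * d;
        }
        return 1.0 / (1.0 + Math.sqrt(sumSq));
    }

    // Scores against every center, rescaled so that they sum to 1.
    static double[] normalizedScores(double[] point, double[][] centers) {
        double[] scores = new double[centers.length];
        double total = 0.0;
        for (int k = 0; k < centers.length; k++) {
            scores[k] = score(point, centers[k]);
            total += scores[k];
        }
        for (int k = 0; k < scores.length; k++) {
            scores[k] /= total;
        }
        return scores;
    }

    // Index of the best-scoring cluster, or -1 when even the best normalized
    // score is below the threshold (the point is left unclustered).
    static int assign(double[] point, double[][] centers, double threshold) {
        double[] s = normalizedScores(point, centers);
        int best = 0;
        for (int k = 1; k < s.length; k++) {
            if (s[k] > s[best]) {
                best = k;
            }
        }
        return s[best] >= threshold ? best : -1;
    }

    public static void main(String[] args) {
        double[][] centers = {{0.0, 0.0}, {10.0, 10.0}};
        double[] near = {0.5, 0.5};     // clearly closest to the first center
        double[] between = {5.0, 5.0};  // equally far from both centers

        System.out.println(assign(near, centers, 0.7));    // prints 0
        System.out.println(assign(between, centers, 0.7)); // prints -1
    }
}
```

Under these assumptions a threshold of 0 keeps every vector, while values close
to 1 keep only vectors that sit much nearer to one center than to all the
others. A point midway between two centers scores 0.5 for each and is dropped
at a threshold of 0.7.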
