mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Outlier detection/Pruning
Date Fri, 06 Dec 2013 03:14:25 GMT
You should move to 0.8 and explore ball k-means.

On Tue, Dec 3, 2013 at 8:44 PM, Prabhakar Srinivasan <
prabhakar.srinivasan@gmail.com> wrote:

> Hello,
> I am currently using Mahout 0.7, and this question pertains to that
> version. I first use Canopy clustering (the CanopyDriver class) to
> determine the number of clusters that best fits the dataset, and then
> pass that number as a parameter to k-means clustering (the KMeansDriver
> class).
>
> Regards
> Prabhakar
>
>
> On Tue, Dec 3, 2013 at 6:00 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
>
> > Can you be more specific about which code you are asking about?
> >
> > The ball k-means implementation provides a capability somewhat like this,
> > but perhaps in a more clearly defined way.
> >
> >
> > On Tue, Dec 3, 2013 at 9:34 AM, Prabhakar Srinivasan <
> > prabhakar.srinivasan@gmail.com> wrote:
> >
> > > Hello!
> > > Can someone point me to some explanatory documentation for outlier
> > > detection and removal in clustering in Mahout? I am unable to understand
> > > the internal mechanism of outlier detection just by reading the Javadoc:
> > > clusterClassificationThreshold Is a clustering strictness / outlier
> > > removal parameter. Its value should be between 0 and 1. Vectors having pdf
> > > below this value will not be clustered.
> > >
> > > What does the pdf represent?
> > >
> > > Thanks
> > > Prabhakar
> > >
> >
>
