mahout-user mailing list archives

From Chih-Hsien Wu <chjaso...@gmail.com>
Subject Re: Good centroid generation algorithm for top-down clustering approach
Date Tue, 26 Nov 2013 16:36:21 GMT
I've heard of it, but I'm not familiar with it. Does streaming k-means
generate a list of centroids that other clustering algorithms can use?


On Tue, Nov 26, 2013 at 10:55 AM, Ted Dunning <ted.dunning@gmail.com> wrote:

> Have you looked at the streaming k-means work?  The basic idea is that you
> generate a sketch of the data which you can then cluster in-memory.  That
> lets you use very advanced centroid generation algorithms that require lots
> of processing.
>
>
>
>
> On Tue, Nov 26, 2013 at 6:29 AM, Chih-Hsien Wu <chjasonwu@gmail.com>
> wrote:
>
> > Hi all, I'm trying to cluster text documents using a top-down approach. I
> > have tried both random seed and canopy generation, and have seen
> > their pros and cons. I realize that canopy is great when the exact number
> > of clusters is not known; nevertheless, canopy needs a lot of memory. I was
> > hoping to find something similar to canopy generation. Are there any
> > other recommendations?
> >
>
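
For readers unfamiliar with the sketch-then-cluster idea described in the quoted reply, here is a minimal Python sketch of it. This is an illustration only, not Mahout's implementation: the function names (`build_sketch`, `weighted_kmeans`), the parameters (`cutoff`, `growth`, `max_centroids`) and the simple "new centroid with probability distance / cutoff" rule are all assumptions chosen for brevity. One pass over the data yields a small list of weighted centroids, and that list is what an in-memory algorithm then clusters.

```python
import math
import random


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _absorb(vec, weight, centroids, cutoff, rng):
    """Fold one weighted vector into the sketch (a list of (vector, weight))."""
    if not centroids:
        centroids.append((list(vec), weight))
        return
    i = min(range(len(centroids)), key=lambda j: _dist(vec, centroids[j][0]))
    d = _dist(vec, centroids[i][0])
    # Assumed rule: far-away points are likely to start a new centroid.
    if rng.random() < min(1.0, d / cutoff):
        centroids.append((list(vec), weight))
    else:
        # Otherwise merge into the nearest centroid as a weighted mean.
        c, cw = centroids[i]
        total = cw + weight
        merged = [(ci * cw + vi * weight) / total for ci, vi in zip(c, vec)]
        centroids[i] = (merged, total)


def build_sketch(points, max_centroids, cutoff=1.0, growth=1.5, seed=0):
    """Single pass over `points`; returns at most `max_centroids` weighted centroids."""
    rng = random.Random(seed)
    centroids = []
    for p in points:
        _absorb(p, 1.0, centroids, cutoff, rng)
        # Too many centroids: raise the cutoff and re-cluster the sketch itself.
        while len(centroids) > max_centroids:
            cutoff *= growth
            old, centroids = centroids, []
            for vec, w in old:
                _absorb(vec, w, centroids, cutoff, rng)
    return centroids


def weighted_kmeans(sketch, k, iterations=10, seed=0):
    """Plain Lloyd iterations on the in-memory sketch, respecting weights."""
    rng = random.Random(seed)
    k = min(k, len(sketch))
    centers = [list(v) for v, _ in rng.sample(sketch, k)]
    dim = len(centers[0])
    for _ in range(iterations):
        sums = [[0.0] * dim for _ in range(k)]
        weights = [0.0] * k
        for vec, w in sketch:
            j = min(range(k), key=lambda i: _dist(vec, centers[i]))
            weights[j] += w
            for d, x in enumerate(vec):
                sums[j][d] += w * x
        # Keep the old center if no sketch point was assigned to it.
        centers = [
            [s / weights[j] for s in sums[j]] if weights[j] > 0 else centers[j]
            for j in range(k)
        ]
    return centers
```

Because the sketch is small, the second step can be swapped for any more expensive in-memory method; the weighted k-means shown here is just the simplest stand-in.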
