mahout-user mailing list archives

From paritosh ranjan <paritoshranj...@gmail.com>
Subject Re: Clusterdump Output Question
Date Sun, 07 Oct 2012 12:35:41 GMT
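The top terms come from the centroid of the cluster. The values are the
weights of those terms in the centroid vector, which are term frequencies
if the documents were vectorized with TF weighting. They are not cosine
distances.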
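
To make that concrete, here is a toy sketch of the idea in Python. It is
not Mahout's code: the function names `centroid` and `top_terms` and the
sample documents are made up for illustration, and it assumes plain
term-frequency vectors with a centroid computed as their average.

```python
from collections import Counter


def centroid(docs):
    """Average the term-frequency vectors of the documents in one cluster."""
    totals = Counter()
    for doc in docs:
        # Counter(doc.split()) is the raw term-frequency vector of one document.
        totals.update(Counter(doc.split()))
    return {term: count / len(docs) for term, count in totals.items()}


def top_terms(center, n):
    """Return the n highest-weighted terms of a centroid, largest first."""
    return sorted(center.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical documents that ended up in the same cluster.
cluster_docs = ["hello hello you", "hello you there", "hello hello hello"]
center = centroid(cluster_docs)

for term, weight in top_terms(center, 2):
    print(f"{term} => {weight}")
```

The number printed next to each term is just that term's entry in the
centroid vector. No distance measure is involved in producing it.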

On Sun, Oct 7, 2012 at 5:38 PM, jung hoon sohn <jsohn57@gmail.com> wrote:

> Hello,
> I used the k-means algorithm to cluster the text terms in the documents
> according to the cosine distance measure.
> It ran successfully, and when we ran the clusterdump utility to see the top
> terms for each cluster,
> I got output such as
>
>       Top Terms:
>
>             hello    =>     21.8977799999
>             you     =>     11.9284304939
>             ....
>
> I am guessing the values next to each term are cosine distance values,
> but I am not very sure about it.
> Does anyone know specifically what the values represent?
>
> Thanks.
>
> Jung Hoon Sohn
>
