mahout-user mailing list archives

From <>
Subject Mahout KMeans generate doubled cluster number than my initial K setting
Date Fri, 12 Oct 2012 09:38:06 GMT


I am a beginner with Mahout. I use Mahout 0.8 and followed the tutorial in


First, I run:

`mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job -i testdata
-o output -t1 20 -t2 50 -k 5 -x 20 -ow`


Then I use clusterdump to extract the cluster centers:


    mahout clusterdump --input output/clusters-20-final --output


After this, the output file contains:


    VL-585{n=50 c=[29.832, 29.589, 29.405, 28.516, 29.600, ..] r=[3.152, 3.518, 3.292, .]}


    VL-591{n=197 c=[29.984, 29.681,.] r=[3.602, 3.558, 3.364,.]}


    VL-595{n=203 c=[..] r=[..]}


    VL-597{n=61 c=[..] r=[..]}


    VL-599{n=43 c=[..] r=[..]}


    VL-585{n=1 c=[..] r=[..]}


    VL-591{n=27 c=[..] r=[..]}


    VL-595{n=1 c=[..] r=[..]}


    VL-597{n=1 c=[..] r=[..]}


    VL-599{n=16 c=[..] r=[..]}



It seems that k-means generates 10 clusters, but my initial setting for k is 5.


I also tried other values of k, and it always generates double the number of clusters.
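
To check how many distinct clusters the dump really lists, the identifiers can be counted with a short script. This is only a minimal Python sketch: `dump_lines` is copied from the abbreviated output above, and the names `summarize_dump`, `counts`, and `totals` are made up for illustration and are not part of Mahout.

```python
import re
from collections import Counter, defaultdict

# Abbreviated clusterdump lines, copied from the output above.
dump_lines = [
    "VL-585{n=50 c=[..] r=[..]}",
    "VL-591{n=197 c=[..] r=[..]}",
    "VL-595{n=203 c=[..] r=[..]}",
    "VL-597{n=61 c=[..] r=[..]}",
    "VL-599{n=43 c=[..] r=[..]}",
    "VL-585{n=1 c=[..] r=[..]}",
    "VL-591{n=27 c=[..] r=[..]}",
    "VL-595{n=1 c=[..] r=[..]}",
    "VL-597{n=1 c=[..] r=[..]}",
    "VL-599{n=16 c=[..] r=[..]}",
]

# Matches an identifier such as "VL-585" followed by its point count "n=50".
LINE_PATTERN = re.compile(r"\s*([A-Z]+-\d+)\{n=(\d+)")

def summarize_dump(lines):
    """Return (occurrences per identifier, summed n per identifier)."""
    counts = Counter()
    totals = defaultdict(int)
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not cluster entries
        cluster_id, n = match.group(1), int(match.group(2))
        counts[cluster_id] += 1
        totals[cluster_id] += n
    return counts, dict(totals)

counts, totals = summarize_dump(dump_lines)
print(len(dump_lines), "lines,", len(counts), "distinct identifiers")
for cluster_id in sorted(counts):
    print(cluster_id, "appears", counts[cluster_id], "times, total n =", totals[cluster_id])
print("sum of n over all lines:", sum(totals.values()))
```

For the lines above, the script reports 10 lines but only 5 distinct identifiers, each appearing twice, and the n values add up to 600 over all lines. So the dump lists each of the 5 cluster IDs twice rather than 10 different IDs, although this does not explain why each one is printed twice.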


Can anyone help me with this? Thanks a lot!


