mahout-user mailing list archives

From Jeff Eastman <>
Subject RE: Cluster Dumper no output not shown
Date Tue, 01 Mar 2011 22:40:10 GMT
You need to add the -cl (--clustering) option to your k-means run to get your input points
classified (clustered) by the clusters in your final clusters-n directory. This output will
appear in a "clusteredPoints" directory. (Since this classification step is not always desired
and can take a while, it is optional.) The clusterdumper should then give you the output you
are seeking.
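
As a rough sketch of what that could look like for the Ant target quoted below: the only change is the extra --clustering argument at the end, and the closing </java> tag is assumed because the quoted snippet is cut off before it. Everything else is copied from the original message, so treat it as an illustration rather than a tested build file.

```xml
<!-- Sketch only: the k-means arguments from the quoted message, plus the clustering flag. -->
<java classname="org.apache.mahout.clustering.kmeans.KMeansDriver"
      fork="true" maxmemory="738m">
  <classpath refid="runtime.classpath"/>
  <arg value="--input"/>
  <arg value="${wiki.dir}/n2/part-full.txt"/>
  <arg value="--clusters"/>
  <arg value="${wiki.dir}/n2/k-output/clusters-in"/>
  <arg value="--k"/>
  <arg value="10"/>
  <arg value="--output"/>
  <arg value="${wiki.dir}/n2/k-output"/>
  <arg value="--distance"/>
  <arg value="org.apache.mahout.utils.CosineDistanceMeasure"/>
  <arg value="--convergence"/>
  <arg value="0.01"/>
  <arg value="--overwrite"/>
  <!-- Added: classify the input points against the final clusters. -->
  <arg value="--clustering"/>
</java>
```

After rerunning with this flag, a "clusteredPoints" directory should appear under k-output alongside the clusters-n directories, and that is the directory the clusterdumper needs in order to list the points per cluster.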

-----Original Message-----
From: mwkhan [] 
Sent: Tuesday, March 01, 2011 1:03 PM
Subject: Cluster Dumper no output not shown


First I ran the k-means algorithm, following the article "Introduction to Apache
Mahout", with the following arguments:

<java classname="org.apache.mahout.clustering.kmeans.KMeansDriver"
          fork="true" maxmemory="738m">
      <classpath refid="runtime.classpath"/>
      <arg value="--input"/>
      <arg value="${wiki.dir}/n2/part-full.txt"/>
      <arg value="--clusters"/>
      <arg value="${wiki.dir}/n2/k-output/clusters-in"/>
      <arg value="--k"/>
      <arg value="10"/>
      <arg value="--output"/>
      <arg value="${wiki.dir}/n2/k-output"/>
      <arg value="--distance"/>
      <arg value="org.apache.mahout.utils.CosineDistanceMeasure"/>
      <arg value="--convergence"/>
      <arg value="0.01"/>
      <arg value="--overwrite"/>
</java>

Now I have the following directories in my "k-output" folder on my local
machine: clusters-0, clusters-1, clusters-2, clusters-3, clusters-4, clusters-in,
and points.

Then I tried to run the cluster-dumper utility using standalone Java:

$ bin/mahout clusterdump --seqFileDir

I got the following output:

no HADOOP_HOME set, running locally

Mar 1, 2011 8:57:49 PM org.slf4j.impl.JCLLoggerAdapter info

INFO: Command line arguments: {--dictionaryType=text, --endPhase=2147483647,
--startPhase=0, --tempDir=temp}

Mar 1, 2011 8:57:49 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Program took 332 ms

Why am I not getting the clustering data as output?

I am running these commands through Cygwin installed on a Windows machine.
