mahout-user mailing list archives

From Gustavo Enrique Salazar Torres
Subject Mahout Kmeans
Date Fri, 07 Sep 2012 20:54:00 GMT
Hi there:

I'm trying to finish an improvement to the Kmeans algorithm, but first I
need to get the stock version running so I can compare results.
When I run the script, I get this error:

MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using
/home/gustavo/Desktop/yandex_data/hadoop- and
12/09/07 17:47:43 INFO common.AbstractJob: Command line arguments:
{--clustering=null, --clusters=[./reuters-kmeans-clusters],
--input=[./reuters_out_seqdir_kmeans/tfidf-vectors], --maxIter=[10],
--method=[mapreduce], --numClusters=[20], --output=[./reuters-kmeans],
--overwrite=null, --startPhase=[0], --tempDir=[temp]}
12/09/07 17:47:44 INFO common.HadoopUtil: Deleting reuters-kmeans-clusters
12/09/07 17:47:44 INFO util.NativeCodeLoader: Loaded the native-hadoop
12/09/07 17:47:44 INFO zlib.ZlibFactory: Successfully loaded & initialized
native-zlib library
12/09/07 17:47:44 INFO compress.CodecPool: Got brand-new compressor
12/09/07 17:47:44 INFO kmeans.RandomSeedGenerator: Wrote 20 Klusters to
12/09/07 17:47:44 INFO kmeans.KMeansDriver: Input:
reuters_out_seqdir_kmeans/tfidf-vectors Clusters In:
reuters-kmeans-clusters/part-randomSeed Out: reuters-kmeans Distance:
12/09/07 17:47:44 INFO kmeans.KMeansDriver: convergence: 0.5 max
Iterations: 10 num Reduce Tasks: org.apache.mahout.math.VectorWritable
Input Vectors: {}
12/09/07 17:47:44 INFO compress.CodecPool: Got brand-new decompressor
Exception in thread "main" java.lang.IllegalStateException: No input
clusters found in reuters-kmeans-clusters/part-randomSeed. Check your -c

As you can see, the initial clusters are being created, but for a reason I
don't understand they are not being found.
Below is the output of the 'cat' command on the part file containing the clusters:

$ dfs -cat reuters-kmeans-clusters/part-randomSeed*
�W3K�E�߇H��Vgustavo
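
The part file is a Hadoop SequenceFile, so 'cat' prints raw binary rather than readable cluster data. As a rough way to check whether such a file at least has a valid header, here is a minimal sketch in Python. It assumes the file has been copied to local disk, that the SequenceFile version is 4 or later (class names stored as length-prefixed strings), and that both class names are shorter than 128 bytes. The function name read_seqfile_header is made up for illustration; it is not part of Hadoop or Mahout.

```python
def read_seqfile_header(data: bytes):
    """Parse the start of a Hadoop SequenceFile.

    Returns (version, key_class_name, value_class_name).
    Sketch only: assumes version >= 4 and class names under 128 bytes.
    """
    # Every SequenceFile begins with the magic bytes 'SEQ' plus a version byte.
    if data[:3] != b"SEQ":
        raise ValueError("not a SequenceFile: missing 'SEQ' magic bytes")
    version = data[3]
    if version < 4:
        raise ValueError("older header layout; not handled in this sketch")

    pos = 4
    names = []
    for _ in range(2):  # key class name, then value class name
        length = data[pos]
        # Lengths 0..127 are stored as a single byte; larger ones use a
        # multi-byte variable-length integer that this sketch does not decode.
        if length > 127:
            raise ValueError("multi-byte length prefix; not handled in this sketch")
        pos += 1
        names.append(data[pos:pos + length].decode("utf-8"))
        pos += length
    return version, names[0], names[1]


# Usage (hypothetical local copy of the part file):
#   with open("part-randomSeed", "rb") as f:
#       print(read_seqfile_header(f.read(512)))
```

This only reads the header. A file that passes this check can still contain zero records, which would be consistent with the "No input clusters found" error above.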

Can anyone help me please?

Gustavo Salazar
