mahout-user mailing list archives

From Chris Harrington <ch...@heystaks.com>
Subject MiA NewsKMeansClustering Example Help
Date Wed, 30 Jan 2013 12:10:07 GMT
Hi all, 

I'm new to Mahout and have been working through the MiA book. Lately I've been trying Chapter
10's NewsKMeansClustering example, as it looks like a good starting point for my own work,
but I've run into a problem just trying to run it and view the output.

I'm trying to view the output of the Java program with the clusterdump utility, but all
I get is an empty text file.

I'm using MiA-mahout-0.6 and mahout-distribution-0.6. This is the process I went through to
get to this point.

1. Get the Reuters data and put it into seqfiles (I issue these commands in the
   mahout-distribution-0.6 project):
   mvn -e -q exec:java -Dexec.mainClass="org.apache.lucene.benchmark.utils.ExtractReuters" -Dexec.args="reuters/ reuters-extracted/"
   bin/mahout seqdirectory -c UTF-8 -i examples/reuters-extracted/ -o reuters-seqfiles
2. Manually (drag and drop) move the seq files into the reuters-seqfiles folder of the
   MiA (0.6) project.
3. Run the MiA NewsKMeansClustering example from Chapter 10, which creates a newsClusters
   folder populated with various files (clusters folder, dictionary.file-0, centroids
   folder, etc.).
There don't appear to be any unusual errors in the console:
2013-01-30 11:15:42.593 java[11011:1903] Unable to load realm info from SCDynamicStore
SLF4J: The requested version 1.5.11 by your slf4j binding is not compatible with [1.6]
SLF4J: See http://www.slf4j.org/codes.html#version_mismatch for further details.
2013-01-30 11:15:45 JobClient [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
. (same as above line)
.
.
2013-01-30 11:16:55 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-01-30 11:16:56 JobClient [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
.(same as above line)
.
.
I then run the cluster dump command to create an output.txt file.
../mahout-distribution-0.6/bin/mahout clusterdump -s newsClusters/clusters/clusters-19/ -o output.txt -d newsClusters/dictionary.file-0 -dt sequencefile -n 10
but all this does is create an empty text file.
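
One thing that may be worth ruling out before anything else is whether the directory passed to -s (clusters-19 here) exists and holds any data. The following is only a rough sketch, not a confirmed diagnosis: the helper name summarize_cluster_dirs is hypothetical, it uses nothing but standard shell tools, and it assumes that Hadoop bookkeeping files start with "." or "_" and can be ignored. For each sub-directory of the clusters folder it prints the directory name and the number of non-empty data files found in it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout): for every sub-directory of a
# clusters root, print "<dir-name> <number of non-empty data files>".
summarize_cluster_dirs() {
  root="$1"
  if [ ! -d "$root" ]; then
    echo "missing: $root"
    return 1
  fi
  for dir in "$root"/*/; do
    # If the glob matched nothing, the literal pattern is not a directory.
    [ -d "$dir" ] || continue
    # Count non-empty regular files, skipping names that start with "." or "_"
    # (assumed here to be checksum/bookkeeping files rather than cluster data).
    count=$(find "$dir" -type f -size +0 ! -name '.*' ! -name '_*' | wc -l | tr -d ' ')
    echo "$(basename "$dir") $count"
  done
}
```

Running summarize_cluster_dirs newsClusters/clusters from the MiA project folder would show which clusters-N directories were actually written. If clusters-19 is absent or reports 0, the -s argument would need to point at a directory that does contain data.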

Any help would be much appreciated.

Thanks,
Chris
