mahout-user mailing list archives

From javaboom <javab...@gmail.com>
Subject Does clusterdump still support option "--seqFileDir"?
Date Wed, 05 Sep 2012 07:26:01 GMT
I am trying to use "clusterdump", following this manual:
https://cwiki.apache.org/MAHOUT/cluster-dumper.html

I ran the following command line:

 $MAHOUT_HOME/bin/mahout clusterdump --seqFileDir output/clusters-10 \
   --pointsDir output/clusteredPoints \
   --output $MAHOUT_HOME/examples/output/clusteranalyze.txt

The problem is that "clusterdump" does not recognize the option
"--seqFileDir". I then checked the command's help output:

============================================================================
root@ubuntu:~/trunk/bin# ./mahout clusterdump --help
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /root/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
Usage:
 [--input <input> --output <output> --outputFormat <outputFormat> --substring
<substring> --numWords <numWords> --pointsDir <pointsDir> --samplePoints
<samplePoints> --dictionary <dictionary> --dictionaryType <dictionaryType>
--evaluate --distanceMeasure <distanceMeasure> --help --tempDir <tempDir>
--startPhase <startPhase> --endPhase <endPhase>]
Job-Specific Options:
  --input (-i) input                         Path to job input directory.
  --output (-o) output                       The directory pathname for output.
  --outputFormat (-of) outputFormat          The optional output format to
                                             write the results as.  Options:
                                             TEXT, CSV or GRAPH_ML
  --substring (-b) substring                 The number of chars of the
                                             asFormatString() to print
  --numWords (-n) numWords                   The number of top terms to print
  --pointsDir (-p) pointsDir                 The directory containing points
                                             sequence files mapping input
                                             vectors to their cluster.  If
                                             specified, then the program will
                                             output the points associated with
                                             a cluster
  --samplePoints (-sp) samplePoints          Specifies the maximum number of
                                             points to include _per_ cluster.
                                             The default is to include all
                                             points
  --dictionary (-d) dictionary               The dictionary file
  --dictionaryType (-dt) dictionaryType      The dictionary file type
                                             (text|sequencefile)
  --evaluate (-e)                            Run ClusterEvaluator and
                                             CDbwEvaluator over the input. The
                                             output will be appended to the
                                             rest of the output at the end.
  --distanceMeasure (-dm) distanceMeasure    The classname of the
                                             DistanceMeasure. Default is
                                             SquaredEuclidean
  --help (-h)                                Print out help
  --tempDir tempDir                          Intermediate output directory
  --startPhase startPhase                    First phase to run
  --endPhase endPhase                        Last phase to run
Specify HDFS directories while running on hadoop; else specify local file
system directories
12/09/05 15:17:25 INFO driver.MahoutDriver: Program took 170 ms (Minutes: 0.0028333333333333335)
============================================================================
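
Judging only from the help text above, --input (-i) looks like the closest
match for what the manual calls --seqFileDir. That is an assumption on my
part; the help text does not say so. A minimal, untested sketch of the same
command with that substitution follows. It prints the command and runs it
only if the launcher is present. The fallback value for MAHOUT_HOME is
hypothetical.

```shell
#!/bin/sh
# Assumption: --input (-i) takes the directory the manual passes to
# --seqFileDir. This is inferred from the help output only.

# Hypothetical fallback; set MAHOUT_HOME to your own checkout.
MAHOUT_HOME="${MAHOUT_HOME:-$HOME/trunk}"

CLUSTERS_DIR="output/clusters-10"
POINTS_DIR="output/clusteredPoints"
OUT_FILE="$MAHOUT_HOME/examples/output/clusteranalyze.txt"

# Build the argument list once so it can be printed and executed.
set -- clusterdump \
  --input "$CLUSTERS_DIR" \
  --pointsDir "$POINTS_DIR" \
  --output "$OUT_FILE"

# Show the command so it can be checked before anything runs.
echo "$MAHOUT_HOME/bin/mahout $*"

# Run it only if the launcher script actually exists and is executable.
if [ -x "$MAHOUT_HOME/bin/mahout" ]; then
  "$MAHOUT_HOME/bin/mahout" "$@"
fi
```

The other options (--pointsDir, --output) are unchanged from the original
command, since both still appear in the help output.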

Could you please help me solve this problem? Am I using a different Mahout
version from the one the manual describes?

Thank you in advance




--
View this message in context: http://lucene.472066.n3.nabble.com/Does-clusterdump-still-support-option-seqFileDir-tp4005517.html
Sent from the Mahout User List mailing list archive at Nabble.com.
