mahout-user mailing list archives

From Phoenix Bai <baizh...@gmail.com>
Subject Re: Does clusterdump still support option "--seqFileDir"?
Date Wed, 12 Sep 2012 08:26:04 GMT
In your current Mahout version (0.7?), you should use --input (-i)
instead of --seqFileDir.

For detailed usage, check the built-in help:

$ mahout clusterdump -h
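
As a rough sketch, the command from the question quoted below would then look something like this. It assumes the same paths as in the question and only swaps --seqFileDir for --input; the fallback value for MAHOUT_HOME is a hypothetical placeholder, and the command is only printed here so it can be checked before running.

```shell
# Assumption: same directories as in the original question; adjust as needed.
MAHOUT_HOME="${MAHOUT_HOME:-$HOME/mahout}"   # hypothetical default location

# --input takes the place of the old --seqFileDir option;
# --pointsDir and --output are listed unchanged in the help output.
CMD="$MAHOUT_HOME/bin/mahout clusterdump \
  --input output/clusters-10 \
  --pointsDir output/clusteredPoints \
  --output $MAHOUT_HOME/examples/output/clusteranalyze.txt"

echo "$CMD"      # inspect the command first
# eval "$CMD"    # uncomment to actually run it
```

If the option names differ in your build, the output of clusterdump -h is the authoritative list.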

On Wed, Sep 5, 2012 at 3:26 PM, javaboom <javaboom@gmail.com> wrote:

> I've tried to use "clusterdump". I followed this manual
> https://cwiki.apache.org/MAHOUT/cluster-dumper.html
>
> I tried the following command line
>
>  $MAHOUT_HOME/bin/mahout clusterdump --seqFileDir output/clusters-10 \
>    --pointsDir output/clusteredPoints \
>    --output $MAHOUT_HOME/examples/output/clusteranalyze.txt
>
> The problem is that "clusterdump" does not recognize the option
> "--seqFileDir". I then checked the command's help output:
>
>
> ============================================================================
> root@ubuntu:~/trunk/bin# ./mahout clusterdump --help
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /root/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
> Usage:
>  [--input <input> --output <output> --outputFormat <outputFormat>
>  --substring <substring> --numWords <numWords> --pointsDir <pointsDir>
>  --samplePoints <samplePoints> --dictionary <dictionary>
>  --dictionaryType <dictionaryType> --evaluate
>  --distanceMeasure <distanceMeasure> --help --tempDir <tempDir>
>  --startPhase <startPhase> --endPhase <endPhase>]
> Job-Specific Options:
>   --input (-i) input                         Path to job input directory.
>   --output (-o) output                       The directory pathname for
>                                              output.
>   --outputFormat (-of) outputFormat          The optional output format to
>                                              write the results as. Options:
>                                              TEXT, CSV or GRAPH_ML
>   --substring (-b) substring                 The number of chars of the
>                                              asFormatString() to print
>   --numWords (-n) numWords                   The number of top terms to
>                                              print
>   --pointsDir (-p) pointsDir                 The directory containing points
>                                              sequence files mapping input
>                                              vectors to their cluster.  If
>                                              specified, then the program will
>                                              output the points associated with
>                                              a cluster
>   --samplePoints (-sp) samplePoints          Specifies the maximum number of
>                                              points to include _per_ cluster.
>                                              The default is to include all
>                                              points
>   --dictionary (-d) dictionary               The dictionary file
>   --dictionaryType (-dt) dictionaryType      The dictionary file type
>                                              (text|sequencefile)
>   --evaluate (-e)                            Run ClusterEvaluator and
>                                              CDbwEvaluator over the input. The
>                                              output will be appended to the
>                                              rest of the output at the end.
>   --distanceMeasure (-dm) distanceMeasure    The classname of the
>                                              DistanceMeasure. Default is
>                                              SquaredEuclidean
>   --help (-h)                                Print out help
>   --tempDir tempDir                          Intermediate output directory
>   --startPhase startPhase                    First phase to run
>   --endPhase endPhase                        Last phase to run
> Specify HDFS directories while running on hadoop; else specify local file
> system directories
> 12/09/05 15:17:25 INFO driver.MahoutDriver: Program took 170 ms (Minutes:
> 0.0028333333333333335)
>
> ============================================================================
>
> Could you please help me? How can I solve this problem? Am I using a
> different Mahout version?
>
> Thank you in advance
