mahout-user mailing list archives

From Nicolas Maillot <nmail...@gmail.com>
Subject Re: question about twenty newsgroup example
Date Sun, 11 Jul 2010 17:20:39 GMT
Mishari,


I have just run the newsgroup tutorial.

You have to change two minor details to get it working:

- change mahout-examples-0.1.job to mahout-examples-0.3.job (if, like me,
you are testing with Mahout 0.3)

- add the -source hdfs option


For me it was enough.
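
Put together, the two changes above plus the -m and -d flags from the usage message quoted below give roughly the following invocation. This is only a sketch: the wrapper function name run_test_classifier is made up for illustration, and it assumes HADOOP_HOME and MAHOUT_HOME are set and that the 0.3 examples job has been built under examples/target.

```shell
# Hypothetical wrapper (not part of the tutorial) around the corrected command.
# Assumes HADOOP_HOME and MAHOUT_HOME are set and Mahout 0.3 is built.
run_test_classifier() {
    "$HADOOP_HOME/bin/hadoop" \
        jar \
        "$MAHOUT_HOME/examples/target/mahout-examples-0.3.job" \
        org.apache.mahout.classifier.bayes.TestClassifier \
        -m newsmodel \
        -d work/20news-input \
        -ng 3 \
        -type bayes \
        -source hdfs
}
```

Calling run_test_classifier then runs the test step; adjust the job file name if you built a different Mahout version.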


Hope this will help,


Cheers,


Nicolas





Hi all,
I am new to Mahout. To familiarize myself with it, I started with the
20 Newsgroups quickstart example. I followed the steps at
https://cwiki.apache.org/confluence/display/MAHOUT/Twenty+Newsgroups
and everything was fine until I reached the testing step on Hadoop,
with the command given as follows:

$HADOOP_HOME/bin/hadoop \
    jar \
    $MAHOUT_HOME/examples/target/mahout-examples-0.1.job \
    org.apache.mahout.classifier.bayes.TestClassifier \
    -p newsmodel \
    -t work/20news-input \
    -ng 3 \
    -type bayes


First, I found mistakes in the command: I guess "p" should be "m" and "t"
should be "d". I corrected that, but I keep receiving the following message
every time I execute the command:

Usage:

 [--defaultCat <defaultCat> --testDir <testDir> --encoding <encoding>
--gramSize <gramSize> --model <model> --classifierType <classifierType>
--dataSource <dataSource> --help --method <method> --verbose --alpha <a>]
Options

  --defaultCat (-default) defaultCat         The default category Default
                                             Value: unknown
  --testDir (-d) testDir                     The directory where test documents
                                             resides in
  --encoding (-e) encoding                   The file encoding.  Defaults to
                                             UTF-8
  --gramSize (-ng) gramSize                  Size of the n-gram. Default Value:
                                             1
  --model (-m) model                         The path on HDFS / Name of Hbase
                                             Table as defined by the -source
                                             parameter
  --classifierType (-type) classifierType    Type of classifier: bayes|cbayes.
                                             Default Value: bayes
  --dataSource (-source) dataSource          Location of model: hdfs|hbase
                                             Default Value: hdfs
  --help (-h)                                Print out help
  --method (-method) method                  Method of Classification:
                                             sequential|mapreduce. Default
                                             Value: sequential
  --verbose (-v)                             Output which values were correctly
                                             and incorrectly classified
  --alpha (-a) a                             Smoothing parameter Default Value:
                                             1.0


This happens even though my command structure is correct and I executed the
training command beforehand; it ran perfectly and created the newsmodel
directory. Any clue why I am receiving this message and am unable to run the
test command?


Thanks in advance!

-mish
