mahout-user mailing list archives

From "Alexander Hans" <a...@ahans.de>
Subject Re: unknown test data twenty-newsgroups example
Date Thu, 21 Oct 2010 17:40:47 GMT
Hi Neil,

This is a JUnit test, so you need JUnit to run it. I'm using the Eclipse IDE
with the JUnit plugin and run it from there. From the command line you can
use the TestRunner, for instance. See

http://www.junit.org/apidocs/junit/textui/TestRunner.html

You will have to make sure that you have a JUnit jar file in your classpath.


HTH,

Alex


> Thanks Federico and Robin, got it now.
> Does anybody know the command-line parameters for running this?
>
> On Thu, Oct 21, 2010 at 11:02 PM, Federico Castanedo
> <fcastane@inf.uc3m.es> wrote:
>
>> Hello Neil,
>>
>> The file is here:
>>
>> /core/src/test/java/org/apache/mahout/classifier/bayes
>>
>> Regards
>>
>>
>> 2010/10/21 Neil Ghosh <neil.ghosh@gmail.com>:
>> > Thanks Drew
>> > I could not find the file
>> >
>> >
>> > http://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/bayes/BayesClassifierSelfTest.java
>> >
>> > In my mahout trunk in this directory
>> >
>> > neil@neil-laptop:~/trunk/core/src/main/java/org/apache/mahout/classifier/bayes$ ll
>> > total 92
>> > drwxr-xr-x 11 neil neil  4096 2010-09-19 12:15 ./
>> > drwxr-xr-x  7 neil neil  4096 2010-09-19 12:15 ../
>> > drwxr-xr-x  3 neil neil  4096 2010-09-19 12:15 algorithm/
>> > drwxr-xr-x  3 neil neil  4096 2010-09-19 12:15 common/
>> > drwxr-xr-x  3 neil neil  4096 2010-09-19 12:15 datastore/
>> > drwxr-xr-x  3 neil neil  4096 2010-09-19 12:15 exceptions/
>> > drwxr-xr-x  3 neil neil  4096 2010-09-19 12:15 interfaces/
>> > drwxr-xr-x  3 neil neil  4096 2010-09-19 12:15 io/
>> > drwxr-xr-x  6 neil neil  4096 2010-09-19 12:15 mapreduce/
>> > drwxr-xr-x  3 neil neil  4096 2010-09-19 12:15 model/
>> > -rw-r--r--  1 neil neil  8249 2010-09-19 12:15 MultipleOutputFormat.java
>> > -rw-r--r--  1 neil neil  1441 2010-09-19 12:15 MultipleTextOutputFormat.java
>> > -rw-r--r--  1 neil neil  4133 2010-09-19 12:15 package.html
>> > drwxr-xr-x  6 neil neil  4096 2010-09-19 12:15 .svn/
>> > -rw-r--r--  1 neil neil 13066 2010-09-19 12:15 TestClassifier.java
>> > -rw-r--r--  1 neil neil  7660 2010-09-19 12:15 TrainClassifier.java
>> >
>> > Am I looking at the correct directory?
>> > Is there any reference on how to run this?
>> >
>> > On Thu, Sep 30, 2010 at 11:58 PM, Drew Farris <drew@apache.org> wrote:
>> >
>> >> On Thu, Sep 30, 2010 at 10:00 AM, Neil Ghosh <neil.ghosh@gmail.com> wrote:
>> >> >
>> >> > My question is: if I want to test unknown documents, do I need them
>> >> > in a specific format, or can I just keep them (as raw text) in the
>> >> > input folder while testing?
>> >>
>> >> If I interpret your question correctly, you're saying "I've trained my
>> >> classifier and tested it, now how do I use it in production?". I don't
>> >> know that this is covered by the example.
>> >>
>> >> The unit test, in core/src/test/java --
>> >> org.apache.mahout.classifier.bayes.BayesClassifierSelfTest provides a
>> >> potentially useful example. Take a look at the testSelfTestBayes()
>> >> method.
>> >>
>> >> In general, the operations involved include:
>> >>   Create an instance of Algorithm and Datastore, configure as
>> >> appropriate.
>> >>   Create an instance of ClassifierContext (named classifier) using
>> >> the Algorithm and Datastore, calling initialize() upon the context.
>> >>   Generate tokens from your input document (either individual words
>> >> or ngrams based on how the data used to train the model was
>> >> processed).
>> >>   Call classifier.classifyDocument(String[] tokens, String
>> >> defaultCat); this will return a ClassifierResult containing the top
>> >> classifications for the input document (ranked by score).
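
The token-generation step in the list above can be sketched in plain Java.
This is only a minimal illustration and not Mahout's own tokenizer: the class
name TokenSketch, the lower-casing, the split on non-alphanumeric characters,
and the space used to join the words of an n-gram are all assumptions.
Whatever you use has to match how the training data was processed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Mahout.
public class TokenSketch {

    /** Lower-cases the text and splits it on runs of non-alphanumeric characters. */
    public static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) { // split() can yield a leading empty string
                out.add(w);
            }
        }
        return out;
    }

    /**
     * Returns the n-grams of exactly gramSize words, in document order.
     * gramSize == 1 yields the individual words. The words of an n-gram
     * are joined with a single space (an assumed separator).
     */
    public static String[] tokens(String text, int gramSize) {
        if (gramSize < 1) {
            throw new IllegalArgumentException("gramSize must be >= 1");
        }
        List<String> words = words(text);
        List<String> out = new ArrayList<>();
        for (int i = 0; i + gramSize <= words.size(); i++) {
            out.add(String.join(" ", words.subList(i, i + gramSize)));
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] tokens = tokens("Mahout classifies unknown documents.", 2);
        // prints: mahout classifies|classifies unknown|unknown documents
        System.out.println(String.join("|", tokens));
    }
}
```

The resulting String[] is the kind of array you would pass as the tokens
argument of classifier.classifyDocument(String[] tokens, String defaultCat).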
>> >>
>> >> HTH,
>> >>
>> >> Drew
>> >>
>> >
>> >
>> >
>> > --
>> > Thanks and Regards
>> > Neil
>> > http://neilghosh.com
>> >
>>
>
>
>


