mahout-user mailing list archives

From Adam Baron <adam.j.ba...@gmail.com>
Subject Re: How to classify an individual file after training
Date Thu, 14 Mar 2013 17:00:18 GMT

Frederic,

The ability to classify new text on a go-forward basis against an existing
Naïve Bayes model would be a very helpful addition to Mahout.  I found your
blog post informative, and I'm sure many other Mahout classification users
have faced challenges similar to ours.

Regards,
         Adam

On Wed, Mar 13, 2013 at 6:29 PM, Frederic Dang Ngoc <
frederic.dangngoc@gmail.com> wrote:

> BS TLC <bstlc <at> ymail.com> writes:
>
> > Does anyone have a working piece of code for classifying individual
> > documents after training the naive bayes model?
> >
> > In the past, the class org.apache.mahout.classifier.Classify did this
> > job, but i haven't found any equivalent working on the current version.
> > Thanks
> >
> > > That's exactly what I was trying to do, by running
> > > TestNewsGroups.java, as I explained in my last post.
> > > Here's the code again with the stack trace. There's something wrong I'm
> > > doing while loading up the model (and I can't load up the Naive Bayes,
> > > see code)
> > >
> > > Thanks
> > >
> > > https://gist.github.com/anonymous/4720473
>
> Hi,
>
> I have just written a post on my blog to describe how to train the model
> and use it to classify new documents:
>
> https://chimpler.wordpress.com/2013/03/13/using-the-mahout-naive-bayes-classifier-to-automatically-classify-twitter-messages/
>
> To classify new documents, you'll need the following files from HDFS:
> - labelindex
> - model directory with the file naiveBayesModel.bin in it
> - dictionary.file-0 (in the vectors directory)
> - df-count (in the vectors directory)
>
> I use the following code to classify new documents using those files:
> https://github.com/fredang/mahout-naive-bayes-example/blob/master/src/main/java/com/chimpler/example/bayes/Classifier.java
>
> Hope that it helps.
>
> Frederic
>
>
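
For anyone wondering how the four files Frederic lists fit together, here is
a rough plain-Java sketch of the general idea. It does not call Mahout and is
not the code from Classifier.java: the in-memory maps are hypothetical
stand-ins for dictionary.file-0 (term to feature index), df-count (feature
index to document frequency), labelindex (label id to name) and the model
weights, and both the TF-IDF formula and the scoring rule (which ignores class
priors) are simplifying assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: toy stand-ins for the files listed above, no Mahout calls.
final class ClassifySketch {

    /** Turns tokens into a sparse TF-IDF vector (feature index -> weight). */
    static Map<Integer, Double> vectorize(List<String> tokens,
                                          Map<String, Integer> dictionary,
                                          Map<Integer, Long> docFreq,
                                          long numDocs) {
        // Count occurrences of each known term; terms missing from the
        // dictionary were never seen in training and are skipped.
        Map<Integer, Integer> termCounts = new HashMap<>();
        for (String token : tokens) {
            Integer index = dictionary.get(token);
            if (index != null) {
                termCounts.merge(index, 1, Integer::sum);
            }
        }
        Map<Integer, Double> vector = new HashMap<>();
        for (Map.Entry<Integer, Integer> e : termCounts.entrySet()) {
            long df = docFreq.getOrDefault(e.getKey(), 1L);
            // Assumed weighting: tf * (ln(N / df) + 1). Mahout's own formula may differ.
            double idf = Math.log((double) numDocs / df) + 1.0;
            vector.put(e.getKey(), e.getValue() * idf);
        }
        return vector;
    }

    /** Returns the label whose weighted log-probability sum is highest. */
    static String classify(Map<Integer, Double> vector,
                           double[][] logWeights,
                           Map<Integer, String> labels) {
        int bestLabel = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int label = 0; label < logWeights.length; label++) {
            double score = 0.0;
            for (Map.Entry<Integer, Double> e : vector.entrySet()) {
                score += e.getValue() * logWeights[label][e.getKey()];
            }
            if (score > bestScore) {
                bestScore = score;
                bestLabel = label;
            }
        }
        return labels.get(bestLabel);
    }

    /** Runs the two steps against a tiny hard-coded model (made-up numbers). */
    static String classifyWithToyModel(String... tokens) {
        Map<String, Integer> dictionary = Map.of("hadoop", 0, "goal", 1, "cluster", 2);
        Map<Integer, Long> docFreq = Map.of(0, 2L, 1, 2L, 2, 1L);
        Map<Integer, String> labels = Map.of(0, "tech", 1, "sport");
        double[][] logWeights = {
            {Math.log(0.5), Math.log(0.1), Math.log(0.4)},  // tech
            {Math.log(0.1), Math.log(0.8), Math.log(0.1)}   // sport
        };
        Map<Integer, Double> vector = vectorize(List.of(tokens), dictionary, docFreq, 4L);
        return classify(vector, logWeights, labels);
    }

    public static void main(String[] args) {
        System.out.println(classifyWithToyModel("hadoop", "cluster", "hadoop"));
        System.out.println(classifyWithToyModel("goal", "goal"));
    }
}
```

With the toy numbers above, the first call picks "tech" and the second picks
"sport". In a real setup the maps and weights would be read from the HDFS
files rather than hard-coded, and the scoring would be left to Mahout's own
classifier classes, as in Frederic's linked code.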
