mahout-user mailing list archives

From Robin Anil <robin.a...@gmail.com>
Subject Re: Multiclass classifier - hunting for a small one
Date Mon, 17 Jan 2011 06:37:47 GMT
I would say you don't need anything fancy.

Use the Complementary Naive Bayes classifier. Put the high-frequency words
(stop words) from the various languages into the Bayes input format and train
the model (a very small model gets generated). The classifier is surprisingly
accurate. I have used it for many projects and have never needed to tweak
anything.
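
To make the idea concrete, here is a rough sketch of complement naive Bayes trained only on stop words. It is plain Python, not Mahout's implementation or API. The stop-word lists are tiny hand-picked samples, and the names `STOP_WORDS`, `train`, `classify` and the smoothing value `alpha` are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical, tiny stop-word samples; real lists would be much longer.
STOP_WORDS = {
    "en": "the of and to in is that it was for on with as be at by this have from or".split(),
    "de": "der die und in den von zu das mit sich des auf für ist im dem nicht ein eine als".split(),
    "fr": "de la le et les des en un du une que est pour qui dans par sur au pas plus".split(),
}


def train(docs, alpha=1.0):
    """Return {label: {token: log-weight}} estimated from the *other* labels' counts."""
    counts = {label: Counter(tokens) for label, tokens in docs.items()}
    total = Counter()
    for c in counts.values():
        total.update(c)
    vocab = set(total)
    grand_total = sum(total.values())

    weights = {}
    for label, c in counts.items():
        # Token mass of every class except this one, with additive smoothing.
        denom = (grand_total - sum(c.values())) + alpha * len(vocab)
        weights[label] = {
            t: math.log((total[t] - c[t] + alpha) / denom) for t in vocab
        }
    return weights


def classify(weights, text):
    """Pick the label whose complement explains the text worst."""
    tokens = text.lower().split()

    def complement_score(label):
        w = weights[label]
        # Tokens outside the stop-word vocabulary are ignored.
        return sum(w[t] for t in tokens if t in w)

    return min(weights, key=complement_score)


model = train(STOP_WORDS)
print(classify(model, "the cat is on the table"))
print(classify(model, "der hund ist nicht auf dem tisch"))
print(classify(model, "le chat est sur la table"))
```

The model is just one small weight table per language, which is why it stays tiny. Text containing no known stop word gives every label the same score, so a real setup would want longer lists and a fallback for that case.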

Robin


On Mon, Jan 17, 2011 at 8:50 AM, Ted Dunning <ted.dunning@gmail.com> wrote:

> TIKA-369 is still open.  Apparently the new code isn't committed yet.
>
> On Sun, Jan 16, 2011 at 7:15 PM, Lance Norskog <goksron@gmail.com> wrote:
>
> > https://issues.apache.org/jira/browse/SOLR-1979
> >
> > Nice.  How effective is the Tika language stuff?
> >
> > On Fri, Jan 14, 2011 at 3:13 PM, Grant Ingersoll <gsingers@apache.org>
> > wrote:
> > > And, there is a patch that is close to being committed for Solr.
> > >
> > > On Jan 14, 2011, at 11:33 AM, Ted Dunning wrote:
> > >
> > >> Tika has a classifier which I think has been updated to use competitive
> > >> techniques.
> > >>
> > >> See https://issues.apache.org/jira/browse/TIKA-369 for details.
>
