mahout-user mailing list archives

From Ted Dunning <>
Subject Re: Editing Dictionary Vector Generated
Date Fri, 04 Oct 2013 10:51:01 GMT
Why do you say that this is unacceptable?

If the phrase is the most common way that the word English is used, this isn't such a bad

In general, with machine learning, the idea is to let the data speak. If the data say something
you don't like, you have to be careful about contradicting it. 

That said, you might be happier with something other than naive Bayes classifiers (which I
am guessing you are using). For instance, with regularized logistic regression, if the bigram
is sufficiently predictive then the model will prefer to put zero weight on the constituent
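
To make the logistic regression point concrete, here is a minimal sketch in plain Python. It is not Mahout code and not a definitive implementation. Everything in it is an assumption made for illustration: the eight toy documents, the three tracked features, the simplified Bernoulli-style per-feature log-odds standing in for naive Bayes, and the penalty `lam`, learning rate `lr` and step count. It trains an L1-regularized logistic regression by proximal gradient descent and lets you compare the weight it learns for "english" with the per-feature log-odds.

```python
import math

# Hypothetical toy corpus (an assumption): label 1 = positive, 0 = negative.
DOCS = [
    ("anti english protest", 0),
    ("anti english slogans", 0),
    ("anti english rally", 0),
    ("anti english march", 0),
    ("english lesson", 1),
    ("great service", 1),
    ("lovely weather", 1),
    ("friendly staff", 1),
]
FEATURES = ["anti", "english", "anti english"]  # two unigrams and their bigram


def featurize(text):
    """Binary presence of each tracked unigram/bigram in the text."""
    tokens = text.lower().split()
    grams = set(tokens) | {" ".join(pair) for pair in zip(tokens, tokens[1:])}
    return [1.0 if f in grams else 0.0 for f in FEATURES]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nb_log_odds(feature):
    """Laplace-smoothed log of P(feature | positive) / P(feature | negative)."""
    idx = FEATURES.index(feature)
    pos = [featurize(t)[idx] for t, y in DOCS if y == 1]
    neg = [featurize(t)[idx] for t, y in DOCS if y == 0]
    p_pos = (sum(pos) + 1.0) / (len(pos) + 2.0)
    p_neg = (sum(neg) + 1.0) / (len(neg) + 2.0)
    return math.log(p_pos / p_neg)


def train_l1_logreg(lam=0.01, lr=0.5, steps=20000):
    """Proximal gradient descent on mean log loss; the bias is not penalized."""
    X = [featurize(t) for t, _ in DOCS]
    y = [label for _, label in DOCS]
    w, b = [0.0] * len(FEATURES), 0.0
    for _ in range(steps):
        # Prediction error for every document under the current weights.
        errs = [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - yi
                for x, yi in zip(X, y)]
        b -= lr * sum(errs) / len(X)
        for j in range(len(w)):
            grad = sum(e * x[j] for e, x in zip(errs, X)) / len(X)
            v = w[j] - lr * grad
            # Soft-thresholding is the L1 step: small weights snap to zero.
            w[j] = math.copysign(max(abs(v) - lr * lam, 0.0), v)
    return dict(zip(FEATURES, w)), b


def predict(weights, bias, text):
    """Probability that the text is positive."""
    x = featurize(text)
    return sigmoid(sum(weights[f] * xi for f, xi in zip(FEATURES, x)) + bias)


if __name__ == "__main__":
    weights, bias = train_l1_logreg()
    print("NB-style log-odds for 'english':", nb_log_odds("english"))
    print("L1 logistic weights:", weights, "bias:", bias)
    print("P(positive | 'english lesson'):", predict(weights, bias, "english lesson"))
```

In this toy data "english" shows up mostly inside "anti english", so the per-feature log-odds for it come out negative. The regularized model can explain the negative documents through "anti" and the bigram, so the weight on "english" should shrink to zero or close to it. Note that "anti" and the bigram always co-occur here, so they share the weight between them; real data would separate them.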

Sent from my iPhone

On Oct 4, 2013, at 9:50, Puneet Arora <> wrote:

> anti is marked as negative, which is also acceptable, but
> it is also taking English as negative, which is not acceptable
