mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: OnlineLogisticRegression sgd, calculating confidence value
Date Mon, 28 Jul 2014 20:39:15 GMT

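Your impression is correct for classifyFull (classify returns only n-1 scores, omitting
category 0, so its values need not total 1.0). A single 1 in the output indicates that the
classifier has extremely high confidence.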

Increasing lambda should eventually make the scores degrade to equal scores for each category.
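
To illustrate both points, here is a minimal sketch in plain Python (not Mahout code). It
assumes the scores come from a softmax over per-category raw scores, and it models stronger
regularization crudely as a hypothetical `shrink` factor applied to those raw scores. The
numbers and names are made up for illustration.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores (weights dot features) for 5 categories.
raw = [40.0, 2.0, -3.0, 1.0, 0.5]

# shrink stands in for the effect of a larger lambda pulling weights toward 0.
for shrink in (1.0, 0.1, 0.01, 0.0):
    scores = softmax([shrink * z for z in raw])
    print(shrink, [round(s, 3) for s in scores])
```

With large raw scores the winning category's score is indistinguishable from 1 in floating
point and the rest are effectively 0. As the raw scores shrink toward 0, the output moves
toward equal scores for each category.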
 

Since that isn't happening, I think that there may be something else going on. Have you tested
with synthetic data? Can you post sample code?

Sent from my iPhone

> On Jul 28, 2014, at 13:53, Nicholas Demusz <nicholas.demusz@gmail.com> wrote:
> 
> Hi,
> I am trying to do some classification with Mahout's
> OnlineLogisticRegression. I've built a model and trained it on 5
> categories of interest to me. However, I was under the impression that the
> classify() and classifyFull() methods would return a vector of floats that
> totaled 1.0. Instead, I get a vector back that has a 1 only in the
> index position of the category that it thinks the item belongs in. Is
> this the normal behavior? I have about 500 training items for each
> category. I've played with the value of lambda some, but the output doesn't change.
> 
> If this is the intended outcome, could someone point me to a way to provide
> a confidence value for items that I classify, or should I be looking at a
> recommender?
> 
> My goal is to have some sort of confidence score to indicate the level of
> certainty that this is what it says it is, as well as put the exemplar data
> into a category.
