spark-user mailing list archives

From Sean Owen <>
Subject Re: Naive Bayes classification confidence
Date Thu, 20 Nov 2014 11:10:07 GMT
Yes, certainly you need to consider the problem of how and when you
update the model with new info. The principle is the same. Low or high
posteriors aren't wrong per se. It seems normal in fact that one class
is more probable than others, maybe a lot more.

On Thu, Nov 20, 2014 at 10:31 AM, jatinpreet <> wrote:
> Thanks a lot Sean. You are correct in assuming that my examples fall under a
> single category.
> It is interesting to see that the posterior probability can actually be
> treated as something stable enough to support a constant per-class
> threshold. It would, I assume, keep changing for a sample as I add/remove
> documents in the training set, and thus warrant a corresponding change in
> the threshold.
> Also, I have seen the class prediction probabilities range from 0.003 to
> 0.8 for correct classifications in my sample data. This is a wide spectrum,
> so is there a way to narrow it? Maybe by replicating the samples for the
> classes where I get low-confidence but accurate classifications.
> Thanks,
> Jatin
> -----
> Novice Big Data Programmer
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

