spark-user mailing list archives

From jatinpreet <jatinpr...@gmail.com>
Subject Re: Naive Baye's classification confidence
Date Thu, 20 Nov 2014 12:04:10 GMT
Sean,

My last sentence didn't come out right. Let me try to explain my question
again.

For instance, I have two categories, C1 and C2. I have trained the model on
100 samples of C1 and 10 samples of C2.

Now, I predict two samples, S1 and S2, which belong to C1 and C2
respectively. I get the following prediction results:

S1=> Category: C1, Probability: 0.7
S2=> Category: C2, Probability: 0.04

Now, both predictions are correct, but their probabilities are far apart.
Can I improve the prediction probability by taking the 10 samples I have of
C2 and replicating each of them 10 times, making the total count equal to
100, the same as C1?

Can I expect this to increase the probability of sample S2 after training
on the new set? Is this a viable approach?
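
To make the proposed experiment concrete, here is a minimal sketch in plain
Python. It is not Spark MLlib code: the toy documents, the hand-written
multinomial Naive Bayes, and the names train_nb, posterior, and s2 are all
assumptions made up for illustration. It trains once on the 100/10 split and
once after replicating each C2 sample 10 times, then compares the posterior
that a C2-like sample receives.

```python
import math
from collections import Counter


def train_nb(docs, labels, alpha=1.0):
    """Fit a tiny multinomial Naive Bayes with additive (Laplace) smoothing."""
    vocab = sorted({w for d in docs for w in d})
    n = len(labels)
    log_prior, log_lik = {}, {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        # Class prior is the fraction of training samples with this label.
        log_prior[c] = math.log(len(class_docs) / n)
        counts = Counter(w for d in class_docs for w in d)
        total = sum(counts.values()) + alpha * len(vocab)
        log_lik[c] = {w: math.log((counts[w] + alpha) / total) for w in vocab}
    return log_prior, log_lik


def posterior(model, doc):
    """Return normalized class probabilities for one tokenized document."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(log_lik[c][w] for w in doc if w in log_lik[c])
        for c in log_prior
    }
    top = max(scores.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Hypothetical toy data: 100 samples of C1 and 10 samples of C2.
c1_docs = [["a", "a", "b"]] * 100
c2_docs = [["a", "b", "b"]] * 10
s2 = ["b", "b"]  # a sample that looks like C2

# Original, imbalanced training set.
model_before = train_nb(c1_docs + c2_docs, ["C1"] * 100 + ["C2"] * 10)

# Replicate each C2 sample 10 times so that both classes have 100 samples.
c2_replicated = c2_docs * 10
model_after = train_nb(c1_docs + c2_replicated, ["C1"] * 100 + ["C2"] * 100)

p_before = posterior(model_before, s2)["C2"]
p_after = posterior(model_after, s2)["C2"]
print("P(C2 | S2) before replication:", round(p_before, 3))
print("P(C2 | S2) after replication: ", round(p_after, 3))
```

In this toy setup the C2 posterior goes up after replication, but mostly
because the class prior moves from 10/110 to 100/200. The per-word
likelihoods barely change, since the duplicated samples carry no new
information. Whether the same holds for real data is not something this
sketch can show.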

Thanks,
Jatin



-----
Novice Big Data Programmer