spark-user mailing list archives

From Sean Owen
Subject Re: [mllib] Decision Tree - prediction probabilites of label classes
Date Thu, 22 Jan 2015 10:50:14 GMT
You are right that this isn't implemented. I presume you could propose
a PR for this. The impurity calculator implementations already receive
category counts. The only drawback I see is having to store N
probabilities at each leaf, not 1.
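
To make the arithmetic in the question below concrete, here is a minimal sketch in plain Python (not MLlib code) of turning the per-label counts at a node into one probability per class. The helper name `class_probabilities` is hypothetical and does not exist in MLlib; it only assumes that the counts for the node are available as a list indexed by label.

```python
from fractions import Fraction


def class_probabilities(counts):
    """Turn per-label sample counts at a node into per-label probabilities.

    Hypothetical helper for illustration only; not an MLlib API.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("node has no samples")
    # Exact fractions keep the example easy to check by hand.
    return [Fraction(c, total) for c in counts]


# Counts from the example: 4 of label 0.0, 2 of label 1.0, 3 of label 2.0.
counts = [4, 2, 3]
probs = class_probabilities(counts)

# The predicted label is the one with the largest count; its probability
# is the single value that is reported today.
predicted = max(range(len(counts)), key=counts.__getitem__)
predicted_prob = probs[predicted]
```

With these counts the predicted label is 0 with probability 4/9, and the other two entries are 2/9 and 3/9 (the latter reduces to 1/3). Storing the whole list is what costs N values per leaf instead of 1.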

On Wed, Jan 21, 2015 at 3:36 PM, Zsolt Tóth wrote:
> Hi,
> I use DecisionTree for multi-class classification.
> I can get the probability of the predicted label for every node in the
> decision tree from node.predict().prob(). Is it possible to retrieve or
> count the probability of every possible label class in the node?
> To be more clear:
> Say in Node A there are 4 of label 0.0, 2 of label 1.0 and 3 of label 2.0.
> If I'm correct, predict.prob() is 4/9 in this case. I need the values 2/9 and
> 3/9 for the 2 other labels.
> It would be great to retrieve the exact count of label classes ([4,2,3] in
> the example) but I don't think that's possible now. Is something like this
> planned for a future release?
> Thanks!
