spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: [mllib] Decision Tree - prediction probabilites of label classes
Date Thu, 22 Jan 2015 10:50:14 GMT
You are right that this isn't implemented. I presume you could propose
a PR for this. The impurity calculator implementations already receive
category counts. The only drawback I see is having to store N
probabilities at each leaf (one per class), not 1.
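
To illustrate what storing the counts would give you, here is a minimal
sketch in plain Python using the [4,2,3] example from the question below.
It is not MLlib code: class_probabilities is a hypothetical helper, and it
assumes the per-class counts at a node are available as a simple list.

```python
from fractions import Fraction


def class_probabilities(counts):
    """Turn per-class label counts at a node into per-class probabilities.

    Hypothetical helper for illustration only; not part of MLlib.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("node has no samples")
    # One probability per class: this is the N values a leaf would store.
    return [Fraction(c, total) for c in counts]


# Node A from the question: 4 of label 0.0, 2 of label 1.0, 3 of label 2.0.
counts = [4, 2, 3]
probs = class_probabilities(counts)

# The predicted label is the class with the largest count (index 0 here),
# and its probability is the single value currently exposed.
predicted = max(range(len(counts)), key=counts.__getitem__)
predicted_prob = probs[predicted]
```

Exact fractions are used so the values match the 4/9, 2/9 and 3/9 in the
question; a real implementation would more likely store doubles.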

On Wed, Jan 21, 2015 at 3:36 PM, Zsolt Tóth <toth.zsolt.bme@gmail.com> wrote:
> Hi,
>
> I use DecisionTree for multiclass classification.
> I can get the probability of the predicted label for every node in the
> decision tree from node.predict().prob(). Is it possible to retrieve or
> compute the probability of every possible label class in the node?
> To be more clear:
> Say in Node A there are 4 of label 0.0, 2 of label 1.0 and 3 of label 2.0.
> If I'm correct, predict.prob() is 4/9 in this case. I need the values 2/9
> and 3/9 for the 2 other labels.
>
> It would be great to retrieve the exact count of label classes ([4,2,3] in
> the example) but I don't think that's possible now. Is something like this
> planned for a future release?
>
> Thanks!
