spark-user mailing list archives

From Joseph Bradley <jos...@databricks.com>
Subject Re: [mllib] Decision Tree - prediction probabilities of label classes
Date Sun, 25 Jan 2015 01:16:34 GMT
There is a JIRA...but not a PR yet.  Here's the JIRA:
https://issues.apache.org/jira/browse/SPARK-3727

I'm not aware of current work on it, but I agree it would be nice to have!
Joseph

On Thu, Jan 22, 2015 at 2:50 AM, Sean Owen <sowen@cloudera.com> wrote:

> You are right that this isn't implemented. I presume you could propose
> a PR for this. The impurity calculator implementations already receive
> category counts. The only drawback I see is having to store N
> probabilities at each leaf, not 1.
>
> On Wed, Jan 21, 2015 at 3:36 PM, Zsolt Tóth <toth.zsolt.bme@gmail.com>
> wrote:
> > Hi,
> >
> > I use DecisionTree for multi class classification.
> > I can get the probability of the predicted label for every node in the
> > decision tree from node.predict().prob(). Is it possible to retrieve or
> > count the probability of every possible label class in the node?
> > To be more clear:
> > Say in Node A there are 4 of label 0.0, 2 of label 1.0 and 3 of label 2.0.
> > If I'm correct predict.prob() is 4/9 in this case. I need the values 2/9 and
> > 3/9 for the 2 other labels.
> >
> > It would be great to retrieve the exact count of label classes ([4,2,3] in
> > the example) but I don't think that's possible now. Is something like this
> > planned for a future release?
> >
> > Thanks!
>
>
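
For reference, the arithmetic the original question asks for is simple once per-class counts are available at a node. Below is a minimal sketch in plain Python that does not use any MLlib API. The names `class_probabilities` and `node_a_counts` are hypothetical, and the counts [4, 2, 3] are taken from the Node A example in the thread. It does not show how to get those counts out of a trained tree, which is the part the thread says is not implemented.

```python
from fractions import Fraction


def class_probabilities(counts):
    """Convert per-class label counts at a node into exact probabilities.

    `counts[i]` is the number of training samples with label i that
    reached the node (hypothetical input; not exposed by MLlib here).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("node has no samples")
    return [Fraction(c, total) for c in counts]


# Node A from the thread: 4 of label 0.0, 2 of label 1.0, 3 of label 2.0.
node_a_counts = [4, 2, 3]
probs = class_probabilities(node_a_counts)

# The predicted label is the class with the highest probability.
predicted_label = max(range(len(probs)), key=lambda i: probs[i])

for label, p in enumerate(probs):
    print(f"label {float(label)}: {p} ({float(p):.3f})")
```

Note that `Fraction` reduces to lowest terms, so 3/9 is reported as 1/3. Storing the full list per leaf is the "N probabilities, not 1" cost mentioned earlier in the thread.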
