spark-user mailing list archives

From cjwang
Subject Garbage stats in Random Forest leaf node?
Date Tue, 17 Mar 2015 00:19:43 GMT
I dumped the trees in a random forest model and occasionally saw a leaf
node with strange stats:

- pred=1.000000 prob=0.800000 imp=-1.000000

Here impurity = -1 and gain = a giant negative number.  Normally, I would
get a None from Node.stats at a leaf node.  Here it printed because the
Some(s) case matched:

	    node.stats match {
	        // Leaves normally carry no stats, so this branch should only fire for internal nodes
	        case Some(s) => println(" imp=%f gain=%f" format(s.impurity, s.gain))
	        case None => println()
	    }

Is it a bug?

This doesn't seem to happen in models from DecisionTree, but my data sets
are limited.
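
For anyone trying to reproduce this, here is a minimal sketch of the kind of dump that would surface such leaves. It uses stand-in case classes rather than the real MLlib types: TreeNode, Stats, and DumpTree are hypothetical names, not Spark APIs, and only loosely mirror the fields used in the snippet above.

```scala
// Stand-in types (hypothetical, NOT Spark classes); they only carry the
// fields that the dump needs.
final case class Stats(impurity: Double, gain: Double)

final case class TreeNode(
    id: Int,
    predict: Double,
    prob: Double,
    stats: Option[Stats],
    left: Option[TreeNode] = None,
    right: Option[TreeNode] = None) {
  // A node with no children is treated as a leaf.
  def isLeaf: Boolean = left.isEmpty && right.isEmpty
}

object DumpTree {
  // Render one node; a leaf that still carries stats is flagged.
  def describe(node: TreeNode): String = {
    val base = "pred=%f prob=%f".format(node.predict, node.prob)
    node.stats match {
      case Some(s) =>
        val flag = if (node.isLeaf) " <-- leaf with stats" else ""
        base + " imp=%f gain=%f".format(s.impurity, s.gain) + flag
      case None => base
    }
  }

  // Depth-first dump, one line per node, indented by depth.
  def dump(node: TreeNode, depth: Int = 0): Seq[String] = {
    val line = ("  " * depth) + describe(node)
    line +: (node.left.toSeq ++ node.right.toSeq).flatMap(dump(_, depth + 1))
  }

  def main(args: Array[String]): Unit = {
    // Made-up example tree: one normal leaf and one leaf with odd stats.
    val odd = TreeNode(3, 1.0, 0.8, Some(Stats(-1.0, Double.MinValue)))
    val ok = TreeNode(2, 0.0, 0.9, None)
    val root = TreeNode(1, 1.0, 0.6, Some(Stats(0.48, 0.1)), Some(ok), Some(odd))
    dump(root).foreach(println)
  }
}
```

Flagging on "leaf and stats present" rather than on impurity = -1 avoids hard-coding a sentinel value whose meaning is not confirmed here.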
