spark-user mailing list archives

From Mathieu D <matd...@gmail.com>
Subject Re: Get both feature importance and ROC curve from a random forest classifier
Date Wed, 06 Jul 2016 13:26:50 GMT
Well, that sounds trivial now!
Thanks ;-)

2016-07-02 10:04 GMT+02:00 Yanbo Liang <ybliang8@gmail.com>:

> Hi Mathieu,
>
> With the new ml package you can train a RandomForestClassificationModel and
> read the feature importances from it. You can then convert the predictions
> to an RDD and feed them into BinaryClassificationMetrics to get the ROC
> curve. Refer to the following code snippet:
>
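> import org.apache.spark.ml.classification.RandomForestClassifier
> // Spark 2.x and later; on Spark 1.x use org.apache.spark.mllib.linalg.Vector
> import org.apache.spark.ml.linalg.Vector
> import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
> import org.apache.spark.sql.Row
>
> val rf = new RandomForestClassifier()
> val model = rf.fit(trainingData)
>
> // Feature importances are available directly on the ml model.
> val importances = model.featureImportances
>
> val predictions = model.transform(testData)
>
> // Pair each row's positive-class score with its label.
> val scoreAndLabels =
>   predictions.select(model.getRawPredictionCol, model.getLabelCol).rdd.map {
>     case Row(rawPrediction: Vector, label: Double) => (rawPrediction(1), label)
>     case Row(rawPrediction: Double, label: Double) => (rawPrediction, label)
>   }
> val metrics = new BinaryClassificationMetrics(scoreAndLabels)
> metrics.roc()  // RDD of (false positive rate, true positive rate) points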
>
>
> Thanks
> Yanbo
>
> 2016-06-15 7:13 GMT-07:00 matd <matdpro@gmail.com>:
>
>> Hi ml folks !
>>
>> I'm using a Random Forest for a binary classification.
>> I'm interested in getting both the ROC *curve* and the feature importance
>> from the trained model.
>>
>> Unless I'm missing something obvious, the ROC curve is only available in
>> the old mllib package, via BinaryClassificationMetrics. In the new ml
>> package, only areaUnderROC and areaUnderPR are available, through
>> BinaryClassificationEvaluator.
>>
>> Feature importance is only available in the ml package, through
>> RandomForestClassificationModel.
>>
>> Any idea how to get both?
>>
>> Mathieu
>>
>

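To make the two outputs discussed in this thread concrete, here is a minimal sketch in plain Scala, with no Spark dependency. It is not how Spark implements `BinaryClassificationMetrics.roc()`. It only illustrates the kind of (false positive rate, true positive rate) points such a curve contains, and how an importance array (in Spark, `model.featureImportances.toArray`) could be paired with feature names. The object `RocSketch`, the helpers `rocPoints` and `rankFeatures`, and the sample feature names are all hypothetical and introduced purely for illustration.

```scala
object RocSketch {
  // Hypothetical helper: builds ROC points from (score, label) pairs,
  // treating label 1.0 as the positive class and anything else as negative.
  def rocPoints(scoreAndLabels: Seq[(Double, Double)]): Seq[(Double, Double)] = {
    val positives = scoreAndLabels.count(_._2 == 1.0).toDouble
    val negatives = scoreAndLabels.size - positives
    require(positives > 0 && negatives > 0, "need at least one example of each class")

    // One threshold per distinct score, highest score first.
    val byThreshold = scoreAndLabels.groupBy(_._1).toSeq.sortWith(_._1 > _._1)

    // Running (false positives, true positives) as the threshold is lowered,
    // starting from (0, 0) where nothing is predicted positive.
    val counts = byThreshold.scanLeft((0.0, 0.0)) { case ((fp, tp), (_, rows)) =>
      val pos = rows.count(_._2 == 1.0)
      (fp + (rows.size - pos), tp + pos)
    }

    // Normalise counts into (false positive rate, true positive rate).
    counts.map { case (fp, tp) => (fp / negatives, tp / positives) }
  }

  // Hypothetical helper: pairs feature names with importances, largest first.
  def rankFeatures(names: Seq[String], importances: Array[Double]): Seq[(String, Double)] = {
    require(names.size == importances.length, "one name per importance value")
    names.zip(importances.toSeq).sortWith(_._2 > _._2)
  }
}
```

With Spark, the input to `rocPoints` would correspond to the collected `scoreAndLabels` pairs from the snippet quoted above. For anything beyond a small test set, the built-in `metrics.roc()` is the better choice because it runs distributed.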