spark-user mailing list archives

From: 陈哲 <czhenj...@gmail.com>
Subject: How to Improve Random Forest classifier accuracy
Date: Thu, 18 Aug 2016 08:31:57 GMT

Hi all,

I am using the Spark ML Random Forest classifier. My data has only two label
categories (1 and 0), about 30 features, and over 100,000 records. I ran the
Spark JavaRandomForestClassifierExample code, modified to print more detailed
results, and the model produced the following:

Test Error = 0.022321731460750338
Prediction results label = 1 count:951
Prediction results label = 0 count:13788
Prediction results predictedLabel = 1 and label = 1 count:682
Prediction results predictedLabel = 1 and label = 0 count:60
Prediction results predictedLabel = 0 and label = 1 count:269
Prediction Right = 0.7171398527865405
Prediction Miss= 0.28286014721345953
Prediction Wrong= 0.004351610095735422
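
For reference, the ratios above can be reproduced from the raw counts. The
following is only a minimal sketch, not the code from the Spark example: the
class name ConfusionMetrics and its method names are made up for illustration,
and the true-negative count is assumed to be the label = 0 count minus the 60
rows predicted as 1.

```java
// Illustrative sketch: ConfusionMetrics is a hypothetical helper, not a Spark class.
public class ConfusionMetrics {

    // Share of label = 1 rows predicted as 1 ("Prediction Right").
    public static double recall(long tp, long fn) {
        return (double) tp / (tp + fn);
    }

    // Share of label = 1 rows predicted as 0 ("Prediction Miss").
    public static double missRate(long tp, long fn) {
        return (double) fn / (tp + fn);
    }

    // Share of label = 0 rows predicted as 1 ("Prediction Wrong").
    public static double falsePositiveRate(long fp, long tn) {
        return (double) fp / (fp + tn);
    }

    // Share of rows predicted as 1 that really are label = 1 (not printed above).
    public static double precision(long tp, long fp) {
        return (double) tp / (tp + fp);
    }

    // Share of all rows that were misclassified ("Test Error").
    public static double errorRate(long tp, long fp, long fn, long tn) {
        return (double) (fp + fn) / (tp + fp + fn + tn);
    }

    public static void main(String[] args) {
        long tp = 682;         // predictedLabel = 1 and label = 1
        long fp = 60;          // predictedLabel = 1 and label = 0
        long fn = 269;         // predictedLabel = 0 and label = 1
        long tn = 13788 - fp;  // assumption: remaining label = 0 rows were predicted as 0

        System.out.println("Test Error       = " + errorRate(tp, fp, fn, tn));
        System.out.println("Prediction Right = " + recall(tp, fn));
        System.out.println("Prediction Miss  = " + missRate(tp, fn));
        System.out.println("Prediction Wrong = " + falsePositiveRate(fp, tn));
        System.out.println("Precision        = " + precision(tp, fp));
    }
}
```

Seen this way, the low test error mostly reflects the large label = 0 class,
while "Prediction Right" is the recall on the much smaller label = 1 class.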

I need some advice on how to improve the accuracy. I tried changing classifier
parameters such as maxDepth and maxBins, but the results did not change much.
Do I need to provide more features, or are there other ways to improve this?

Thanks
