mahout-user mailing list archives

From Tim Peut
Subject Mahout 0.8 Random Forest Accuracy
Date Fri, 18 Oct 2013 06:48:38 GMT
Hi all,

I'm using the random forest implementation in Mahout 0.8 for
classification (org.apache.mahout.classifier.df.mapreduce.BuildForest and
org.apache.mahout.classifier.df.mapreduce.TestForest). I've run the
classifier multiple times with different parameters and different data
splits, and I consistently get an accuracy of ~0.9.

I previously used R's RRF package with exactly the same data and
consistently got an accuracy of ~0.95, about five percentage points higher
than the Mahout results. I haven't been able to work out why the two
classifiers perform differently given the same data and the same parameters.

Has anyone found that Mahout's random forest doesn't perform as well as
other implementations? If not, is there any reason why it wouldn't perform
as well?

