mahout-user mailing list archives

From "TheGeorge1918 ." <zhangxuan1...@gmail.com>
Subject the label of output of random forest in mahout
Date Wed, 22 Jul 2015 08:35:59 GMT
Hi all,

I have a question about the labels in the output of random forest. Suppose I
do a binary classification with labels 0 and 1. In my data description file,
I have something like
{"values":["1","0"],"label":true,"type":"categorical"}, so label 1 is at
index 0 and label 0 is at index 1. Is it possible that the labels are swapped
in the .out output file?

I checked the source code of the Mahout Classifier
(mr/src/main/java/org/apache/mahout/classifier/df/mapreduce/Classifier.java).
The "parseOutput" function appears to write the result directly to the file
without mapping it back to the original label.

In TestForest
(examples/src/main/java/org/apache/mahout/classifier/df/mapreduce/TestForest.java),
if I specify the -a parameter, it outputs a confusion matrix.
There, it looks like the right label is obtained by calling
dataset.getLabelString().

So, my conclusion is that the confusion matrix is always right (the
user-provided labels are used to compute it). However, the predictions in the
output file could carry a different label than the user-supplied one. Is
that right?
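
To make the question concrete, here is a minimal sketch of the mapping I
have in mind. It assumes the .out file holds the index into the "values"
array rather than the label itself, which is exactly the part I am unsure
about. LabelDecoder and decode are hypothetical names of my own, not part of
Mahout.

```java
import java.util.Arrays;
import java.util.List;

public class LabelDecoder {
    // Hypothetical helper, not part of Mahout: maps a predicted code
    // (assumed to be an index into the descriptor's "values" array)
    // back to the user-supplied label string.
    public static String decode(List<String> values, double code) {
        int index = (int) code;
        if (index < 0 || index >= values.size()) {
            throw new IllegalArgumentException("Unknown label code: " + code);
        }
        return values.get(index);
    }

    public static void main(String[] args) {
        // Same order as in the descriptor above: "values":["1","0"]
        List<String> values = Arrays.asList("1", "0");
        for (double code : new double[] {0.0, 1.0}) {
            System.out.println("code " + code + " -> label " + decode(values, code));
        }
    }
}
```

With this descriptor, a raw 0 in the output would stand for label "1" and a
raw 1 for label "0", which is the swap I am worried about.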

Thanks a lot

Best

Xuan
