mahout-user mailing list archives

From Abdelhakim Deneche <>
Subject Re: Decision Forest/Partial Implementation TestForest Error
Date Thu, 06 Sep 2012 06:22:46 GMT
Hi Nick,

This is not a memory problem. The classifier is trying to load the trained forest but is reading
unexpected values. This problem has never occurred before! Could the forest files be corrupted?
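
One quick way to check whether the loader is reading unexpected bytes is to look at the out-of-range index itself (946827879 in the trace below). The sketch here is only a diagnostic aid, not part of Mahout: the class name IndexBytes and the helper asAscii are made up for illustration. It assumes the value came from a big-endian 32-bit read, which is how java.io.DataInput.readInt behaves, and prints the four bytes as ASCII.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not a Mahout class: decodes an int as four big-endian bytes.
class IndexBytes {

    /** Renders the four bytes of value as ASCII; non-printable bytes become '.'. */
    static String asAscii(int value) {
        // ByteBuffer is big-endian by default, matching DataInput.readInt().
        byte[] bytes = ByteBuffer.allocate(4).putInt(value).array();
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            char c = (char) (b & 0xFF);
            sb.append(c >= 0x20 && c < 0x7F ? c : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int index = 946827879; // the index reported by ArrayIndexOutOfBoundsException
        System.out.println("0x" + Integer.toHexString(index) + " -> " + asAscii(index));
        // prints: 0x386f7267 -> 8org
    }
}
```

The bytes decode to the characters "8org", which looks more like a fragment of text than a node type or an array size. That would be consistent with the loader reading something that is not a serialized forest, or reading from the wrong position, although it does not prove either.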

Try training the forest once again, and this time use the sequential classifier (don't use
the -mr parameter) and see if the problem still occurs.

On 5 sept. 2012, at 23:00, Nick Jordan <> wrote:

> Hello All,
> I'm playing around with decision forests using the partial
> implementation and my own data set.  I am getting an error with
> TestForest, but only for certain forests that I'm building with
> BuildForest.  Using the same descriptor and same build and test data
> sets I get no error if I set mapred.max.split.size=1890528 which is
> roughly 1/100th the size of the build data set.  I can build the
> forest and test the remaining data and get the results with no
> problem.  When I change the split size to 18905280, everything still
> appears to work fine when building the forest, but when I then try to
> test the remaining data I get the error below.
> I've dug around the code a little, but nothing stood out as to why the
> array index would go out of bounds at that specific value.  One obvious
> solution is not to create partitions that large, but if the problem were
> that I was running out of memory I would have expected an out-of-memory
> error and not an index past the bounds of an array.  I'd
> prefer larger partitions, and thus fewer of them, and I can move
> this job to something like EMR, which should give me
> more memory, but I want to understand the nature of the error.
> For what it is worth I'm running this on hadoop-1.0.3 and mahout-0.8-SNAPSHOT
> Thanks.
> --
> 12/09/05 17:52:09 INFO mapred.JobClient: Task Id :
> attempt_201209031756_0008_m_000000_0, Status : FAILED
> java.lang.ArrayIndexOutOfBoundsException: 946827879
>        at
>        at org.apache.mahout.classifier.df.DecisionForest.readFields(
>        at
>        at org.apache.mahout.classifier.df.DecisionForest.load(
>        at org.apache.mahout.classifier.df.mapreduce.Classifier$CMapper.setup(
>        at
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(
>        at
>        at org.apache.hadoop.mapred.Child$
>        at Method)
>        at
>        at
>        at org.apache.hadoop.mapred.Child.main(
