mahout-user mailing list archives

From Vinod Singh <vi...@vinodsingh.com>
Subject Re: Java Heap Error: ItemSimilarityJob
Date Wed, 06 Jun 2012 10:23:23 GMT
The child heap size can also be increased by passing command-line options.
For example:

-Dmapred.map.child.java.opts=-Xmx6100m
-Dmapred.reduce.child.java.opts=-Xmx6100m
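
As a rough sketch of where these options go in a full invocation, the snippet below assembles an ItemSimilarityJob command line and prints it as a dry run instead of executing it. The jar name, the input and output paths, and the similarity class are placeholders and not values from this thread; the 4g heap and the 10 reducers are only the example figures from the quoted reply. Whether the separate map/reduce heap properties are honored depends on the Hadoop version; on older releases the single mapred.child.java.opts property from the quoted reply is the one to set.

```shell
# Dry run: build the command and print it so it can be checked before running.
# MAHOUT_JOB_JAR, INPUT and OUTPUT are hypothetical placeholders.
MAHOUT_JOB_JAR="mahout-core-job.jar"
INPUT="/path/to/preferences"
OUTPUT="/path/to/similarities"
HEAP="-Xmx4g"    # example value; size it to what the worker nodes can spare
REDUCERS=10      # example value; choose what suits the cluster

# Generic -D options go right after the class name, before the job's own arguments.
CMD="hadoop jar $MAHOUT_JOB_JAR org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob \
-Dmapred.map.child.java.opts=$HEAP \
-Dmapred.reduce.child.java.opts=$HEAP \
-Dmapred.reduce.tasks=$REDUCERS \
--input $INPUT --output $OUTPUT --similarityClassname SIMILARITY_COOCCURRENCE"

echo "$CMD"
```

Once the printed line looks right, it can be pasted into a shell on a machine with the Hadoop client installed.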

Thanks,
Vinod

http://blog.vinodsingh.com/

On Wed, Jun 6, 2012 at 3:20 PM, Sean Owen <srowen@gmail.com> wrote:

> You need to increase the size of the children's heap.
> mapred.child.java.opts can be set to -Xmx4g for example. This is
> usually put in mapred-site.xml.
>
> Sampling does decrease the size of the intermediate outputs; probably
> not the final output so much. But this is not your problem. You are
> running out of heap on the workers.
>
> You should definitely use more than one reducer! Hadoop leaves it up to
> you to specify this; use -Dmapred.reduce.tasks=10 or whatever is
> appropriate.
>
> The names of the jobs more or less say what they do, and the javadoc
> says a little more. If you have specific questions I bet people can
> explain here.
>
> Sean
>
>
> On Wed, Jun 6, 2012 at 7:39 AM, Something Something
> <mailinglists19@gmail.com> wrote:
> > Hello,
> >
> > I am running this job with a file containing 791,732,411 lines.
> >
> > Step 1 (PreparePreferenceMatrixJob-ItemIDIndexMapper-Reducer) completed
> > in 3 minutes.
> >
> > Step 2 (PreparePreferenceMatrixJob-ToItemPrefsMapper-Reducer) took 2
> > hours but completed successfully. It used only 1 Reducer so I am
> > assuming the output is sorted, right?
> >
> > Step 3 (PreparePreferenceMatrixJob-ToItemVectorsMapper-Reducer) failed
> > after running for 54 minutes with 'Error: Java heap space' error & it
> > was all downhill from there.
> >
> >
> > Question: Are there any configuration parameters I can use to cut down
> > the size of the output? I noticed this in ToItemVectorsMapper:
> >
> > public static final String SAMPLE_SIZE = ToItemVectorsMapper.class +
> > ".sampleSize";
> >
> > How do I cut down this sample size?
> >
> > Also, is there any documentation available that shows what each of these
> > steps does?  If not, I will just debug.  Please let me know.  Thanks.
>
