mahout-user mailing list archives

From "Sengupta, Sohini IN BLR SISL" <sohini.sengu...@siemens.com>
Subject OutOfMemoryError: Java heap space
Date Mon, 28 Mar 2011 10:05:57 GMT
Hi Folks,

I get the following error when I execute the MeanShift Job (org.apache.mahout.clustering.syntheticcontrol.meanshift.Job) on 3 GB of data on a Hadoop cluster with 7 nodes:

11/03/28 14:31:59 INFO mapred.JobClient: Counters: 8
11/03/28 14:31:59 INFO mapred.JobClient:   Job Counters
11/03/28 14:31:59 INFO mapred.JobClient:     Rack-local map tasks=20
11/03/28 14:31:59 INFO mapred.JobClient:     Launched map tasks=94
11/03/28 14:31:59 INFO mapred.JobClient:     Data-local map tasks=74
11/03/28 14:31:59 INFO mapred.JobClient:   FileSystemCounters
11/03/28 14:31:59 INFO mapred.JobClient:     HDFS_BYTES_READ=5248279611
11/03/28 14:31:59 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=123791732509
11/03/28 14:31:59 INFO mapred.JobClient:   Map-Reduce Framework
11/03/28 14:31:59 INFO mapred.JobClient:     Map input records=58830
11/03/28 14:31:59 INFO mapred.JobClient:     Spilled Records=0
11/03/28 14:31:59 INFO mapred.JobClient:     Map output records=2524062
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
        at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
        at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:134)
        at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:164)
        at org.apache.mahout.clustering.WeightedVectorWritable.readFields(WeightedVectorWritable.java:55)
        at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
        at org.apache.mahout.utils.clustering.ClusterDumper.readPoints(ClusterDumper.java:280)
        at org.apache.mahout.utils.clustering.ClusterDumper.init(ClusterDumper.java:218)
        at org.apache.mahout.utils.clustering.ClusterDumper.<init>(ClusterDumper.java:95)
        at org.apache.mahout.clustering.syntheticcontrol.meanshift.Job.run(Job.java:143)
        at org.apache.mahout.clustering.syntheticcontrol.meanshift.Job.main(Job.java:56)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
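
From the trace, the OOM does not happen inside a map or reduce task: it is thrown in the client-side ClusterDumper (Job.run invokes it after clustering), while it reads the clustered points back from a SequenceFile, and as far as I can tell it keeps every deserialized point on the heap. Below is a minimal sketch of that read pattern as I reconstruct it from the trace (not the actual Mahout source; the IntWritable cluster-id key and the output path are assumptions on my part):

// Sketch only: reproduces the read pattern implied by the stack trace above.
// Assumed key/value types: IntWritable cluster id -> WeightedVectorWritable point.
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.clustering.WeightedVectorWritable;

public class ReadPointsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]); // e.g. output/clusteredPoints/part-m-00000 (assumed layout)
    FileSystem fs = FileSystem.get(path.toUri(), conf);

    List<WeightedVectorWritable> points = new ArrayList<WeightedVectorWritable>();
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
    try {
      IntWritable clusterId = new IntWritable();
      WeightedVectorWritable point = new WeightedVectorWritable();
      while (reader.next(clusterId, point)) {
        // Every record deserializes a sparse vector; accumulating all of
        // them in memory is what exhausts the heap for large outputs.
        points.add(point);
        point = new WeightedVectorWritable(); // fresh instance per record, so list entries do not alias
      }
    } finally {
      reader.close();
    }
    System.out.println("Read " + points.size() + " points");
  }
}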

The Hadoop heap size is configured as HADOOP_HEAPSIZE=2000 in hadoop-env.sh, and mapred-site.xml contains:

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2000m</value>
  </property>
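
One thing I notice from the trace: the exception is in thread "main" under org.apache.hadoop.util.RunJar, i.e. in the client JVM started by "hadoop jar", and as far as I understand mapred.child.java.opts only applies to the task JVMs, not to that driver process. If that is right, the knob that matters here would be the client heap, along these lines in hadoop-env.sh (exact variable handling depends on the Hadoop version, so treat this as an assumption):

# Heap, in MB, for JVMs started by bin/hadoop (including the driver run by "hadoop jar")
export HADOOP_HEAPSIZE=4000

Though since it is already at 2000 and the job wrote roughly 123 GB (HDFS_BYTES_WRITTEN above), no fixed heap may be enough if all clustered points are read back at once.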
Please help; I consistently run into this OutOfMemoryError. What else should I change?

Thanks a lot in advance,
Sohini



