mahout-user mailing list archives

From paritosh ranjan <paritoshranj...@gmail.com>
Subject Re: Heap Space Problem while running in cluster in map reduce
Date Thu, 04 Oct 2012 09:52:26 GMT
How many initial clusters are you providing to KMeans?
Try reducing the number of initial clusters and find the breaking point. A good
way to choose the initial clusters is Canopy Clustering:
https://cwiki.apache.org/MAHOUT/canopy-clustering.html
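
Something along these lines should work (a rough sketch only: the option names
follow the Mahout 0.7-era CLI, and the paths, distance measure and t1/t2
thresholds below are placeholders to replace for your own data; the canopy
output directory name can also differ slightly between versions):

    # 1. Let Canopy estimate the cluster centers; t1/t2 are distance thresholds
    bin/mahout canopy \
      -i /path/to/input-vectors \
      -o /path/to/canopy-output \
      -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
      -t1 500 -t2 250 -ow

    # 2. Seed KMeans with the canopy centroids instead of k random clusters
    bin/mahout kmeans \
      -i /path/to/input-vectors \
      -c /path/to/canopy-output/clusters-0-final \
      -o /path/to/kmeans-output \
      -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
      -cd 0.5 -x 10 -ow -cl

This way the number of initial clusters is driven by the data rather than a
guess, which usually also reduces the amount of cluster state the KMeans
combiner has to hold in memory.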

Have you also checked whether the nodes of the cluster are actually using their
RAM (16 GB on the master, 8 GB on the slaves)? If not, the Hadoop cluster
configuration needs to be tuned so that the task JVMs can use most of the
available RAM.
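
For example, under Hadoop 1.x the heap available to each map/reduce task JVM
comes from mapred.child.java.opts, which defaults to only -Xmx200m. A sketch
like the one below would give the combiner more room; the value is a
placeholder, and it assumes the Mahout driver accepts Hadoop generic -D
options, otherwise set the property in mapred-site.xml instead:

    # Raise the per-task JVM heap for this job (Hadoop 1.x property name).
    # Keep (map slots + reduce slots per node) x (heap per task) below the
    # node's physical RAM; the slot counts themselves
    # (mapred.tasktracker.{map,reduce}.tasks.maximum) live in mapred-site.xml.
    bin/mahout kmeans \
      -Dmapred.child.java.opts=-Xmx2048m \
      -i /path/to/input-vectors \
      -c /path/to/canopy-output/clusters-0-final \
      -o /path/to/kmeans-output \
      -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
      -cd 0.5 -x 10 -ow -cl

With 8 GB nodes and 4 cores you can go well beyond the 200 MB default per task,
as long as slots times heap still fits in the node's RAM.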

On Thu, Oct 4, 2012 at 12:44 AM, syed kather <in.abdul@gmail.com> wrote:

> Team,
>   When I am trying to run KMeans clustering, I found that it throws a
>   Java heap space error:
>          at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
>          at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
>          at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:139)
>          at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:118)
>          at org.apache.mahout.clustering.ClusterObservations.readFields(ClusterObservations.java:59)
>          at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>          at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>          at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
>          at org.apache.hadoop.mapreduce.ReduceContext$ValueIterator.next(ReduceContext.java:163)
>          at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:31)
>          at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:25)
>          at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>          at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1502)
>          at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2768)
>          at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2706)
>
> Can I know what may be the reason?
>
>     I have a 5-node cluster:
>     Master      4 cores with 16 GB RAM
>     slave1      4 cores with  8 GB RAM
>     slave2      4 cores with  8 GB RAM
>     slave3      4 cores with  8 GB RAM
>     slave4      4 cores with  8 GB RAM
>
> Let me know if any optimization is required for this.
>
> Thanks in advance.
>             Thanks and Regards,
>         S SYED ABDUL KATHER
>
