mahout-user mailing list archives

From Jia Rao <>
Subject Problem running twenty newsgroup example in a hadoop cluster
Date Tue, 25 Jan 2011 02:54:45 GMT
Hi all,

I am having a problem running the 20 newsgroups example on a Hadoop cluster.
trainclassifier worked fine, but testclassifier fails with an "out of memory
java heap" error.

The following is the configuration of the hadoop cluster.

Physical machines: 4 nodes, each with 6GB memory.

Hadoop: 0.20.2, HADOOP_HEAP_SIZE=3200 in, in mapred-site.xml.

Mahout: tried release 0.4 and the latest source; both show the same problem.

Command line arguments used:

$MAHOUT_HOME/bin/mahout testclassifier \
  -m newsmodel \
  -d 20news-input \
  -type bayes \
  -ng 3 \
  -source hdfs \
  -method mapreduce

Any ideas?
Thanks!
