mahout-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: spark-rowsimilarity java.lang.OutOfMemoryError: Java heap space
Date Wed, 13 May 2015 17:27:47 GMT
There is a bug in Mahout 0.10.0 that you can fix if you are able to build from source. Get
the source tar for 0.10.0, not the current master.

Go to https://github.com/apache/mahout/blob/mahout-0.10.x/spark/src/main/scala/org/apache/mahout/drivers/TextDelimitedReaderWriter.scala#L157

Remove the line that says: interactions.collect()

See this Jira issue: https://issues.apache.org/jira/browse/MAHOUT-1707
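
The source edit described above can be sketched as a small shell helper. This is only an illustration: the function name remove_collect_line and the unpacked directory name in the usage comment are hypothetical, and the pattern assumes the line contains nothing but interactions.collect() plus whitespace, so check the match it prints before rebuilding.

```shell
# Hypothetical helper: delete lines that consist only of `interactions.collect()`.
remove_collect_line() {
  file="$1"
  pattern='^[[:space:]]*interactions\.collect()[[:space:]]*$'

  # Print the matching line(s) with line numbers; stop if nothing matches.
  grep -n "$pattern" "$file" || { echo "no match in $file" >&2; return 1; }

  # Delete the matching line(s), keeping the original as "$file.bak".
  sed -i.bak "/$pattern/d" "$file"
}

# Usage, assuming the 0.10.0 source was unpacked into ./mahout-distribution-0.10.0
# (directory name is an assumption; the file path is the one in the link above):
#   cd mahout-distribution-0.10.0
#   remove_collect_line spark/src/main/scala/org/apache/mahout/drivers/TextDelimitedReaderWriter.scala
#   mvn -DskipTests clean install   # rebuild with Maven; options are a suggestion
```

Keeping the .bak copy makes it easy to diff the change against the original file before building.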

One other thing can cause this error; it is fixed by increasing your client JVM heap
space, but try the above first.

BTW, setting the executor memory twice is not necessary: -sem and -D:spark.executor.memory
both set it, so one of them is enough.
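
If the source fix is not enough, a rerun with a larger client heap might look like the sketch below. It assumes the bin/mahout launcher in your distribution reads the MAHOUT_HEAPSIZE environment variable (in MB) to size the client JVM; check the comments at the top of bin/mahout to confirm. The value 8000 is only an example.

```shell
# Assumption: bin/mahout uses MAHOUT_HEAPSIZE (in MB) for the client JVM's max heap.
export MAHOUT_HEAPSIZE=8000

# Then rerun the job, setting the executor memory only once:
#   ~/mahout-distribution-0.10.0/bin/mahout spark-rowsimilarity \
#     --input ~/query_result.tsv --output ~/work/result -sem 24g
```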


On May 13, 2015, at 2:21 AM, Xavier Rampino <xrampino@senscritique.com> wrote:

Hello,

I've tried spark-rowsimilarity with an out-of-the-box setup (downloaded the Mahout
distribution and Spark, and set up the PATH), and I get a Java heap space
error. My input file is ~100MB. None of the parameters I have tried seems to
change this. I run:

~/mahout-distribution-0.10.0/bin/mahout spark-rowsimilarity \
  --input ~/query_result.tsv --output ~/work/result -sem 24g \
  -D:spark.executor.memory=24g

Do I just need to allocate more memory, or is there another step I can take to
solve this?

