mahout-user mailing list archives

From Xavier Rampino <xramp...@senscritique.com>
Subject Re: spark-rowsimilarity java.lang.OutOfMemoryError: Java heap space
Date Mon, 18 May 2015 13:10:16 GMT
I just did that, but I ran into the same problem. I feel like -sem doesn't
work with my setup. For instance, I have:

15/05/18 13:44:39 INFO BlockManagerInfo: Removed broadcast_13_piece0 on
localhost:60596 in memory (size: 2.7 KB, free: *1761.1 MB*)

(Maybe it's not related though)
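
One rough way to check whether the requested 24g is taking effect is to compare the "free" figure in the BlockManagerInfo line above with the amount passed via -sem. The following is a minimal sketch, assuming the log format shown above; the helper name free_storage_mb is hypothetical and is not part of Mahout or Spark. The storage pool is only a fraction of the JVM heap, so the two numbers are not directly comparable, but a very small ratio would suggest the setting is not reaching the JVM that does the work.

```python
import re
from typing import Optional

# Hypothetical helper for illustration only; not a Mahout or Spark API.
# Matches e.g. "free: 1761.1 MB" and tolerates the *...* emphasis added in the mail.
_FREE_RE = re.compile(r"free:\s*\*?([\d.]+)\s*(KB|MB|GB)\*?")
_TO_MB = {"KB": 1 / 1024, "MB": 1.0, "GB": 1024.0}


def free_storage_mb(log_line: str) -> Optional[float]:
    """Return the 'free' storage figure of a BlockManagerInfo line in MB, or None."""
    match = _FREE_RE.search(log_line)
    if match is None:
        return None
    value, unit = match.groups()
    return float(value) * _TO_MB[unit]


# The log line quoted above, joined onto one line.
LOG_LINE = (
    "15/05/18 13:44:39 INFO BlockManagerInfo: Removed broadcast_13_piece0 on "
    "localhost:60596 in memory (size: 2.7 KB, free: *1761.1 MB*)"
)
REQUESTED_MB = 24 * 1024  # the 24g passed via -sem

free_mb = free_storage_mb(LOG_LINE)
# Print the reported free storage and its ratio to the requested memory.
print(free_mb, free_mb / REQUESTED_MB)
```

Here the reported free storage is well under a tenth of the requested 24g, which is consistent with the suspicion that -sem is not being applied in this setup, though it does not prove it.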

On Wed, May 13, 2015 at 7:27 PM, Pat Ferrel <pat@occamsmachete.com> wrote:

> There is a bug in mahout 0.10.0 that you can fix if you are able to build
> from source. Get the source tar for 0.10.0, not the current master.
>
> Go to
> https://github.com/apache/mahout/blob/mahout-0.10.x/spark/src/main/scala/org/apache/mahout/drivers/TextDelimitedReaderWriter.scala#L157
>
> Remove the line that says: interactions.collect()
>
> See this Jira https://issues.apache.org/jira/browse/MAHOUT-1707
>
> There is one other thing that can cause this, and it is fixed by increasing
> your client JVM heap space, but try the above first.
>
> BTW, setting the executor memory twice is not necessary.
>
>
> On May 13, 2015, at 2:21 AM, Xavier Rampino <xrampino@senscritique.com>
> wrote:
>
> Hello,
>
> I've tried spark-rowsimilarity with an out-of-the-box setup (downloaded the
> mahout distribution and spark, and set up the PATH), and I ran into a Java
> heap space error. My input file is ~100MB. None of the parameters I tried
> seems to change this. I run:
>
> ~/mahout-distribution-0.10.0/bin/mahout spark-rowsimilarity --input
> ~/query_result.tsv --output ~/work/result -sem 24g
> -D:spark.executor.memory=24g
>
> Do I just need to give it more memory, or is there another step I can take
> to solve this?
>
>
