mahout-user mailing list archives

From: Andy Schlaikjer
Subject: Re: LDA/CVB Performance
Date: Thu, 13 Jun 2013 16:43:03 GMT

Hi Alan,

On Thu, Jun 13, 2013 at 8:54 AM, Alan Gardner wrote:

> The weirdest behaviour I'm seeing is that the multithreaded training Map
> task only utilizes one core on an eight core node. I'm not sure if this is
> configurable in the JVM parameters or the job config. In the meantime I've
> set the input split very small, so that I can run 8 parallel 1-thread
> training mappers per node. Should I be configuring this differently?

At my office it's generally frowned upon to run MR tasks that try to use
many cores on a multicore node, because our cluster configuration forces
the number of map and reduce slots per node to sum to the number of cores.
If several multi-threaded task attempts run on the same node, CPU load may
spike and hurt the performance of every task attempt on that node.
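
To put rough numbers on that, here is a minimal sketch comparing runnable
threads per node against cores. The function name and the slot and thread
counts are illustrative assumptions (an 8-core node with 8 map slots, as in
your setup), not values read from any real cluster config.

```python
def oversubscription(cores, concurrent_tasks, threads_per_task):
    """Ratio of runnable threads to cores on one node (1.0 = one thread per core)."""
    return (concurrent_tasks * threads_per_task) / cores

# 8 single-threaded mappers on an 8-core node: one thread per core.
single = oversubscription(cores=8, concurrent_tasks=8, threads_per_task=1)

# 8 mappers that each start 8 training threads: 64 threads compete for 8 cores.
multi = oversubscription(cores=8, concurrent_tasks=8, threads_per_task=8)

print(single, multi)
```

A ratio well above 1.0 is the situation described above, where task attempts
on the same node start slowing each other down.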

> I also wanted to check in and verify that the performance I'm seeing is
> typical:
> - on a six-node cluster (48 map slots, 8 cores per node) running full tilt,
> each iteration takes about 7 hours. I assume the problem is just that our
> cluster is far too small, and that the performance will scale if I make the
> splits even smaller and distribute the job across more nodes.

How many input splits are generated for your input doc-term matrix? In each
task attempt, how many rows are processed? Make sure input is balanced
across all map tasks.
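
One rough way to check balance is to compare the busiest task against the
average. The sketch below assumes you have collected a row count per map
task yourself (for example from each task's "Map input records" counter);
the helper name and the sample counts are made up for illustration.

```python
def split_skew(rows_per_task):
    """Largest per-task row count divided by the mean; 1.0 means perfectly balanced."""
    mean = sum(rows_per_task) / len(rows_per_task)
    return max(rows_per_task) / mean

# Hypothetical row counts for four map tasks.
balanced = split_skew([100, 100, 100, 100])
skewed = split_skew([100, 100, 100, 500])

print(balanced, skewed)
```

Since an iteration cannot finish before its slowest map task, a skew well
above 1.0 suggests the splits are worth rebalancing. Row counts ignore how
many non-zero terms each row has, so treat this as a first approximation.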

> - with an 8GB heap size I can't exceed about 200 topics before running out
> of heap space. I tried making the Map input smaller, but that didn't seem
> to help. Can someone describe how memory usage scales per mapper in terms
> of topics, documents and terms?

The tasks need memory proportional to num topics × num terms, so for a
fixed vocabulary the heap needed grows linearly with the number of topics.
Do you have a full 8 GB heap for each task slot?
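
As a back-of-the-envelope sketch of that scaling, assume the model is held
as a dense topics-by-terms matrix of 8-byte doubles. The function names, the
`copies` parameter (in case a task holds more than one copy of the model)
and the 1,000,000-term vocabulary are assumptions for illustration only;
JVM object overhead and the input data itself are ignored.

```python
def dense_model_bytes(num_topics, num_terms, bytes_per_value=8, copies=1):
    """Approximate bytes for `copies` dense topics-by-terms matrices."""
    return num_topics * num_terms * bytes_per_value * copies

def max_topics(heap_bytes, num_terms, bytes_per_value=8, copies=1):
    """Largest topic count whose dense model would fit in `heap_bytes`."""
    return heap_bytes // (num_terms * bytes_per_value * copies)

GIB = 1024 ** 3
vocab = 1_000_000  # hypothetical vocabulary size, not taken from this thread

one_copy = dense_model_bytes(200, vocab)
two_copies = dense_model_bytes(200, vocab, copies=2)

print(one_copy / GIB, two_copies / GIB)
print(max_topics(8 * GIB, vocab), max_topics(8 * GIB, vocab, copies=2))
```

Plugging in your real vocabulary size, and the heap actually available to
each task slot rather than to the whole node, should show whether a ceiling
of about 200 topics is in the expected range.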


Twitter, Inc.
