mahout-user mailing list archives

From "Philippe Lamarche" <philippe.lamar...@gmail.com>
Subject Re: Memory problems with KMeans
Date Thu, 13 Nov 2008 18:00:02 GMT
In hadoop-env.sh, I am using:

#The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=1900

I removed the GC parameters, tried it again, and got the same error.
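For reference, a minimal sketch of setting an explicit max heap through HADOOP_OPTS, along the lines Sean suggests below — the -Xmx value here is illustrative, not a recommendation:

```shell
# hadoop-env.sh — sketch of an explicit max-heap setting for the JVM;
# 1024m is an example value, adjust to the machine's available RAM
export HADOOP_OPTS="-server -Xmx1024m"
```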



On Thu, Nov 13, 2008 at 12:29 PM, Sean Owen <srowen@gmail.com> wrote:

> You're not setting -Xmx1024m, for example, to increase the max heap
> size (unless somehow these other options imply that.) Try that?
>
> In general I'd say don't bother messing too much with the GC
> parameters unless you're sure they're necessary. Especially in Java 6.
> For instance I don't think you want 4 GC threads?
>
> (I also throw on -da and -dsa to disable assertions when I care about
> speed.)
>
> On Thu, Nov 13, 2008 at 4:53 PM, Philippe Lamarche
> <philippe.lamarche@gmail.com> wrote:
> > Hi,
> >
> > I am using KMeans to do some text clustering and I get into memory
> problems.
> > As of now, I only tried it on a laptop in pseudo distributed master/slave
> > mode.
> >
> > This is on Hadoop branch-0.19. The "texttovector.jar" contains a hacked
> > version of the syntheticcontrol KMeans example, the only difference is in
> > the first input phase.
> >
> > Is this memory error "normal"? I am running with export
> HADOOP_OPTS="-server
> > -XX:+UseParallelGC -XX:ParallelGCThreads=4 -XX:NewSize=1G
> -XX:MaxNewSize=1G
> > -XX:-UseGCOverheadLimit"
> >
> > In my understanding, the "-XX:-UseGCOverheadLimit" should remove the
> > GCOverhead "feature".
> >
> > Any ideas?
> >
>
