mahout-user mailing list archives

From "Joshi, Amit Krishna" <joshi...@wright.edu>
Subject RE: Increase timeout for running PFPGrowth
Date Wed, 24 Oct 2012 02:52:31 GMT
Thanks. I am still looking into how to increase the timeout for FPGrowth.
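A quick way to see why the space matters (as Matt points out further down the thread): with a space after -D, the shell hands the driver two separate tokens, so the property name can get detached from the -D flag. A minimal illustration in Python, using shlex to tokenize the way a shell would (the mahout command line shown is only an example):

```python
import shlex

# With a space after -D, the property arrives as a separate token,
# so a driver that expects "-Dkey=value" may not pick it up.
spaced = shlex.split("mahout fpg -D mapred.task.timeout=18000000")
print(spaced[2:4])  # ['-D', 'mapred.task.timeout=18000000'] -- two tokens

# Without the space, flag and property travel as a single token.
joined = shlex.split("mahout fpg -Dmapred.task.timeout=18000000")
print(joined[2])    # '-Dmapred.task.timeout=18000000' -- one token
```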

________________________________________
From: 戴清灏 [rogerdai16@gmail.com]
Sent: Monday, October 22, 2012 10:53 PM
To: user@mahout.apache.org
Subject: Re: Increase timeout for running PFPGrowth

-g sets the number of groups used when running FP-Growth. It equals the
number of reduce tasks, so I suggest using the same number as the reducers
in your cluster.

-k sets the size of the cache that will be kept, so it can be larger if you
have a lot of memory on a single node.
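To make the role of -g concrete: PFP shards the frequent-feature list into g groups, and each reduce task mines the FP-trees for one group. The sketch below is illustrative only; the helper name is hypothetical, and it assumes a simple contiguous-block assignment rather than reproducing Mahout's internal grouping code.

```python
# Illustrative only: shard feature ids into g groups the way PFP
# distributes mining work across reducers. This helper is hypothetical,
# not Mahout's actual implementation.
def assign_group(feature_id, num_features, num_groups):
    # contiguous blocks of roughly num_features / num_groups ids per group
    max_per_group = (num_features + num_groups - 1) // num_groups
    return feature_id // max_per_group

# 1,000,000 features over g = 1000 groups -> blocks of 1000 ids each
groups = [assign_group(f, 1_000_000, 1000) for f in (0, 999, 1000, 999_999)]
print(groups)  # [0, 0, 1, 999]
```

With one group per reducer, each reduce task receives one shard of the feature space, which matches the advice above to set -g to the number of reducers in the cluster.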

On Tuesday, October 23, 2012, Matt Molek wrote:

> Did you include those spaces in "-D mapred.task.timeout=18000000"? That
> won't be parsed correctly. It should be:
> "-Dmapred.task.timeout=18000000"
>
> On Mon, Oct 22, 2012 at 1:08 PM, Amit Krishna Joshi <joshi.35@wright.edu>
> wrote:
> > Hi,
> >
> > I am running PFP on several datasets, and it works well for the smaller
> > ones (< 5GB).
> > However, for the larger ones, I keep getting the following timeout message.
> >
> > Task attempt_201210140938_0105_r_000000_0 failed to report status for 600
> > seconds. Killing!
> >
> > Is there a way I can increase the timeout?
> >
> > I even tried passing these parameters, but in vain:
> > -D mapred.task.timeout=18000000 -D mapred.child.java.opts=-Xmx4000m
> >
> > My input params are:  -s 10000 -g 1000  -tc 8  -k 50 -method mapreduce
> >
> > Also, please suggest what the optimum values of g and k would be.
> > Number of features: > 1 million
> >
> >
> > Thanks,
> > Amit
>


--
Regards,
Q