hadoop-mapreduce-dev mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Calculations of the InputSplits
Date Sun, 25 Sep 2011 19:58:12 GMT
Hello Praveen,

That is a valid point. Besides, it can even be a task that computes
the splits (Safer this way, instead of running _inside_ the

Let's continue the discussion on
https://issues.apache.org/jira/browse/MAPREDUCE-207 which seems very
relevant to this.

On Sun, Sep 25, 2011 at 10:12 PM, Praveen Sripati
<praveensripati@gmail.com> wrote:
> Hi,
> There was a query on Stack Overflow regarding high CPU on the client after
> submitting jobs (up to 200 jobs in a batch and a 150MB jar file size).
> Calculation of the InputSplits may be one of the reasons for the high CPU on
> the client. Why should the calculation of the InputSplits happen on the
> client? The JobTracker is a high-end machine; can't the calculation happen on
> the JobTracker?
> http://stackoverflow.com/questions/7546064/hadoop-high-cpu-load-on-client-side-after-committing-jobs
> Thanks,
> Praveen

Harsh J
