hadoop-mapreduce-dev mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: modify data distribution in jobconf
Date Mon, 02 Jan 2012 07:26:29 GMT
I'm not sure what you are trying to achieve here.

Hadoop MapReduce works by *trying* to schedule tasks on nodes on which data is 'close', either

We don't try to 'start'/'stop' nodes. If that is what you are trying to do, you need to
look for something else.
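
As a rough illustration of the *trying* above, here is a toy sketch in Python. It is not
Hadoop's actual scheduler code; the names pick_task, pending_tasks and block_locations are
made up for this example. The point is that a node with a free slot is handed a task,
preferably one whose input it already stores; the scheduler never starts or stops the node.

```python
def pick_task(free_node, pending_tasks, block_locations):
    """Pick a pending task for a node that has a free slot.

    free_node       -- name of the node asking for work (hypothetical)
    pending_tasks   -- list of task ids still waiting to run
    block_locations -- dict: task id -> set of nodes holding that task's input
    """
    # First preference: a task whose input data is stored on this node.
    for task in pending_tasks:
        if free_node in block_locations.get(task, ()):
            return task
    # No local data: fall back to any pending task (it reads its input remotely).
    return pending_tasks[0] if pending_tasks else None


locations = {"t1": {"nodeA"}, "t2": {"nodeB"}}
local_choice = pick_task("nodeB", ["t1", "t2"], locations)     # data-local pick
fallback_choice = pick_task("nodeC", ["t1", "t2"], locations)  # no local data
```

Locality here is only a preference: a node with no local input still gets work, which is
why the framework cannot be used to keep particular nodes idle.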


On Dec 31, 2011, at 11:29 PM, mohak gupta wrote:

> hi
> As part of my project I need to modify the data distribution layer in job
> conf so as to achieve the following:
> 1) Control which worker nodes should be started based on the input data
> given to them.
> 2) Keep other worker nodes in some kind of sleep state.
> 3) Based on the output emitted by the worker nodes and the data distributed,
> allow other worker nodes to start.
> 4) Perform this in a looping structure till the output is achieved.
> Basically, I wish to control which worker nodes perform map and reduce
> functions based on the data they have received.
> Could you please help me by suggesting whether this could be achieved and
> what tradeoffs are involved? Any help is really appreciated.
> regards
> Mohak Gupta
