hadoop-mapreduce-dev mailing list archives

From "Johannes Zillmann (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1859) maxConcurrentMapTask & maxConcurrentReduceTask per job
Date Fri, 11 Jun 2010 11:11:14 GMT
maxConcurrentMapTask & maxConcurrentReduceTask per job

                 Key: MAPREDUCE-1859
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1859
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
          Components: job submission
    Affects Versions: 0.20.2
            Reporter: Johannes Zillmann

It would be valuable to be able to specify the maximum number of map/reduce slots that a given
job may use. An example is a map-reduce job importing from a database, where you don't want
50 map tasks querying one db at the same time but also don't want to shrink the
overall map task count.
Although this is probably already possible through the Fair/Capacity-Scheduler or a custom extension,
I think it would be a good addition to the default TaskScheduler, since this seems to be more
than a rarely used feature.
It would also help in situations where you don't have control/ownership over the
cluster.
And it is more job-centric, whereas the existing scheduler extensions seem to be more job-type-centric.

Implementing this feature should be relatively straightforward: add something like jobConf.setMaxConcurrentMapTask(int)
and respect this setting in JobQueueTaskScheduler.
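
To make the idea concrete, here is a minimal sketch of the capping rule a scheduler could apply
each time it hands out tasks for a job. It is plain Java and does not use any Hadoop classes:
the class ConcurrentTaskCap, the method tasksToAssign and the UNLIMITED sentinel are hypothetical
names for illustration only, and the maxConcurrent argument stands in for whatever value the
proposed jobConf.setMaxConcurrentMapTask(int) would carry. How JobQueueTaskScheduler would
actually track running tasks per job is not covered here.

```java
// Hypothetical sketch only: none of these names exist in Hadoop.
final class ConcurrentTaskCap {

    /** Assumed sentinel meaning "no per-job limit configured". */
    static final int UNLIMITED = -1;

    private ConcurrentTaskCap() {}

    /**
     * Returns how many more tasks of one type (map or reduce) a job may be given.
     *
     * @param freeSlots     slots currently on offer to the job
     * @param runningTasks  tasks of this job and type already running
     * @param pendingTasks  tasks of this job and type still waiting
     * @param maxConcurrent per-job cap, or UNLIMITED
     */
    static int tasksToAssign(int freeSlots, int runningTasks,
                             int pendingTasks, int maxConcurrent) {
        // Never assign more than there are free slots or waiting tasks.
        int assignable = Math.min(freeSlots, pendingTasks);
        if (maxConcurrent == UNLIMITED) {
            return Math.max(0, assignable);
        }
        // Headroom left under the cap; negative if the job is already over it.
        int headroom = maxConcurrent - runningTasks;
        return Math.max(0, Math.min(assignable, headroom));
    }

    public static void main(String[] args) {
        // 50 pending map tasks, 10 free slots, 3 already running, cap of 5.
        System.out.println(tasksToAssign(10, 3, 50, 5)); // prints 2
    }
}
```

With this rule the total number of map tasks is unchanged; only the number running at once is
limited, which matches the database-import example above.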

Not sure whether this feature would fit well with the existing Fair/Capacity-Schedulers.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
