[Gridmix] Improve the way job monitor maintains running jobs
------------------------------------------------------------
Key: MAPREDUCE-3769
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3769
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: contrib/gridmix
Affects Versions: 0.24.0
Reporter: Amar Kamat
Fix For: 0.23.1, 0.24.0
Gridmix maintains a list (L) of running jobs via {{JobMonitor}}. As soon as a job is submitted,
a handle for that job is cached inside the {{JobMonitor}}. The {{JobMonitor}} does the following
in a thread:
{code}
1. remove the first job in the list, say j
2. if j is complete :
     goto #1.
   else :
     add j to the end of the list L.
     sleep for 5 seconds.
     goto #1.
{code}
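The steps above can be sketched in Java roughly as follows. This is only an illustration of the described loop, not the actual {{JobMonitor}} source: {{JobHandle}}, {{RoundRobinMonitor}}, {{pollOnce}} and {{staleCount}} are hypothetical names, the sleep is left to the caller so the logic can be exercised without waiting, and the empty-list case (which the description does not cover) is assumed to simply return.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a submitted job's handle; not the Gridmix type. */
interface JobHandle {
    boolean isComplete();
}

/** Sketch of the polling loop described above, with the sleep factored out. */
class RoundRobinMonitor {
    private final Deque<JobHandle> running = new ArrayDeque<>();

    void add(JobHandle job) {
        running.addLast(job);
    }

    int size() {
        return running.size();
    }

    /**
     * One iteration of the loop: drops completed jobs from the head of the
     * list until it meets an incomplete job, which is re-queued at the tail.
     * Returns true if the caller should now sleep (5 seconds in the
     * description), false if the list ran empty (assumed behaviour).
     */
    boolean pollOnce() {
        while (!running.isEmpty()) {
            JobHandle j = running.removeFirst();   // step 1
            if (!j.isComplete()) {                 // step 2, "else" branch
                running.addLast(j);
                return true;
            }
            // j is complete: it is dropped and we go back to step 1
        }
        return false;
    }

    /** Counts completed jobs that are still sitting in the list. */
    int staleCount() {
        int stale = 0;
        for (JobHandle j : running) {
            if (j.isComplete()) {
                stale++;
            }
        }
        return stale;
    }
}
```

In this sketch, a finished job queued behind two unfinished jobs is dropped only on the third call to {{pollOnce}}, i.e. after two sleep intervals, which is how completed jobs can accumulate in L.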
Gridmix STRESS mode logic uses the list L to compute the cluster load. It iterates over the map/reduce
progress of every job in L to work out the pending+running task count. We need
to investigate and optimize the {{JobMonitor}} algorithm so that the number
of completed jobs remaining in L is kept to a minimum. Polling the map and reduce task progress
of a completed job is expensive because it incurs an additional RPC to the
JobHistory server.
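One possible direction, sketched below in Java, is to sweep the whole list on every cycle instead of stopping at the first unfinished job, so that a completed job never survives a sweep. This is only an assumption about what an optimized monitor might look like, not the change made for this issue; {{PolledJob}}, {{SweepingMonitor}}, {{sweep}} and {{snapshot}} are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical job handle for this sketch; not the Gridmix type. */
interface PolledJob {
    boolean isComplete();
}

/** Sketch: check every job each cycle and drop all completed ones at once. */
class SweepingMonitor {
    private final List<PolledJob> running = new ArrayList<>();

    synchronized void add(PolledJob job) {
        running.add(job);
    }

    /** Removes every completed job in one pass; returns how many were removed. */
    synchronized int sweep() {
        int removed = 0;
        for (Iterator<PolledJob> it = running.iterator(); it.hasNext();) {
            if (it.next().isComplete()) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    /** Copy of the list for a load calculator; holds only jobs that were incomplete at the last sweep. */
    synchronized List<PolledJob> snapshot() {
        return new ArrayList<>(running);
    }
}
```

The trade-off in this sketch is one completion check per job per cycle, in exchange for the load calculation seeing at most the jobs that finished since the previous sweep.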