hadoop-mapreduce-dev mailing list archives

From "Jerry Chen (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations
Date Fri, 25 Jan 2013 07:27:13 GMT
Jerry Chen created MAPREDUCE-4961:
-------------------------------------

             Summary: Map reduce running local should also go through ShuffleConsumerPlugin
for enabling different MergeManager implementations
                 Key: MAPREDUCE-4961
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
    Affects Versions: trunk
            Reporter: Jerry Chen


MAPREDUCE-4049 provides the ability to plug in a Shuffle, and MAPREDUCE-4080 extends Shuffle
so that it can provide different MergeManager implementations.

While using these pluggable features, I found that when a MapReduce job runs locally, a
RawKeyValueIterator is returned directly from a static call to Merger.merge. This breaks the
assumption that the Shuffle may provide a different merge method, even though there is no
copy phase in this situation.

The use case: I am implementing a hash-based MergeManager, which does not need a sort on the
map side. When the MapReduce job runs locally, however, the hash-based MergeManager never
gets a chance to be used, because the code goes directly to Merger.merge. This leaves the
pluggable Shuffle and MergeManager incomplete.

So we need to move the code that calls Merger.merge from ReduceTask into the
ShuffleConsumerPlugin implementation, so that the Shuffle implementation can decide how to do
the merge and return the corresponding iterator.
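
To illustrate the proposed shape, here is a minimal, self-contained sketch. It does not use
Hadoop's real classes: ShuffleSketch, SortShuffle, HashShuffle, runReduce and sampleOutputs
are hypothetical names, and plain Map.Entry iterators stand in for RawKeyValueIterator. The
only point is that the reduce side always asks the plugin for its iterator, with no separate
local-mode branch that merges directly.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalShuffleSketch {

    /** Hypothetical stand-in for a shuffle plugin; not Hadoop's real interface. */
    interface ShuffleSketch {
        Iterator<Map.Entry<String, Integer>> run(
                List<List<Map.Entry<String, Integer>>> mapOutputs, boolean local);
    }

    /** Sort-based merge: returns all records in key order. */
    static class SortShuffle implements ShuffleSketch {
        @Override
        public Iterator<Map.Entry<String, Integer>> run(
                List<List<Map.Entry<String, Integer>>> mapOutputs, boolean local) {
            // A real plugin would copy remote map outputs when !local; omitted here.
            List<Map.Entry<String, Integer>> all = new ArrayList<>();
            for (List<Map.Entry<String, Integer>> segment : mapOutputs) {
                all.addAll(segment);
            }
            all.sort(Map.Entry.comparingByKey()); // stable sort by key
            return all.iterator();
        }
    }

    /** Hash-based merge: groups records by key without sorting the keys. */
    static class HashShuffle implements ShuffleSketch {
        @Override
        public Iterator<Map.Entry<String, Integer>> run(
                List<List<Map.Entry<String, Integer>>> mapOutputs, boolean local) {
            // LinkedHashMap keeps keys in first-seen order, so no sort is involved.
            Map<String, List<Integer>> groups = new LinkedHashMap<>();
            for (List<Map.Entry<String, Integer>> segment : mapOutputs) {
                for (Map.Entry<String, Integer> e : segment) {
                    groups.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                          .add(e.getValue());
                }
            }
            List<Map.Entry<String, Integer>> grouped = new ArrayList<>();
            for (Map.Entry<String, List<Integer>> g : groups.entrySet()) {
                for (Integer v : g.getValue()) {
                    grouped.add(new SimpleImmutableEntry<>(g.getKey(), v));
                }
            }
            return grouped.iterator();
        }
    }

    /** Reduce side: always goes through the plugin, whether local or not. */
    static List<String> runReduce(ShuffleSketch shuffle,
            List<List<Map.Entry<String, Integer>>> mapOutputs, boolean local) {
        Iterator<Map.Entry<String, Integer>> it = shuffle.run(mapOutputs, local);
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            Map.Entry<String, Integer> e = it.next();
            seen.add(e.getKey() + "=" + e.getValue());
        }
        return seen;
    }

    /** Two unsorted map output segments, as a hash-based map side might produce. */
    static List<List<Map.Entry<String, Integer>>> sampleOutputs() {
        List<Map.Entry<String, Integer>> first = new ArrayList<>();
        first.add(new SimpleImmutableEntry<>("b", 1));
        first.add(new SimpleImmutableEntry<>("a", 2));
        List<Map.Entry<String, Integer>> second = new ArrayList<>();
        second.add(new SimpleImmutableEntry<>("a", 3));
        second.add(new SimpleImmutableEntry<>("c", 4));
        List<List<Map.Entry<String, Integer>>> outputs = new ArrayList<>();
        outputs.add(first);
        outputs.add(second);
        return outputs;
    }

    public static void main(String[] args) {
        System.out.println(runReduce(new SortShuffle(), sampleOutputs(), true));
        System.out.println(runReduce(new HashShuffle(), sampleOutputs(), true));
    }
}
```

With the sample input, the sort-based variant yields records in key order, while the
hash-based variant yields them grouped by key in first-seen order. In both cases the reduce
side is unchanged, which is the property this issue asks for in local mode.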




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
