hadoop-mapreduce-dev mailing list archives

From Jiwei Li <cxm...@gmail.com>
Subject Shuffle phase: fine-grained control of data flow
Date Wed, 07 Nov 2012 06:56:35 GMT
Dear all,

For jobs like Sort, a massive amount of network traffic is generated during
the shuffle phase. The simple mechanism that Hadoop 1.0.4 uses to choose
reduce nodes does nothing to reduce this traffic. If the JobTracker is fully
aware of the location of every map output, why not take advantage of this
topology information when choosing reduce nodes?
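
To make the idea concrete, here is a minimal sketch of the kind of placement
I have in mind. It is not Hadoop code: the class name, the method
chooseReduceNodes, and the input layout (one map per reduce partition, from
node name to bytes of map output stored on that node) are all hypothetical.
It greedily picks, for each partition, the node that already holds the most
data, and it ignores reduce slot capacity and rack topology.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LocalityAwareReducePlacement {

    // Hypothetical helper, not part of Hadoop: for each reduce partition,
    // pick the node that already stores the most map output bytes for it.
    // Assumption: bytesPerPartition.get(p) maps node name -> local bytes
    // of map output destined for partition p.
    public static List<String> chooseReduceNodes(List<Map<String, Long>> bytesPerPartition) {
        List<String> chosen = new ArrayList<>();
        for (Map<String, Long> bytesByNode : bytesPerPartition) {
            String best = null;
            long bestBytes = -1L;
            // Iterating a TreeMap breaks ties deterministically by node name.
            for (Map.Entry<String, Long> e : new TreeMap<String, Long>(bytesByNode).entrySet()) {
                if (e.getValue() > bestBytes) {
                    best = e.getKey();
                    bestBytes = e.getValue();
                }
            }
            chosen.add(best); // stays null if no node holds output for this partition
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Made-up sizes for two partitions spread over three nodes.
        Map<String, Long> p0 = Map.of("nodeA", 700L, "nodeB", 200L, "nodeC", 100L);
        Map<String, Long> p1 = Map.of("nodeA", 50L, "nodeB", 900L, "nodeC", 50L);
        System.out.println(chooseReduceNodes(List.of(p0, p1))); // prints [nodeA, nodeB]
    }
}
```

With this placement, the bytes that stay local for each partition never cross
the network; a real scheduler would also have to respect free reduce slots
and avoid piling every reducer onto one node.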

So, does anyone know where in the codebase such a feature could be developed? Many thanks.

