Hi all,
Tajo leverages the feature added by HDFS-3672, which exposes the disk
volume ID of each HDFS data block. I have already found the related code in
DefaultTaskScheduler.assignToLeafTasks. Can anyone explain its logic to
me? What does the scheduler do when the HDFS read is a remote read from
another machine's disk?
Thanks,
Min
--
My research interests are distributed systems, parallel computing, and
bytecode-based virtual machines.
My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com