Can't do variable block sizes in vanilla Hadoop; that's part of the NameNode legacy.

On Tue, Aug 28, 2012 at 2:56 AM, Min Zhou wrote:
> 1. If it's one data file for each column, data locality is difficult to
> guarantee when rebuilding a row from column files, unless
> GFS can keep all fields from the same row in files on the
> same node. Moreover, the data block can't be a fixed
> size like 1MB/64MB/128MB, cuz
>
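For context on the fixed-block-size point: in stock HDFS the block size is a per-file setting frozen at file-creation time, and every block of that file except the last one is exactly that size. A minimal sketch of the relevant setting, assuming a standard hdfs-site.xml (the value shown is just the common 128 MB default, not anything mandated by the thread):

```xml
<!-- hdfs-site.xml: dfs.blocksize sets the default block size for newly
     created files. It can be overridden per file at create time, but all
     blocks within a single file (except the last) share one fixed size;
     vanilla HDFS has no way to vary block sizes inside one file. -->
<property>
  <name>dfs.blocksize</name>
  <value>134217728</value> <!-- 128 MB -->
</property>
```

This is why a columnar layout that wants per-column or per-row-group block boundaries has to work around, rather than with, the HDFS block model.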