You may also want to consider Parquet (http://parquet.io), a columnar storage format. It is pretty efficient; see http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/

-- Ankur Chauhan