hbase-user mailing list archives

From Robert James <srobertja...@gmail.com>
Subject Understanding HBase random reads
Date Mon, 04 Jul 2016 13:49:47 GMT
I'd like to understand HBase block reads better.  Assume my HBase
block is 64KB and my HDFS block is 64MB.

I've read that HBase can just do a random read of the 64KB block,
without reading the 64MB HDFS block.  Given that HDFS doesn't support
random reads within a block, how is that possible? Is this only true
if the HDFS block is cached (in memory or on disk, but outside of HDFS)?
Or does HBase somehow short-circuit and go directly to the OS, bypassing
HDFS because it knows HDFS internals?

Depending on the above: Aside from HBase block compression, should I
use HDFS block compression? If HDFS compression prevents HBase from
doing a random read, I most certainly do _not_ want to use it.  But if
HBase can't do a random read from HDFS anyway, then I want to use HDFS block
compression, because you can compress a 64 MB block much better than a
64 KB block.
