hbase-user mailing list archives

From Jan Schellenberger <leipz...@gmail.com>
Subject RE: Slow Get Performance (or how many disk I/O does it take for one non-cached read?)
Date Sat, 01 Feb 2014 06:31:43 GMT
A lot of useful information here...

I disabled bloom filters
I changed to gz compression (compressed files significantly)
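For anyone following along, the two schema changes above can be made from the HBase shell roughly as follows (table name 'mytable' and column family 'cf' are placeholders, not from this thread; a major compaction is needed before existing HFiles are rewritten with the new settings):

```
hbase> disable 'mytable'
hbase> alter 'mytable', {NAME => 'cf', BLOOMFILTER => 'NONE', COMPRESSION => 'GZ'}
hbase> enable 'mytable'
hbase> major_compact 'mytable'
```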

I'm now seeing about *80 gets/sec/server*, which is a pretty good improvement.
Since I estimate that the server is capable of about 300-350 hard disk
operations/second, that's about 4 hard disk operations/get.
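As a sanity check, the back-of-envelope arithmetic above (using the midpoint of my 300-350 ops/sec estimate; these are the numbers from this thread, not fresh measurements):

```python
# Estimate disk operations per Get from observed throughput.
gets_per_sec = 80        # observed per-server Get throughput
disk_ops_per_sec = 325   # midpoint of the 300-350 hard disk ops/sec estimate

disk_ops_per_get = disk_ops_per_sec / gets_per_sec
print(f"~{disk_ops_per_get:.1f} disk operations per Get")
```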

I will experiment with the BLOCKSIZE next.  Unfortunately, upgrading our
system to a newer HBase/Hadoop is tricky for various IT/regulation reasons,
but I'll ask to upgrade.  From what I see, even Cloudera 4.5.0 still comes
with HBase 0.94.6.

I also restarted the regionservers and am now getting
blockCacheHitCachingRatio=51% and blockCacheHitRatio=51%.
So conceivably, each Get could be hitting:
- the root index (cache hit)
- the block index (cache hit)
- on average 2 data blocks (cache misses, most likely, as my total
heap space is 1/7 the compressed dataset)
That would be about 50% cache hits overall, and if each data block access
requires 2 hard drive reads (data + checksum) then that would explain my
throughput. It still seems high, but probably within the realm of reason.
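The cache-hit reasoning above, spelled out (the 4-accesses-per-Get breakdown and the 2-reads-per-miss figure are my assumptions from this thread, not measured values):

```python
# Model: each Get touches the root index, the block index, and ~2 data blocks.
accesses_per_get = 4   # root index + block index + 2 data blocks
cached_accesses = 2    # the two index lookups hit the block cache

hit_ratio = cached_accesses / accesses_per_get        # ~matches the observed 51%
misses_per_get = accesses_per_get - cached_accesses

# Assumption: each data-block miss costs 2 disk reads (data + checksum).
reads_per_miss = 2
hdd_reads_per_get = misses_per_get * reads_per_miss   # consistent with ~4 ops/get
```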

Does HBase always read a full block (the 64k HFile block, not the HDFS
block) at a time, or can it just jump to a particular location within the
file?
View this message in context: http://apache-hbase.679495.n3.nabble.com/Slow-Get-Performance-or-how-many-disk-I-O-does-it-take-for-one-non-cached-read-tp4055545p4055564.html
Sent from the HBase User mailing list archive at Nabble.com.
