hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Performance test results
Date Tue, 03 May 2011 20:29:30 GMT
On Tue, May 3, 2011 at 6:20 AM, Eran Kutner <eran@gigya.com> wrote:
> Flushing, at least when I try it now, long after I stopped writing, doesn't
> seem to have any effect.

Bummer.

>
> In my log I see this:
> 2011-05-03 08:57:55,384 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=3.39 GB,
> free=897.87 MB, max=4.27 GB, blocks=54637, accesses=89411811, hits=75769916,
> hitRatio=84.74%%, cachingAccesses=83656318, cachingHits=75714473,
> cachingHitsRatio=90.50%%, evictions=1135, evicted=7887205,
> evictedPerRun=6949.0791015625
>
> and every 30 seconds or so something like this:
> 2011-05-03 08:58:07,900 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> started; Attempting to free 436.92 MB of total=3.63 GB
> 2011-05-03 08:58:07,947 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> completed; freed=436.95 MB, total=3.2 GB, single=931.65 MB, multi=2.68 GB,
> memory=3.69 KB
>
> Now, if the entire working set I'm reading is 100MB in size, why would it
> have to evict 436MB just to get it filled back in 30 seconds?

I was about to ask the same question... From what I can tell from this
log, it seems that your working dataset is much larger than 3GB (the
fact that it's evicting means it could be a lot more), and that's only
on that region server.

The first reason that comes to mind for why it would be so much bigger
is that you uploaded your dataset more than once, and since HBase keeps
versions of the data, it could accumulate. That doesn't explain how it
would grow into GBs though, since by default a family only keeps 3
versions... unless you set that higher than the default, or you
uploaded the same data tens of times within 24 hours and the major
compactions didn't kick in.
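
If you want to rule that out, something along these lines in the HBase
shell shows the VERSIONS setting on the family and lets you lower it and
force a major compaction so old versions get cleaned up (table and
family names are placeholders; on older releases you need to disable the
table before altering it):

  describe 'mytable'                              # shows VERSIONS for each family
  disable 'mytable'
  alter 'mytable', {NAME => 'cf', VERSIONS => 3}  # or whatever you actually want
  enable 'mytable'
  major_compact 'mytable'                         # old versions get dropped here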

In any case, it would be interesting if you:

 - truncate the table
 - re-import the data
 - force a flush
 - wait a bit until the flushes are done (should take 2-3 seconds if
your dataset is really 100MB)
 - do a "hadoop dfs -dus" on the table's directory (should be under /hbase)
 - if the number is way out of whack, review how you are inserting
your data. Either way, please report back. (Rough shell commands for
these steps are sketched below.)
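
Roughly, from the command line, something like this (table name and
path are placeholders, adjust to yours):

  echo "truncate 'mytable'" | hbase shell
  # ... re-import your data here ...
  echo "flush 'mytable'" | hbase shell
  # give the flushes a few seconds to finish, then check the on-disk size:
  hadoop dfs -dus /hbase/mytable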

>
> Also, what is a good value for hfile.block.cache.size? I have it now at .35,
> but with 12.5GB of RAM available for the region servers it seems I should be
> able to get it much higher.

Depends, you also have to account for the MemStores, which by default
can use up to 40% of the heap
(hbase.regionserver.global.memstore.upperLimit), currently leaving you
only 100 - 40 - 35 = 25% of the heap for everything else: serving
requests, compacting, flushing, etc. It's hard to give a good number
for what should be left to the rest of HBase though...
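
Just to make the arithmetic concrete, with the 12.5GB heap you
mentioned and the defaults plus your current 0.35, it works out roughly
to:

  block cache     (hfile.block.cache.size = 0.35)                       ~4.4 GB
  memstores       (hbase.regionserver.global.memstore.upperLimit = 0.4) ~5.0 GB
  everything else (RPC, compactions, flushes, GC headroom)              ~3.1 GB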
