hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: HBase and hadoop cluster rebalance
Date Thu, 16 Oct 2008 20:41:48 GMT
Daniel Ploeg wrote:
> Hi all,
> I performed a cluster rebalance on my test cluster yesterday (5 regionserver
> / datanodes each with approx 400GB - total approx 2TB HDFS) and I would like
> to know if the mailing lists have seen similar results to what I've seen.

I talked to the lads running HBase here at Powerset.  They believe they
have seen something similar when they grow the cluster by a significant
percentage (20-30%): the addition of new machines brings on a
rebalancing, and thereafter HBase runs "faster".

> I had a single table with a single column family and loaded it up so that it
> just about filled the entire cluster. Actually one or two of the nodes had
> run out of space, yet the fifth machine only had 50% of its disks utilised
> (which is why I thought a rebalance was in order). There are a total of 1475
> regions in the cluster. Prior to starting the rebalance the cluster only had
> about 250GB left at its disposal. After the rebalance I now have almost
> 800GB free.

With 1475 regions, you should update to 0.18.1 (coming soon).

> Furthermore, I was performing read tests prior to the rebalance and getting
> a response time of approx 500ms per row (each row has 10000 column instances
> of the column family which were deserialised as part of the test). After the
> rebalance my read times reduced to around 340ms.

If you could have fewer columns in a column family, you'd get a bit
better performance: see HBASE-867.
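
For reference, the figures quoted above work out roughly as follows. This is only a back-of-the-envelope sketch in Python: the variable and function names are illustrative, and the inputs are the approximate numbers from the quoted message, not new measurements.

```python
# Figures as reported in the quoted message (all approximate).
read_ms_before = 500.0   # per-row read time before the rebalance
read_ms_after = 340.0    # per-row read time after the rebalance
free_gb_before = 250.0   # free HDFS space before the rebalance
free_gb_after = 800.0    # free HDFS space after the rebalance


def relative_reduction(before, after):
    """Return the fraction by which `after` is smaller than `before`."""
    return (before - after) / before


latency_cut = relative_reduction(read_ms_before, read_ms_after)
space_gained_gb = free_gb_after - free_gb_before

print(f"read latency reduced by {latency_cut:.0%}")
print(f"additional free space: about {space_gained_gb:.0f} GB")
```

So the reported read time dropped by roughly a third, and about 550 GB of space was freed.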

Good on you Daniel,
