hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Logical Division of Hbase Cluster
Date Tue, 14 Jun 2011 18:49:56 GMT
On Tue, Jun 14, 2011 at 1:41 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> Well... There are a couple of reasons:
> 1- The data is coming from different regions of the country and I want to
> distribute the data w.r.t. regions, e.g.
> RegionServer1-RegionServer4 contain east region data only.
> RegionServer2-RegionServer6 contain west region data only.

Can you do this with a table per region?  Otherwise, prefix the key w/
region.  This won't be perfect in that the boundary won't be clean but
perhaps sufficient?
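
To make the key-prefix idea concrete, here is a minimal sketch in plain Python (no HBase client involved). The `|` separator and the helper names `make_row_key` and `region_scan_range` are made up for illustration, not anything HBase provides. Because HBase sorts rows by key bytes, every row for one geographic region ends up contiguous, though a single HBase region may still straddle the east/west boundary, which is presumably the unclean boundary mentioned above.

```python
from typing import Tuple

SEPARATOR = "|"  # hypothetical separator; pick one that never occurs in region names


def make_row_key(region: str, record_id: str) -> bytes:
    """Build a row key of the form b'<region>|<record_id>'."""
    if SEPARATOR in region:
        raise ValueError("region must not contain the separator")
    return f"{region}{SEPARATOR}{record_id}".encode("utf-8")


def region_scan_range(region: str) -> Tuple[bytes, bytes]:
    """Return (start_row, stop_row) covering every key with this region prefix.

    The stop row is exclusive: '|' is byte 0x7C, so replacing it with the
    next byte value 0x7D ('}') sorts just after every key in the region.
    """
    start = f"{region}{SEPARATOR}".encode("utf-8")
    stop = region.encode("utf-8") + bytes([ord(SEPARATOR) + 1])
    return start, stop


if __name__ == "__main__":
    keys = sorted(
        make_row_key(region, rid)
        for region, rid in [("west", "1"), ("east", "2"), ("east", "1")]
    )
    print(keys)                       # east rows sort together, before west rows
    print(region_scan_range("east"))  # bounds to use for an east-only scan
```

A scan restricted to the returned start and stop rows would then only touch rows for that one geographic region.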

> 2- The cluster is a combination of different machines w.r.t. hardware (RAM,
> processor speed, number of cores). Some tables are accessed frequently and some
> are accessed less often, so I want to place the most accessed tables on the
> machines with the highest RAM and processing speeds, e.g. create table1, colFam1
> @,, (list of region servers)

In general, a heterogeneous cluster is probably going to cause you
headaches; HBase has rarely been run on a cluster that was not homogeneous, so
my guess is that you'll run into 'interesting' issues.

Currently the levers are not exposed for manually balancing the
cluster.   Our balancer *should* do this for you factoring in the
machine resources but currently it does not.

One thing you could do is turn the balancer off and do the balancing
yourself externally.  You can move regions either via the shell or
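
As a rough sketch of what balancing externally could look like, the Python below plans a placement weighted by machine capacity and prints HBase shell commands to apply it. The region names, server names, and weights are placeholders, and `plan_moves` and `to_shell_commands` are hypothetical helpers, not part of HBase. It assumes the shell's `balance_switch` and `move 'ENCODED_REGIONNAME', 'SERVER_NAME'` commands are available in your version, so check `help 'move'` in the shell first.

```python
def plan_moves(regions, server_weights):
    """Assign each region to a server in proportion to the server's weight.

    regions: list of encoded region names (placeholder values here).
    server_weights: dict of server name -> relative capacity (positive number).
    Returns a list of (encoded_region_name, server_name) pairs.
    """
    assigned = {server: 0 for server in server_weights}
    plan = []
    for region in regions:
        # Pick the server that would be least loaded relative to its
        # capacity after taking this region; ties break on server name.
        target = min(
            assigned,
            key=lambda s: ((assigned[s] + 1) / server_weights[s], s),
        )
        assigned[target] += 1
        plan.append((region, target))
    return plan


def to_shell_commands(plan):
    """Render a plan as HBase shell command lines (balancer switched off first)."""
    lines = ["balance_switch false"]
    lines += [f"move '{region}', '{server}'" for region, server in plan]
    return lines


if __name__ == "__main__":
    # Placeholder names: real server names look like 'host,port,startcode'.
    weights = {"big-host,60020,1": 2, "small-host,60020,1": 1}
    regions = [f"region{i}" for i in range(6)]
    for line in to_shell_commands(plan_moves(regions, weights)):
        print(line)
```

With these example weights the larger machine ends up with twice as many regions as the smaller one. The balancer has to stay off, otherwise it will undo the manual placement.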

> 4- I need to implement different priority scanning, so the highest-priority
> query should be served by the good machines, and this can be done if I am able
> to place the priority data on good machines, e.g. if time = busy hours then
> place data on good region servers, else if time = night then place data on
> normal servers.

HBase will never let you do this.  It won't scale.

