hbase-user mailing list archives

From Shuja Rehman <shujamug...@gmail.com>
Subject Re: Logical Division of Hbase Cluster
Date Tue, 14 Jun 2011 19:53:26 GMT
On Tue, Jun 14, 2011 at 11:49 PM, Stack <stack@duboce.net> wrote:

> On Tue, Jun 14, 2011 at 1:41 AM, Shuja Rehman <shujamughal@gmail.com>
> wrote:
> > Well... there are a couple of reasons.
> >
> > 1- The data is coming from different regions of the country and I want to
> > distribute the data w.r.t. regions, e.g.
> > RegionServer1-RegionServer4 contain east region data only.
> > RegionServer2-RegionServer6 contain west region data only.
> >
> Can you do this with a table per region?  Otherwise, prefix the key w/
> region.  This won't be perfect in that the boundary won't be clean but
> perhaps sufficient?

Hmm... I think a table per region will not work because in the future there
will be data coming from different countries, and with this strategy I would
need to create a lot of tables, which does not seem suitable to me. I also
thought about prefixing the key with the region, but I have many other things
in the key as well, e.g. timestamp and tags, and I am not sure how HBase
distributes the data to region servers in the presence of these things in the
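
A minimal sketch of the key-prefix suggestion, in Python for brevity. HBase
keeps rows sorted by the raw bytes of the row key, so whatever comes first in
the key decides which rows sit next to each other. The key layout, the
make_row_key helper, and the sample values are all hypothetical and only
illustrate the ordering.

```python
import struct

def make_row_key(region: str, timestamp_ms: int, tag: str) -> bytes:
    """Hypothetical layout: region | 0x00 | 8-byte big-endian timestamp | tag."""
    return (
        region.encode("utf-8")
        + b"\x00"                          # separator after the region prefix
        + struct.pack(">Q", timestamp_ms)  # big-endian keeps numeric order under byte comparison
        + tag.encode("utf-8")
    )

# Sample keys in arrival order (made-up values).
keys = [
    make_row_key("west", 1308081206000, "sensor-a"),
    make_row_key("east", 1308081300000, "sensor-b"),
    make_row_key("west", 1308000000000, "sensor-c"),
    make_row_key("east", 1308000000000, "sensor-a"),
]

# Python compares bytes lexicographically by unsigned byte value,
# which is the same ordering used for row keys.
sorted_keys = sorted(keys)
regions_in_order = [k.split(b"\x00", 1)[0].decode("utf-8") for k in sorted_keys]
print(regions_in_order)
```

With the region first, all east rows sort before all west rows regardless of
the timestamp and tag that follow, so each geographic area occupies one
contiguous key range. Which region servers host that range is still decided
by region assignment, not by the key itself.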

> > 2- The cluster is a combination of different machines w.r.t. hardware (RAM,
> > processor speed, number of cores). Some tables are accessed frequently and
> > some less often, so I want to place the most accessed tables on the
> > machines with the highest RAM and processing speeds, e.g. create table1,
> > colFam1 @,, (list of region servers)
> >
> >
> In general, a heterogeneous cluster is probably going to cause you
> headache; rare has hbase run on a cluster that was not homogeneous so
> my guess is that you'll run into 'interesting' issues.
> Currently the levers are not exposed for manually balancing the
> cluster.   Our balancer *should* do this for you factoring in the
> machine resources but currently it does not.
> One thing you could do is turn the balancer off and do the balancing
> yourself externally.  You can move regions either via the shell or
> script.

OK, I will look at the Java API to figure out how to move a region.
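
A rough sketch of the "balance it yourself externally" route, written as a
small Python script that only prints HBase shell commands. It assumes the
shell in use provides balance_switch and move (taking an encoded region name
and a server name of the form host,port,startcode). The plan_moves helper,
the region names, and the server names are hypothetical placeholders, and the
mapping from region to key prefix has to come from somewhere else.

```python
def plan_moves(prefix_by_region, servers_by_prefix):
    """Build shell commands that place each region on a server chosen for its key prefix.

    prefix_by_region:  dict of encoded region name -> key prefix the region holds
    servers_by_prefix: dict of key prefix -> list of candidate server names
    """
    commands = ["balance_switch false"]  # stop the built-in balancer first
    next_slot = {prefix: 0 for prefix in servers_by_prefix}
    for encoded_name in sorted(prefix_by_region):
        prefix = prefix_by_region[encoded_name]
        servers = servers_by_prefix[prefix]
        # Round-robin over the servers reserved for this prefix.
        target = servers[next_slot[prefix] % len(servers)]
        next_slot[prefix] += 1
        commands.append(f"move '{encoded_name}', '{target}'")
    return commands

# Made-up inputs for illustration only.
prefix_by_region = {
    "aaaa1111": "east",
    "bbbb2222": "east",
    "cccc3333": "west",
}
servers_by_prefix = {
    "east": ["rs1.example.com,60020,1308000000000"],
    "west": ["rs2.example.com,60020,1308000000000"],
}

commands = plan_moves(prefix_by_region, servers_by_prefix)
for line in commands:
    print(line)
```

The printed lines could be pasted into the shell or fed to it from a script.
With the balancer switched off, nothing redistributes regions when a server
fails or is added, so the plan would need to be re-run after such events.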
> > 4- I need to implement different-priority scanning so that the highest
> > priority queries are served by the good machines, and this can be done if I
> > am able to place the priority data on good machines, e.g. if time = busy
> > hours then place data on good region servers, else if time = night then
> > place data on normal servers.
> >
> HBase will never let you do this.  It won't scale.
> St.Ack

Shuja-ur-Rehman Baig
