hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Fix Number of Regions per Node ?
Date Mon, 22 Jun 2015 18:11:44 GMT
This issue first reared its head when companies started to adopt Hadoop. 

In terms of managing it… before CM (Cloudera Manager) and Ambari, you had to manage your own
classes of nodes and sets of configuration files. 

Ambari is supposed to be able to handle multiple configurations by now. (If not… then they
are all a bunch of slackers because they’ve had a year to fix it!!! :-P ) 

Does HBase look at the RS (RegionServer) as if it were a container and then manage the
workload / workflow based on what that specific container can do? 
Probably not, and there are a couple of ways of looking at this… 

1) HBase is outside of YARN.  (Forget Slider, or whatever they are calling Hoya these days.)
You set up a certain amount of resources for HBase and then you leave the rest to YARN. 

This means that regardless of the changes in architecture, you should get the same, or
roughly the same, performance. 
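
A minimal sketch of that split, under assumed numbers: every node class gets the same fixed HBase reservation, and whatever is left goes to YARN (for example, as the value of `yarn.nodemanager.resource.memory-mb` on that class of node). The node classes, RAM sizes, and reservation sizes below are hypothetical and only illustrate the arithmetic.

```python
# Hypothetical node classes (RAM in GB); the numbers are illustrative only.
NODE_CLASSES_GB = {"gen1": 64, "gen2": 128, "gen3": 256}

HBASE_RESERVED_GB = 24  # assumed: RegionServer memory, identical on every node
OS_RESERVED_GB = 8      # assumed: OS, DataNode, and other daemons


def yarn_memory_mb(node_ram_gb, hbase_gb=HBASE_RESERVED_GB, os_gb=OS_RESERVED_GB):
    """Return the memory (in MB) left over for YARN on one node."""
    leftover_gb = node_ram_gb - hbase_gb - os_gb
    if leftover_gb < 0:
        raise ValueError("node is too small for the fixed HBase reservation")
    return leftover_gb * 1024


# HBase gets the same slice everywhere; only the YARN share varies by node class.
yarn_by_class = {name: yarn_memory_mb(ram) for name, ram in NODE_CLASSES_GB.items()}
for name, mb in yarn_by_class.items():
    print(f"{name}: HBase {HBASE_RESERVED_GB} GB (fixed), YARN {mb} MB")
```

Because the HBase share is constant, only the YARN side has to be configured per class of hardware.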

2) Retiring Hardware.  Moore’s law == 18 months per generation.  So 2 generations is
3 years, which tends to be the limit on warranties. Assuming that you have managers
who want to squeeze in a third generation, that’s 4.5 years, at which point your kit should
be put out to pasture and replaced. 

This doesn’t really change, because once the hardware is out of warranty and it dies, you’re
pretty much screwed and need to replace it anyway. 

The point is that you should be able to keep hardware configs from 1-2 generations working
in the same cluster. 

3) Upgrades. 
You may have limits on CPU, but you should be able to upgrade your memory, NICs, drives,
etc. … so that you can extend the life of the older hardware to reach that 4.5 year cycle.

This would/should be cheaper than a complete upgrade. 

So if you have multiple hardware configurations, tune for HBase and let YARN worry about the
size of the containers for the other (M/R) jobs to run.  

Think of it this way… I have different sized pizza boxes. If my pizza is always cut to pretty
much the same size, and that size fits in all of the boxes, I’m ok. 
If I want a larger pizza but can’t fit it into all of the boxes… then I can
always remove those boxes and not use them. 

Your pizza is homogeneous… your box size is not. 
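
A rough sketch of the analogy, with made-up numbers: one uniform HBase slice (the pizza) is checked against each node class (the boxes), and classes that are too small are simply left out. The class names and sizes are hypothetical.

```python
# Hypothetical "boxes": usable RAM per node class in GB (illustrative numbers).
BOXES_GB = {"gen1": 48, "gen2": 96, "gen3": 192}


def split_boxes(slice_gb, boxes=BOXES_GB):
    """Split node classes into those that can hold one HBase slice and those that cannot."""
    fits = sorted(name for name, ram in boxes.items() if ram >= slice_gb)
    too_small = sorted(name for name, ram in boxes.items() if ram < slice_gb)
    return fits, too_small


small_pizza = split_boxes(32)  # fits in every box
large_pizza = split_boxes(64)  # the smallest box has to be set aside
print(small_pizza)
print(large_pizza)
```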

Does that make sense? 

> On Jun 17, 2015, at 5:27 PM, rahul malviya <malviyarahul2001@gmail.com> wrote:
> The heterogeneity of my cluster is increasing every time we upgrade,
> and it's really hard to keep the same hardware config on every node.
> Handling this at the configuration level would solve my problem.
> Is this problem not faced by anyone else?
> Rahul
> On Wed, Jun 17, 2015 at 5:22 PM, anil gupta <anilgupta84@gmail.com> wrote:
>> Hi Rahul,
>> I don't think there is anything like that.
>> But you can effectively do that by setting the region size. However, if
>> the hardware configuration varies across the cluster, then this property would
>> not be helpful because, AFAIK, region size can be set on a per-table basis
>> only (not on a per-node basis). It would be best to avoid hardware
>> differences between the cluster machines.
>> Thanks,
>> Anil Gupta
>> On Wed, Jun 17, 2015 at 5:12 PM, rahul malviya <malviyarahul2001@gmail.com> wrote:
>>> Hi,
>>> Is it possible to configure HBase to have only a fixed number of regions per
>>> node per table? For example, node1 serves 2 regions, node2 serves 3
>>> regions, etc., for any table created?
>>> Thanks,
>>> Rahul
>> --
>> Thanks & Regards,
>> Anil Gupta

The opinions expressed here are mine, while they may reflect a cognitive thought, that is
purely accidental. 
Use at your own risk. 
Michael Segel
michael_segel (AT) hotmail.com
