hbase-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: zookeeper & HBase
Date Fri, 09 Jul 2010 16:19:48 GMT

On 07/08/2010 06:45 PM, Jean-Daniel Cryans wrote:
> It's not IO intense, it's IO latency sensitive, e.g. if other processes
> are sucking up most of the IO bandwidth then ZK will have a hard time
> taking quorum decisions.

ZK disk activity is pretty low - really the only time we write to disk
is when a client asks us to (in this case typically an HBase RS). We need
to "fsync" data to disk before returning a "success" indication to the
client; this is required for our durability guarantees.
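
A rough sketch of that write path, in Python rather than ZooKeeper's Java. The
helper name durable_append and the log file are made up for illustration and
are not ZooKeeper APIs; the only point is that the acknowledgement comes after
fsync returns, so a slow fsync directly delays the client.

```python
import os
import tempfile

def durable_append(path, record):
    """Append record to a log file and return only after it is fsynced.

    Hypothetical helper for illustration; not a ZooKeeper API.
    """
    with open(path, "ab") as log:
        log.write(record)
        log.flush()             # push Python's buffer to the OS
        os.fsync(log.fileno())  # block until the OS reports the data is on disk
    return "success"            # the ack is produced only after fsync returns

if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "txn.log")
    print(durable_append(log_path, b"create /hbase/rs/host1\n"))
```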

The issue here is that if you are on a saturated disk (say co-located with a
data node/RS) the disk might be 100% saturated for minutes at a time.
Couple that with Linux/ext3 fsync issues and it might be minutes before
our fsync call returns - in which case the ZooKeeper clients (HBase RSs)
will time out, much as they would during a network partition or service
failure.
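
If you want a feel for how long fsync takes on a given disk, something like
the following Python sketch can be run against the directory you plan to use.
The function names and the 40 second timeout are assumptions made for this
example, and comparing a single fsync against the session timeout is a
simplification of what the server actually does.

```python
import os
import tempfile
import time

def fsync_latency_seconds(directory, payload=b"x" * 4096):
    """Time one small write plus fsync in the given directory."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.monotonic()
        os.write(fd, payload)
        os.fsync(fd)  # blocks until the OS reports the data is on disk
        return time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)

def would_time_out(latency_seconds, session_timeout_seconds):
    """True if one fsync stall alone is longer than the client's timeout."""
    return latency_seconds > session_timeout_seconds

if __name__ == "__main__":
    assumed_timeout = 40.0  # seconds; an assumed value, check your own config
    latency = fsync_latency_seconds(tempfile.gettempdir())
    print("fsync took %.4f s, exceeds timeout: %s"
          % (latency, would_time_out(latency, assumed_timeout)))
```

Running it while the data node/RS is under heavy load is the interesting case;
on an idle disk the number will look fine.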


> On Thu, Jul 8, 2010 at 5:38 PM, Arun Ramakrishnan
> <aramakrishnan@languageweaver.com>  wrote:
>> Good to know ZK is IO intense.
>> Since ZK does not require much disk space and is IO intense, has anyone played with
>> using solid state drives for ZK?
>> We have a 20 node cluster. It would be feasible to have a 3 node ZK all configured
>> with solid state drives.
>> Thanks
>> Arun
>> -----Original Message-----
>> From: Jonathan Gray [mailto:jgray@facebook.com]
>> Sent: Thursday, July 08, 2010 4:25 PM
>> To: user@hbase.apache.org
>> Subject: RE: zookeeper & HBase
>> ZK is sensitive to IO starvation which is why it is recommended to keep it on a separate
>> node or separate disk.  In most cases, giving ZK its own disk is sufficient and dedicated
>> node(s) are unnecessary.
>> On smallish clusters like 10 nodes, I would recommend starting with just 1 ZK node
>> co-located with your NameNode and HMaster, but with a dedicated disk just for ZK.  Since the
>> NN is a SPOF, having one ZK doesn't really lower your fault tolerance, except that it may
>> be on a non-raided disk.  I encourage RAID usage for NN and ZK.  JBOD for DN/RS.
>> JG
>>> -----Original Message-----
>>> From: vramanathan00@aol.com [mailto:vramanathan00@aol.com]
>>> Sent: Thursday, July 08, 2010 4:20 PM
>>> To: user@hbase.apache.org
>>> Subject: zookeeper & HBase
>>>   I'm trying to have our deployment layout..I read one of the
>>> articles/FAQ (probably JG's)...that it's better to
>>> have zookeeper on separate cluster/separate sets of machine..I'm
>>> assuming that is the right approach..
>>> All our transactions are HBase (inserts, mapreduce-table as input,
>>> another table as output, other queries,..)
>>> Based on other thread on locality..RegionServer & Datanode i'll put on
>>> same hosts..
>>> If these boxes have enough capacity, do we need to put zookeeper on
>>> separate cluster?
>>> If it is on a separate cluster, my understanding is zookeeper has much
>>> smaller memory footprint compared
>>> to HRegionServer/Datanodes.. & it shld need that much CPU as
>>> well..correct?
>>> Is there any suggested guidance on number of zookeeper vs number of
>>> regionservers?..looking for some ratio..say 10 node cluster..
>>> how many zookeeper..?
>>> Please ignore responding to this ..if this is outside the etiquette
>>> thanks
>>> venkatesh
