hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Questions about HBase load balancing and HFile
Date Mon, 20 Jan 2014 17:46:02 GMT
For question #4, see also
http://hbase.apache.org/book.html#regions.arch.locality
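
As a rough illustration of the locality question (#4, quoted below), here is a small Python sketch. It does not use any HBase or HDFS API; the block names, host names, and replica layout are made up for the example, and replication factor 3 is assumed. It shows that after one node is lost every block still has a live replica, while a newly assigned region server may hold only part of the data locally and has to read the rest over the network.

```python
# Hypothetical layout: each store file block maps to the set of DataNode
# hosts that hold a replica of it (replication factor 3 is an assumption).
block_replicas = {
    "blk_1": {"node-a", "node-b", "node-c"},
    "blk_2": {"node-a", "node-c", "node-d"},
    "blk_3": {"node-a", "node-b", "node-d"},
}


def live_replicas(replicas, dead_hosts):
    """Return the replica map with all replicas on dead hosts removed."""
    dead = set(dead_hosts)
    return {blk: hosts - dead for blk, hosts in replicas.items()}


def all_readable(replicas):
    """True if every block still has at least one live replica."""
    return all(len(hosts) > 0 for hosts in replicas.values())


def locality(replicas, host):
    """Fraction of blocks that have a replica on the given host."""
    if not replicas:
        return 0.0
    local = sum(1 for hosts in replicas.values() if host in hosts)
    return local / len(replicas)


# node-a hosted both the region server and a DataNode, and it crashes.
after_crash = live_replicas(block_replicas, ["node-a"])

print(all_readable(after_crash))           # every block is still readable
print(locality(block_replicas, "node-a"))  # locality of the old server
print(locality(after_crash, "node-b"))     # locality if node-b takes over
```

In this toy layout the old server had every block locally, while the replacement on node-b has only two of the three blocks local. The region is still fully served because the remaining blocks are read from remote replicas.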

Cheers


On Sun, Jan 19, 2014 at 10:49 PM, Bharath Vissapragada <
bharathv@cloudera.com> wrote:

> For question #3, the block size Lars talks about is the block size inside an
> HFile, which is different from the HDFS block size. See
> http://hbase.apache.org/book/apes03.html . An HFile is indexed by block to
> facilitate random access to data, so that unnecessary disk blocks can be
> skipped during gets and scans. A smaller HFile block size generally gives
> better random read performance, at the cost of a larger block index. The
> detailed HFile layout is shown at that link.
>
> For question #4, you are correct. Since the data resides on HDFS, each
> region server has access to all the store files (they just use the HDFS API
> to read them). The files are still available after a (RS + DataNode) crash
> because of HDFS replication: the store files still have valid replicas, and
> the NameNode restores the replication factor by re-replicating them
> eventually.
>
>
> On Mon, Jan 20, 2014 at 12:08 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > For question #1, there is a load balancer in HMaster which does the job of
> > balancing region load.
> >
> > For question #2, the daughter regions stay on the same server as the parent
> > after the split. Later, one or both of them may be moved to other region
> > servers.
> >
> > Cheers
> >
> > On Jan 19, 2014, at 10:27 PM, Bill Q <bill.q.hdp@gmail.com> wrote:
> >
> > > Hi,
> > > I am trying to get more information about HBase. I would appreciate some
> > > answers to these few questions. Thanks a lot.
> > >
> > > 1. About load balancing: does HMaster monitor overloaded and lightly loaded
> > > HRegionServers, and move some regions from the hot HRegionServers to the
> > > lightly loaded ones (with or without adding new servers to the cluster,
> > > respectively)?
> > >
> > > 2. About region splitting: when splitting a region, will the newly created
> > > regions stay on the current HRegionServer, or will HMaster assign
> > > new HRegionServers to take the two newly created regions?
> > >
> > > 3. About HFile size: Lars mentioned here
> > > http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
> > > that the HFile size defaults to 64 KB. How does this work when the default
> > > HDFS block size is 64 MB/128 MB? Would the small HFile size waste lots of
> > > space on HDFS?
> > >
> > > 4. About data locality: if an HRegionServer fails, the HMaster would assign
> > > a new HRegionServer to take its place. But shouldn't this new HRegionServer
> > > have access to the storeFiles? I assumed that's how it works, by
> > > using HDFS's data replication. But after some reading, I got confused. It
> > > seems that the new HRegionServer can work without the storeFiles data
> > > being local. How does this work at all?
> > >
> > > Many thanks.
> > >
> > >
> > > Bill
> >
>
>
>
> --
> Bharath Vissapragada
> <http://www.cloudera.com>
>
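
To make the answer to question #3 more concrete, here is a minimal Python sketch of a block-indexed file. It is not the real HFile format or any HBase API: the function names, the row keys, and the use of "rows per block" instead of a byte size are all assumptions made for illustration. The idea it shows is the one described above: an index of each block's first key lets a point lookup read a single block and skip the rest, so smaller blocks mean less data scanned per get, while the index itself grows.

```python
import bisect


def build_blocks(sorted_rows, rows_per_block):
    """Split sorted (key, value) pairs into blocks and index each block
    by its first key. Block size is counted in rows here, which is a
    simplification; a real store would size blocks in bytes."""
    blocks = [sorted_rows[i:i + rows_per_block]
              for i in range(0, len(sorted_rows), rows_per_block)]
    index = [block[0][0] for block in blocks]
    return blocks, index


def get(blocks, index, key):
    """Look up one key. Returns (value, cells_scanned).
    Only the single block that could contain the key is scanned."""
    pos = bisect.bisect_right(index, key) - 1
    if pos < 0:
        return None, 0  # key sorts before the first block
    scanned = 0
    for k, v in blocks[pos]:
        scanned += 1
        if k == key:
            return v, scanned
    return None, scanned


# Hypothetical data: 1000 rows with zero-padded keys so they sort correctly.
rows = [("row%04d" % i, i) for i in range(1000)]

small_blocks, small_index = build_blocks(rows, 10)
large_blocks, large_index = build_blocks(rows, 500)

print(get(small_blocks, small_index, "row0499"))  # few cells scanned
print(get(large_blocks, large_index, "row0499"))  # many cells scanned
print(len(small_index), len(large_index))         # index size trade-off
```

With these made-up numbers the same lookup touches 10 cells with small blocks and 500 cells with large blocks, but the small-block index has 100 entries instead of 2.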
