hbase-user mailing list archives

From Lars George <lars.geo...@gmail.com>
Subject Re: Clarification regarding HBase reads
Date Mon, 21 Feb 2011 07:07:13 GMT
Hi Hari,

Thanks for pointing out that this is not entirely clear. Just to add to
what was said here already: all of this happens when a table is
enabled, and also, for example, when you restart HBase. As soon as a
table is open and you can see it in the UI, the system has created all
these objects for you. By the time you query, no extra resources are
opened or added: HBase already has the files open and the block index
loaded in memory. The objects themselves are not much of a problem; it
is mainly the issue Ted points out, plus file handles and connection
handlers (the xcievers). There is also work under way (it has stalled
and restarted a few times already) to make the DFSClient use Channels
and NIO, doing away with these resource requirements (while taking a
slight performance hit).
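
To make the resource point a bit more concrete, here is a rough
back-of-the-envelope sketch in Python. It is not HBase code: the Table
class, its fields and the numbers are made up for illustration, and it
assumes every store has the same average number of store files. It only
shows how open stores and store files grow with regions and column
families, regardless of which table you actually query.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical description of one enabled table (not an HBase class)."""
    name: str
    regions: int                # regions currently open for this table
    column_families: int        # one Store per column family per region
    store_files_per_store: int  # assumed average number of HFiles per Store


def open_stores(tables):
    """Total Store instances: one per column family in every open region."""
    return sum(t.regions * t.column_families for t in tables)


def open_store_files(tables):
    """Total store files held open, under the uniform-average assumption."""
    return sum(
        t.regions * t.column_families * t.store_files_per_store
        for t in tables
    )


if __name__ == "__main__":
    # Illustrative numbers only.
    tables = [
        Table("events", regions=100, column_families=1, store_files_per_store=3),
        Table("users", regions=10, column_families=3, store_files_per_store=2),
    ]
    print("stores:", open_stores(tables))
    print("store files:", open_store_files(tables))
```

Each open store file costs a file handle on the region server and, as
mentioned above, ties up resources on the DataNode side as well, so the
totals matter more than the Java objects themselves.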

This is an interesting topic though, and it seems there is a lot left to do.


On Sun, Feb 20, 2011 at 3:38 PM, Hari Sreekumar
<hsreekumar@clickable.com> wrote:
> Hi,
> I was going through the HBase architecture blog by Lars George (
> http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html) and
> I just wanted a clarification regarding how HBase reads data. The blog
> mentions that :
> Next the HRegionServer opens the region it creates a corresponding
> HRegion object.
> When the HRegion is "opened" it sets up a Store instance for each
> HColumnFamily for every table as defined by the user beforehand. Each of
> the Store instances can in turn have one or more StoreFile instances, which
> are lightweight wrappers around the actual storage file called HFile. A
> HRegion also has a MemStore and a HLog instance. We will now have a look at
> how they work together but also where there are exceptions to the rule.
> Does this mean that a store instance is opened for all tables present in
> HBase irrespective of which table we are querying and for all
> column families? Is this why I generally see people avoiding a large number
> of tables/a large number of column families? If not, what is the reason for
> that? Is it true at all that we should avoid too many tables/CFs?
> Thanks,
> Hari
