hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: put to WAL and scan/get operation concurrency
Date Thu, 05 May 2011 20:05:05 GMT
On Thu, May 5, 2011 at 11:43 AM, Eric Burin des Roziers
<eric_bdr@yahoo.com> wrote:
> Hi Jean-Daniel,
> Yes, I need a multi-row, transaction-aware HBase for the types of processing
> I need to do.  I need to avoid having partial rows available, and I am in the
> process of selecting a way to implement such transaction isolation.  I
> currently have 2 choices: (1) use HBase-trx or (2) implement my own,
> leveraging the versioning that HBase provides.  In light of this I wanted to
> understand the inner workings of HBase a little more.

You have read the megastore and percolator papers?  They discuss x-row

> For example, I want to understand whether scans read data from the MemStore
> even if it has not yet been flushed to the HFiles.

It does.
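
As a rough illustration only (a toy model, not HBase code; `ToyStore` and its methods are names made up for this sketch), a read that consults the in-memory buffer before the flushed files might look like this:

```python
class ToyStore:
    """Toy model of one store: an in-memory buffer plus flushed snapshots."""

    def __init__(self):
        self.memstore = {}   # recent edits that have not been flushed yet
        self.hfiles = []     # immutable flushed snapshots, oldest first

    def put(self, row, value):
        # New edits land in the in-memory buffer.
        self.memstore[row] = value

    def flush(self):
        # Move the buffer into a new immutable snapshot and start a fresh buffer.
        if self.memstore:
            self.hfiles.append(dict(self.memstore))
            self.memstore = {}

    def get(self, row):
        # Unflushed data is visible to reads and wins over older flushed data.
        if row in self.memstore:
            return self.memstore[row]
        for hfile in reversed(self.hfiles):
            if row in hfile:
                return hfile[row]
        return None

    def scan(self):
        # Merge snapshots oldest to newest, then overlay the unflushed edits.
        merged = {}
        for hfile in self.hfiles:
            merged.update(hfile)
        merged.update(self.memstore)
        return sorted(merged.items())
```

In this sketch a `put` followed directly by `scan` returns the new row even though `flush` was never called.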

> HBase replicates the data 3 times (depending on your configs).  Does it do
> that for the MemStore as well?

The data in memstore is first put in the WAL which is replicated three times.
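
A minimal sketch of that ordering, assuming a plain Python list stands in for the replicated log (`ToyRegionServer` and `recover` are hypothetical names for this sketch, not HBase APIs):

```python
class ToyRegionServer:
    """Toy write path: append to the log first, then update the in-memory buffer."""

    def __init__(self, wal=None):
        # The list stands in for the replicated write-ahead log (an assumption
        # of this sketch; the real log lives on a replicated filesystem).
        self.wal = wal if wal is not None else []
        self.memstore = {}

    def put(self, row, value):
        self.wal.append((row, value))   # logged first
        self.memstore[row] = value      # then made readable from memory

    @classmethod
    def recover(cls, wal):
        # After a crash the in-memory buffer is gone; rebuild it by replaying
        # the surviving log entries in order.
        server = cls(wal=list(wal))
        for row, value in server.wal:
            server.memstore[row] = value
        return server
```

The memstore itself is never copied anywhere in this model; only the log survives a crash, and replaying it is what brings the unflushed edits back.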

> Say the client wants to insert 10 lines which happen to fall across 2
> regions.  If region 2 fails, then another client will still be able to read
> the rows inserted in region 1, but not region 2.  Since HBase replicates data
> to other servers, region 2 lines could be available on other servers, right?

Would suggest you read the bigtable paper.  It'll answer most of your
questions more eloquently than I can (To answer your question, only
one region serves a specific piece of data.  It depends on your
transaction implementation as to whether the half written data is
readable by the client).
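
To make the "only one region serves a specific piece of data" point concrete, here is a toy lookup, assuming regions are described by sorted start keys (`ToyRegionMap`, `locate`, and the server names are all invented for this sketch):

```python
import bisect


class ToyRegionMap:
    """Toy lookup from row key to the single region that owns it."""

    def __init__(self, start_keys, servers):
        # start_keys must be sorted and begin with "" so every row has an owner.
        self.start_keys = list(start_keys)
        self.servers = list(servers)

    def locate(self, row):
        # The owner is the region with the greatest start key <= row.
        idx = bisect.bisect_right(self.start_keys, row) - 1
        return self.servers[idx]


# Two regions: rows below "m" belong to the first, the rest to the second.
regions = ToyRegionMap(["", "m"], ["server-a", "server-b"])
rows = ["apple", "kiwi", "mango", "zebra"]
owners = {row: regions.locate(row) for row in rows}
```

Each row maps to exactly one owner, so in this model a batch spanning both regions is only partly readable while one owner is down, unless the transaction layer hides the partial write.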

> The second aspect that I would like to understand is the implementation of
> HBase-trx.  It seems that I can still have a failure point when the
> transactional WAL (THLog) flushes the data to the main WAL.  Using the above
> example, I can get into a situation where I will only be able to read a
> subset of the 10 lines initially inserted.  Is that right?

I think, pardon me if I'm reading this wrong, you have begun on a
wrong foot so your question doesn't add up right.

