hbase-user mailing list archives

From Eric Burin des Roziers <eric_...@yahoo.com>
Subject put to WAL and scan/get operation concurrency
Date Thu, 05 May 2011 13:03:41 GMT
Hi,

I am currently looking at adding a transactional consistency aspect to HBase and had 2 questions:

1. My understanding is that when the client performs an operation (put, delete, incr), it
is sent to the region server, which delegates it to different region servers, which in turn
put it in the WAL and in the MemStore of that region.  At some point later, the MemStore is
flushed to disk (into the HFiles).  The WAL is essentially there as a way to recover the
data if the machine crashes and loses the data held in its MemStore but not yet stored
on disk.  Once the data is available in the MemStore (but not yet in HFiles), do scans and
gets 'see' that data?  Is the data duplicated in the MemStore across 3 region servers?  If
a region server crashes, can I get into a situation where a scan returns a partial data
set without the client being aware of it?

2. The HBase-trx package implements transactions by effectively creating a WAL per transaction
(THLog) and 'flushing' it to the main WAL (HLog) on commit.  But flushing this THLog takes
some time, however short.  If a scan (or get) is performed during that window,
could I see only part of the committed transaction (some rows but
not others, since they have not been flushed yet)?  Why did HBase-trx go with
a THLog instead of leveraging the KeyValue versioning?

I am thinking of implementing a transaction isolation/consistency mechanism by storing a unique
transaction id as the version when doing a put (instead of the current time in milliseconds) and
passing the ids of invalid transactions to scans/gets, so that they fetch a previous version (one
with a valid transaction id) for cells that have been updated by a non-committed transaction.  Are
there any reasons not to go with this approach?
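
To make the idea concrete, here is a minimal sketch in plain Python.  It is a toy
in-memory model and not the HBase client API: VersionedStore, put, get and
invalid_txn_ids are hypothetical names used only for illustration.  It also assumes
that transaction ids increase monotonically (so the highest id is the newest version)
and that enough older versions of each cell are retained to fall back on.

```python
from collections import defaultdict


class VersionedStore:
    """Toy stand-in for a table: (row, column) -> {version: value}.

    Hypothetical illustration only; none of this is HBase API.
    """

    def __init__(self):
        self._cells = defaultdict(dict)

    def put(self, row, column, value, txn_id):
        # The transaction id is stored where the timestamp would normally go.
        self._cells[(row, column)][txn_id] = value

    def get(self, row, column, invalid_txn_ids=frozenset()):
        # Walk versions from newest to oldest and return the first one that
        # was not written by an excluded (uncommitted or aborted) transaction.
        versions = self._cells.get((row, column), {})
        for txn_id in sorted(versions, reverse=True):
            if txn_id not in invalid_txn_ids:
                return versions[txn_id]
        return None


store = VersionedStore()
store.put("row1", "cf:a", "old", txn_id=1)   # written by committed transaction 1
store.put("row1", "cf:a", "new", txn_id=2)   # transaction 2 is still in flight
store.put("row2", "cf:a", "new", txn_id=2)

in_flight = {2}
before_commit = (store.get("row1", "cf:a", in_flight),
                 store.get("row2", "cf:a", in_flight))
after_commit = (store.get("row1", "cf:a"),
                store.get("row2", "cf:a"))
```

In this model a reader that excludes transaction 2 sees none of its writes, and a
reader that no longer excludes it sees all of them, so a transaction is never
partially visible.  The open point is how readers would learn the current set of
invalid transaction ids.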

Thanks for your help,
-Eric
